This chapter introduces techniques for studying DNA, including restriction enzymes, cloning, libraries, identification of clones, DNA sequencing, and PCR. These techniques allow biologists to examine DNA in test tubes or cells. Hybridization is central to many techniques and involves probes finding complementary DNA sequences. The chapter describes cloning DNA, including types of insert DNA, vector criteria, and identifying desired clones. It also discusses restriction mapping, gel electrophoresis, blotting, DNA sequencing, and PCR. Students should understand the basic steps and applications of these techniques.
Synopsis:

This chapter introduces you to many of the recombinant DNA techniques that have provided a powerful new approach for studying the mechanisms of inheritance and the functions of specific genes. Restriction enzymes, cloning DNA, making libraries, identifying clones of interest, DNA sequencing, and PCR amplification are now just part of the toolkit that all biologists (not just geneticists) use. These techniques will be referred to over and over throughout this textbook (and probably in your other biology courses as well), so it is worthwhile to get a solid understanding of them from this chapter.

As you read about the various techniques and apply them to solve problems, try to keep in mind which techniques are done in solution in test tubes (restriction enzyme digests, ligating fragments together, PCR, DNA sequencing, making cDNA) and which techniques involve analyzing or manipulating DNA in cells (transformations, screening libraries, preparing large amounts of cloned DNA, total genomic DNA, or cellular RNA). This should help your understanding of the techniques and their uses.

Hybridization of nucleic acids is central to many techniques but is often challenging to understand. The basis of hybridization is the complementarity of bases in forming double-stranded nucleic acids. A probe DNA or RNA molecule is used to locate a specific sequence (on a nitrocellulose or membrane-based blot after electrophoresis in a gel, as a clone inside a cell, or in a chromosome squash) based on hybridization. A probe contains a recognizable radioactive or fluorescent tag that makes it possible to identify the place where the probe found a complementary sequence.

Significant Elements:

After reading the chapter and thinking about the concepts, you should be able to:
- Describe the essential steps in cloning.
- Describe the basic components and uses of different types of cloning vectors.
- Make a map of restriction enzyme sites.
- Read and interpret DNA sequencing gels (Feature Figure 9.13) and
automated DNA sequencing results (Figure 9.14).
- Design PCR primers.
- Determine which technique(s) you must use to achieve a desired goal.

There is often more than one way to reach a goal. However, there is usually one most efficient, preferred way to solve a problem. The technique used determines what is being examined and limits the interpretation of the data. For instance, probing a genomic library will give you a clone that is homologous to the probe, but this clone probably won't be transcribed and translated in E. coli. Probing a cDNA library will give you a clone which can be transcribed and translated in E. coli.

Problem Solving Tips:

Essential Steps in Cloning:

Cloning is basically a straightforward process that has lots of options and variations that can be used depending on what is desired. The basic components are insert DNA and vector. There are relatively few sources for the insert DNAs. However, there are many, many types of vectors that have been developed for various purposes.

Types of insert DNA:
- cDNAs contain only the regions of genes that are present in processed (spliced) transcripts synthesized in the cell from which they were isolated (Figure 9.8).
- Genomic DNAs are digested fragments of the genomic DNA of an organism, and so contain all of the DNA (genes and non-coding regions) from the cells.

Basic vector criteria:
- Vectors must have an origin of replication so they can be replicated in the host organism, usually E.
coli.
- Vectors must have a selectable marker(s) so you can determine that they are present in the host organism; the selectable marker is often an antibiotic resistance gene.
- Vectors also often have multiple cloning sites with known restriction sites and ways to detect the presence of an insert DNA after cloning. One example of an insert detection system is the β-galactosidase / X-gal detection system. Insertion of a fragment into the middle of the lacZ gene inactivates the gene. Cells carrying an insert within the lacZ gene are unable to cleave a lactose-like substrate (X-gal) and are phenotypically Lac-. They are recognized as white colonies, while colonies that received intact copies of the vector (no insert interrupting the lacZ gene) can cleave the substrate, turning the cells blue.

Types of vectors/purpose of cloning (Table 9.2):
- Plasmid vectors accept small pieces of insert DNA (10 kb or less). Plasmid vectors may be used to amplify large amounts of specific DNA sequences. Specialized plasmid vectors called expression vectors allow transcription and translation of cloned genes; they must be used with cDNA inserts (Genetics and Society, Recombinant DNA Technology and Pest-resistant Crops, Figure A). Use your knowledge of the requirements for transcription and translation when considering whether genes cloned into expression vectors will be expressed in the host cell.
- BAC vectors (bacterial artificial chromosomes) accept very large inserts of 300 kb.

Cloning:
- After restriction enzyme digestion, mix insert and vector DNAs and ligate them together. Sticky ends that have complementary overhanging single-stranded bases can be ligated. It may be helpful to draw out the 5' and 3' ends generated (including the individual bases of the recognition site) when a double-stranded DNA is cut by a restriction enzyme (Figure 9.2).
- Transform the ligation mix into the host cells, usually E.
coli.
- Select for presence of the vector (you may also be able to isolate those vectors that you know have an insert).
- Grow up a large amount of the clone(s).

Identifying the desired clone:
- Often you must identify a particular desired clone from a large variety of different inserts; this usually involves probing, or hybridization with a labeled DNA.

Other Techniques:
- Gel electrophoresis separates DNA fragments according to their size (Feature Figure 9.4).
- Blotting is the process of transferring the material in the gel to a nitrocellulose filter or a nylon membrane and covalently binding the material from the gel to the filter or membrane. A Southern blot has DNA on the membrane (a genomic Southern has genomic DNA), a Northern blot has mRNA on the membrane, and a Western blot has protein on the membrane (Feature Figure 9.11).
- Restriction mapping is part science and part art, like putting together a jigsaw puzzle. Use a pencil and an eraser. Be patient. The first step is usually ascertaining whether you began with a linear or a circular piece of DNA. Usually this can be learned from context: a plasmid clone is circular, for instance. Begin the map by examining a single digestion lane on the gel and determining the total size of the DNA (the sum of all the fragments) and the number of restriction sites for that enzyme (2 fragments when you digested a circular piece of DNA means there were 2 restriction sites; 2
fragments when you digested a linear piece of DNA means there was only 1 restriction site). Next, look at the double digestion lane. Determine which bands from the single digestion are left undigested in the double enzyme digestion. The fragments from the single enzyme digestion that disappear in the double digestion must have a restriction site for the second enzyme within them. Figure out which smaller fragments they have been broken into, then begin mixing and matching various combinations of bands until you find an order that gives the correct pattern of bands when you digest the DNA with the second restriction enzyme alone (see problems 9-5 and 9-6). Make sure the final sites you put on a map are consistent with the results from all digests.
- DNA sequencing provides the ultimate description of a cloned fragment of DNA. Make sure you can explain the Sanger sequencing method (dideoxy sequencing) to a friend (Feature Figure 9.13).
- PCR rapidly purifies and amplifies a single DNA fragment from a complex mixture (Feature Figure 9.12). In order to do PCR you must know something about the DNA sequence of 2 short stretches of the DNA to be amplified. The DNA fragment to be amplified is defined by a pair of oligonucleotide primers that are each complementary to one of the strands of the DNA template. These primers are extended at their 3' ends. The size of the final product of the PCR reaction is determined by the distance between the 5' ends of the primer pair.

Solutions to Problems:

Vocabulary

9-1. a. 10; b. 1; c. 9; d. 7; e. 6; f. 2; g. 8; h. 3; i. 5; j. 4.

Section 9.1 Sequence-Specific DNA Fragmentation

9-2.
a. Sau3A recognition sites are 4 bases long and are expected to occur randomly every $4^4$ or 256 bases. The human genome contains about $3 \times 10^9$ bases, so one would expect $3 \times 10^9 / 256 = 1.2 \times 10^7$,
or ~12,000,000 fragments.
b. BamHI recognition sites are 6 bases long and would be expected every $4^6$ or 4096 bases. $3 \times 10^9 / 4{,}100 = 7.3 \times 10^5$; ~700,000 fragments are expected.
c. The SfiI recognition site is 8 specific bases. The N indicates that any of the four bases is possible at that site and therefore does not enter into the calculations. Recognition sites would be expected every $4^8$ or 65,536 bases; $3 \times 10^9 / 65{,}500 = 4.6 \times 10^4$; ~46,000 fragments are expected.

9-3. See Feature Figure 9.4 and the section in the chapter 'Gel electrophoresis distinguishes DNA fragments according to size'. The rate at which a piece of DNA moves through a gel depends on the strength of the electric field, the gel composition, the charge density, and the physical size of the molecule. When electrophoresing DNA the only variable is the size of the molecule; all the rest of the variables are the same for each molecule. Longer DNA molecules take up more volume and therefore bump into the gel matrix, slowing down the molecule's movement. Shorter molecules can easily slip through many pore sizes in the gel matrix.

9-4. When you digest a circular DNA, one fragment indicates that the DNA has 1 restriction site for the enzyme. Thus, BamHI and EcoRI each cut the plasmid once. The double digest gives information about the relative positions of these two sites. The 2 restriction sites are at two different positions on the plasmid. The EcoRI site is 3 kb away from the BamHI site, and it is 6 kb around the rest of the circle back to the EcoRI site.

9-5.
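The fragment-counting rule used in Problems 9-4 and 9-5 (n sites give n fragments on a circle but n + 1 on a linear molecule) can be checked with a small simulation. This is only a sketch: the function name `digest` and the site coordinates are made up for illustration, and the 9 kb plasmid length is simply the sum of the 3 kb and 6 kb double-digest fragments from Problem 9-4.

```python
def digest(length_kb, sites_kb, circular):
    """Return fragment sizes (kb, largest first) after cutting at the given positions."""
    cuts = sorted(set(sites_kb))
    if not cuts:
        return [length_kb]
    # Fragments lying between neighbouring cut sites.
    fragments = [b - a for a, b in zip(cuts, cuts[1:])]
    if circular:
        # On a circle, one fragment wraps around from the last cut to the first.
        fragments.append(length_kb - cuts[-1] + cuts[0])
    else:
        # A linear molecule contributes two end fragments instead.
        fragments = [cuts[0]] + fragments + [length_kb - cuts[-1]]
    return sorted(fragments, reverse=True)


PLASMID_KB = 9.0             # assumed: 3 kb + 6 kb from the Problem 9-4 double digest
BAMHI, ECORI = [0.0], [3.0]  # hypothetical coordinates; only the 3 kb spacing matters

single = digest(PLASMID_KB, BAMHI, circular=True)
double = digest(PLASMID_KB, BAMHI + ECORI, circular=True)
linear_one_site = digest(PLASMID_KB, ECORI, circular=False)

print(single)           # [9.0]       one site on a circle gives one fragment
print(double)           # [6.0, 3.0]  two sites on a circle give two fragments
print(linear_one_site)  # [6.0, 3.0]  one site on a linear molecule gives two fragments
```

Trying candidate site positions in a function like this and comparing the output with the gel is the same trial-and-error process described in the Problem Solving Tips, just done without the eraser.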
a. Remember the Problem Solving Tips at the beginning of this chapter! If there is one restriction site, then digesting a circular molecule results in one fragment, while digesting a linear molecule generates two fragments. Digestion of a circular molecule will always result in one fewer restriction fragment than the digest of a linear molecule. Sample A is therefore the circular form of the bacteriophage DNA.
b. The length of the linear molecule is determined by adding the lengths of the fragments from one digest: 5.0 + 3.0 + 2.0 kb = 10.0 kb. (This size is not realistic; λ DNA is, in fact, about 50 kb in length.)
c. The circular form is the same length: 10.0 kb.
d. Comparison of the circular and linear maps gives you information on which fragments contain the ends of the linear molecule. The 5.0 kb EcoRI fragment is present in the circular but not the linear digest, so the 4.0 and 1.0 kb fragments must be joined in the circular map while they are at either end of the linear molecule. Begin drawing a picture of the molecule for yourself at this point. The same logic applies to the 2.7 kb BamHI fragment: it is present in the circular but not the linear digest, so the 2.2 kb and 0.5 kb pieces must be at the ends of the linear molecule. If the 0.5 kb BamHI fragment were at the end where the 1.0 kb EcoRI fragment is, the 1.0 kb EcoRI fragment would have been cut by BamHI in the double digest. However, the 1.0 kb fragment is still in the double digest, so the 0.5 kb fragment must be within the 4.0 kb EcoRI fragment. The remaining EcoRI site is placed based on the double digests. The 2.0 kb EcoRI fragment is not cut by BamHI but the 3.0 kb fragment is, so place the site within the 3.0 kb fragment. Now double check that all the BamHI + EcoRI fragment sizes are as seen in the different double digests.

9-6.
Plasmids are circular pieces of DNA, thus the EcoRI and SalI digests indicate that there is one site for each of these enzymes. HindIII, in contrast, cuts the molecule at three sites. Draw a circle showing the three HindIII sites. In the SalI + HindIII digest the 4.0 kb HindIII fragment is cut into 2.5 and 1.5 kb fragments. The SalI site is therefore 1.5 kb from one end or the other of the 4.0 kb HindIII fragment. Similarly, the EcoRI + HindIII double digest splits the 1.0 kb HindIII fragment into 0.6 and 0.4 kb fragments, but the orientation of the EcoRI site within the 1.0 kb HindIII fragment is ambiguous. Try placing the EcoRI site in the two different positions in the 1.0 kb HindIII fragment. In each case see how this fits with the EcoRI + SalI digestion results. The orientation that works places the 0.4 kb HindIII-EcoRI fragment adjacent to the 2.5 kb SalI-HindIII fragment.

Section 9.2 Cloning Fragments of DNA

9-7. Selectable markers in vectors provide a means of determining which cells in the transformation mix take up the vector. These markers are often drug resistance genes, so a drug can be added to the media and only those cells that have received and maintained the vector will grow.

9-8.
The study of genes often involves studying mutations in the genes and the phenotypes (or diseases) associated with these mutations. If you are interested in studying mutations and diseases, then you want to focus on the protein-coding part of the genes. Eukaryotic genes are often very large. However, the majority of this DNA consists of intronic sequences, which do not end up in the mRNA. For example, the human dystrophin gene is 2,500 kb (2.5 Mb, see Figure 8-5). The gene has more than 80 introns, which are spliced out to give an mRNA that is 14 kb long. Therefore 2,486 kb of the dystrophin gene is introns! Thus, most of the DNA in eukaryotic genomic libraries does not code for proteins. It can be difficult to figure out which sequences of the genomic DNA are actually part of the mRNA, so it can be difficult to figure out which gene sequences are important to the protein and which are unimportant. cDNA libraries, which are made from the mRNAs, allow you to ignore all of these intronic sequences. Almost all eukaryotic mRNAs have polyA tails at their 3' ends, and this is used to make cDNAs. The process begins by isolating mRNAs from an organism or a tissue in an organism and then using a polyT primer with reverse transcriptase (Figure 9.8).

In prokaryotes most of the DNA in the genome codes for mRNA; there is very little non-transcribed DNA. Prokaryotes also lack introns, so without processing the transcript is the same thing as the mRNA. In general the 5' and 3' UTRs are small, so most of the mRNA consists of coding sequences. It would also be difficult to make cDNA libraries in prokaryotes because there is no polyA tail nor any other common sequence shared by all mRNAs.

9-9.
First, work through the digestion and ligation of the DNA fragments and the vector. The vector is cut with BamHI, leaving the following ends:

    5' ...G      GATCC... 3'
    3' ...CCTAG      G... 5'

The insert DNA is cut with MboI, leaving the following sticky ends:

    5' ...      GATC... 3'
    3' ...CTAG      ... 5'

The ligation of an MboI fragment to a BamHI sticky end will only occasionally create a sequence that can be digested by BamHI. It depends on the exact base sequence at the ends of the MboI fragment. The 'X' in the sequence below indicates this ambiguity. In all cases the following sequence will be found; the sequences from the inserted MboI fragment are enclosed in brackets:

    5' ...G[GATCX........X]GATCC... 3'
    3' ...CCTAG[X........XCTAG]G... 5'

a. 100% of the junctions can be digested with MboI.
b. A junction that can be digested with BamHI must have a C at the 3' end of the MboI recognition sequence. This would occur 1/4 or 25% of the time.
c. None of the junctions will be cleavable by XorII.
d. The first five bases fit the recognition site for EcoRII. The final position must be a pyrimidine (C or T). There is a 1/2 chance that the junction will contain an EcoRII site.
e. For the restriction site to be a BamHI site in the human genome it must have had a G at the 5' end. This G was in the vector sequence in the clones created. The chance that the 5' end was NOT a G = 3/4.

9-10.
a. The genomic library is based on the most inclusive and complex starting material, so it would consist of the greatest number of different clones.
b. All of these libraries would overlap
each other to some extent. The genomic library contains all the DNA sequences, while the other libraries are made up of subsets of the genomic sequences. All cells express a common subset of genes (housekeeping genes). These genes would result in some overlap of clones, although the cDNA libraries will each contain some unique sequences. Although introns often have repeated DNA, the transcribed and translated portions of sequences are usually unique, so the library of unique genomic sequences will overlap with the cDNA libraries as well.
c. The genomic libraries are created from the total chromosomal DNA inside the cell. The repetitive sequences in the genomic DNA would have to be removed to create the unique DNA library. The cDNA libraries all start with the mRNA present in the cells and therefore represent the expressed genes in these cells. Since genomic DNA libraries are created from all of the DNA in the cell, genomic DNA libraries from either the liver or the brain should be identical. However, cDNA libraries from the liver and the brain should have some clones that are identical between them, but they should also have clones that are entirely unique to each one, as well as clones that are derived from the same genes but represent splice variants.

9-11.
a. You need 4-5 genome equivalents to reach a 95% confidence level that you will find a particular unique DNA sequence.
b. The number of clones needed depends on the total size of the genome of your research organism and the average insert size in the vector. BAC inserts can be 500 kb, while plasmid vectors normally have inserts smaller than 15 kb. Divide the number of base pairs in the genome by the average insert size, then multiply by five to get the number of clones in five genome equivalents.

9-12.
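The rule stated in Problem 9-11b (genome size divided by average insert size, times five) is easy to put into numbers. The sketch below is illustrative only: the function name is invented here, the genome size is the 3 x 10^9 bp figure used in Problem 9-2, and the insert sizes are the 15 kb plasmid limit and the 300 kb BAC figure quoted in this chapter.

```python
import math


def clones_for_coverage(genome_bp, avg_insert_bp, genome_equivalents=5):
    """Number of clones needed to hold the requested number of genome equivalents."""
    return math.ceil(genome_equivalents * genome_bp / avg_insert_bp)


HUMAN_GENOME_BP = 3_000_000_000  # about 3 x 10^9 bp (figure used in Problem 9-2)

plasmid_clones = clones_for_coverage(HUMAN_GENOME_BP, 15_000)   # 15 kb plasmid inserts
bac_clones = clones_for_coverage(HUMAN_GENOME_BP, 300_000)      # 300 kb BAC inserts

print(plasmid_clones)  # 1000000
print(bac_clones)      # 50000
```

The twenty-fold difference in library size is one practical reason to pick a vector with a large insert capacity when the whole genome must be covered.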
a. An intact copy of the whole gene would be on a fragment larger than 140 kbp and would therefore have to be cloned into a BAC vector.
b. The entire coding sequence of 0.387 kbp could be cloned into a plasmid vector (inserts of less than 15 kbp) as a cDNA copy of the gene.
c. Exons are usually small enough to clone into a plasmid vector (inserts of less than 15 kbp).

9-13. When the vector (pWR590) is digested with EcoRI you get one 2.4 kb fragment. When the vector is digested with MboI there are 3 fragments: 0.3, 0.5, and 1.6 kb. The somatostatin insert was cloned into the vector at the EcoRI site. There is also an EcoRI site very near one end of the insert DNA. Therefore, after digestion of the recombinant plasmid with EcoRI, a small EcoRI insert fragment of 49 bp and the vector fragment of 2.4 kb will be generated. Next, consider the MboI restriction pattern. The insert fragment contains an MboI site 5 bp from one end. The insert fragment could ligate into the vector in either of 2 possible orientations. The fragment sizes below imply that the EcoRI cloning site lies within the 1.6 kb MboI vector fragment, dividing it into 700 bp and 900 bp segments. In one orientation the MboI site in the insert is nearest the 700 bp segment, so digestion with MboI produces fragments of 705, 300, 500, and 944 bp (the last formed from the 900 bp vector segment plus the rest of the insert). In the other orientation, the MboI digest produces 905, 500, 300, and 744 bp fragments.

9-14. Draw the recombinant plasmid to help you determine the fragment sizes before sketching the gel.

9-15.
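For Problem 9-13 above, the two MboI patterns can be reproduced with a few lines of arithmetic. This is a sketch under stated assumptions: the 700 bp and 900 bp vector segments are taken from the fragment sizes quoted in the answer, and the constant and function names are invented for illustration.

```python
INSERT_BP = 49                # EcoRI insert fragment
SITE_OFFSET_BP = 5            # the MboI site lies 5 bp from one end of the insert
SEGMENT_A_BP = 700            # assumed: vector segment on one side of the EcoRI site
SEGMENT_B_BP = 900            # assumed: vector segment on the other side
UNCHANGED_BP = [300, 500]     # MboI vector fragments that do not contain the EcoRI site


def mboi_digest(site_near_segment_a):
    """Fragment sizes (bp, largest first) for one orientation of the insert."""
    short_arm = SITE_OFFSET_BP
    long_arm = INSERT_BP - SITE_OFFSET_BP
    if site_near_segment_a:
        junctions = [SEGMENT_A_BP + short_arm, SEGMENT_B_BP + long_arm]
    else:
        junctions = [SEGMENT_A_BP + long_arm, SEGMENT_B_BP + short_arm]
    return sorted(junctions + UNCHANGED_BP, reverse=True)


print(mboi_digest(True))   # [944, 705, 500, 300]
print(mboi_digest(False))  # [905, 744, 500, 300]
```

Both orientations add up to 2,449 bp (the 2.4 kb vector plus the 49 bp insert), which is a quick way to confirm that no fragment has been lost from the map.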
a. The goal of a ligation is to generate clones which have attached one piece of frog DNA to one vector molecule. A ligation mixture consists of linear double-stranded vector DNA with complementary EcoRI sticky ends (Figure 9.2b and Figure 9.6) at both ends and linear double-stranded frog DNA with complementary EcoRI sticky ends at both ends. Ligase simply joins a 3'OH (hydroxyl) group to a 5'P (phosphate). There are three different products that will occur in a ligation mix: (i) the desired ligation is vector/frog (intermolecular ligation); (ii) ligase will also join vector/vector (intramolecular ligation, which yields reconstituted vector molecules with no inserts); and (iii) frog/frog (insert-to-insert ligation, giving chains of insert DNA with no vector).

In order to encourage the desired result you add more vector than insert; the vector DNA is easier to come by. This decreases the likelihood of chains of the insert DNA and increases the probability that any vector molecule that is ligated to an insert is only ligated to one insert molecule. However, adding more vector increases the likelihood of reconstituted vector with NO inserts. To decrease the amount of reconstituted vector you treat the linear, digested vector with alkaline phosphatase. Alkaline phosphatase removes the 5'-phosphate groups on the linear DNA molecule (see * below). Remember that this represents the digested vector, so the DNA strands are contiguous except for the boxed area. This continuity is represented by the dashes at the ends of the lines. The boxed area represents the sticky ends created by EcoRI.

[Figure: the linear, EcoRI-digested vector. Each sticky end carries a 3'OH and a 5'P; the 5'P groups are marked with * as the groups removed by alkaline phosphatase.]

After the treatment with alkaline phosphatase, ligase cannot join a hydroxyl group to the dephosphorylated 5' ends. Therefore the 2
ends of the vector cannot be ligated to each other and this treated molecule will remain linear. If insert DNA is added, then the ligase will join the 3'OH on the vector with the 5'P on the insert. In effect this will ligate the left end of the top strand of the vector shown above to the insert. The left end of the bottom strand cannot be ligated to the insert, leaving a nick in the bottom strand at this point. On the right end the bottom strand ligates to the insert and the top strand at the right end cannot ligate, leaving another nick. The ligation mix is then transformed into Escherichia coli. These nicks in the phosphate backbone of the cloned DNA are repaired after the ligated DNA enters the cells.

Plasmid vectors are constructed so that they contain the lacZ gene with a restriction site right in the middle of the gene. If the vector recircularizes without inclusion of an insert, the lacZ gene will remain uninterrupted; if an insert has been cloned into the vector, the lacZ gene will be interrupted. The ligation mix is transformed into E. coli cells such that about one cell out of 1,000 cells takes up a plasmid. The transformed cells are plated on media containing ampicillin. Only the cells with a plasmid will grow, thus removing the ligation products that consist only of inserts. The media also contains X-Gal. This is a substrate for the β-galactosidase protein that is coded for by the lacZ gene. The β-galactosidase enzyme cleaves X-Gal and produces a molecule that turns the cell blue. Those cells that took up an intact, recircularized vector with no insert will produce β-galactosidase and form blue colonies. The bacterial cells that took up a vector + insert (clone) will not be able to produce functional β-galactosidase and will form white colonies. In the ligation with the untreated vector (5' phosphates intact), the vector recircularizes at a high frequency, leading to 99/100 blue colonies.
The dephosphorylated vector formed 99/100 white colonies, showing that almost all of the vectors had an insert.
b. Yes, the suggestion was a good one. The dephosphorylation of the vector increased the number of clones (vector + insert) 100-fold.
c. The choice of whether to dephosphorylate the vector versus the insert DNA is based on an understanding of the mechanics of the bacterial transformation that is carried out after the ligation. If the vector is dephosphorylated it cannot self-ligate. The insert can self-ligate. The self-ligated inserts do not have any vector DNA, so they do not have a bacterial origin of replication (ORI), nor do they have a gene encoding antibiotic resistance. Therefore, these recircularized DNAs will not allow the transformed bacteria to grow on the selective media. If the insert were dephosphorylated, it would not self-ligate, but the vector WOULD self-ligate. The vector has the antibiotic resistance gene and ORI, so the "empty" vector would be propagated in E. coli, generating a high level of "background."

Section 9.3 Hybridization

9-16.
a. (1) 3.1, 6.9 kb; (2) 4.3, 4.0, 1.7 kb; (3) 1.5, 0.6, 1.0, 6.9 kb; (4) 4.3, 2.1, 1.9, 1.7 kb; (5) 3.1, 1.2, 4.0, 1.7 kb.
b. The 6.9 kb fragment in the EcoRI + HindIII digest, the 2.1 and 1.9 kb fragments in the BamHI + PstI digest, and the 4.0 kb fragment in the EcoRI + BamHI digest will hybridize with the 4.0 kb probe.

9-17.
a. The fragment sizes are too large to be resolved appropriately on a polyacrylamide gel, necessitating electrophoresis on an agarose gel.
b. Digestion of human genomic DNA with these enzymes will result in hundreds of thousands of fragments. The sizes of these fragments will range from tens of thousands of base pairs to only a few base pairs in length. Agarose gel electrophoresis is not able to resolve fragments that differ from each other by a few base pairs, and so the digested DNA will appear as a smear.
c. The probe that is used does not hybridize to all of the restriction fragments that are generated by the different digests.
d. (The answer to this part is a drawn map, which is not reproduced in this copy.)
e. No, an orientation cannot be established from the information given.

9-18.
Probes need to be at least 15 nucleotides long to anneal effectively to DNA. In this experiment short probes are desirable, because the longer the probe the greater the degeneracy. Thus, this type of experiment is usually done with probes between about 15 and 18 nucleotides long. The design of degenerate probes is based on reverse translation, and there are a few considerations to keep in mind:

(i) If you know the amino acid sequence of the protein in one species, then you can make some guesses about the amino acid sequence of the corresponding protein in the second species. You hope that the amino acid sequence of a particular, small region of the protein will be identical in the two species. Since there are 20 different amino acids, even one amino acid difference would make it hard to design a probe. If you knew the sequence of the protein from several bacterial species you could choose a very highly conserved region on which to base a probe. If the amino acids are identical in several different species, then they might be identical in Beneckea nigripulchritudo.

[Figure residue, apparently the restriction map drawn for an earlier problem: sites labeled E, E, H, H, K, K with intervals of 4.0, 1.0, 0.5, 1.5, and 1.0.]

(ii) If you don't know anything about the amino acid sequence of the protein in other species of bacteria, then you would find a region of 5 or 6 contiguous amino acids with low degeneracy, that is, amino acids that are encoded by the lowest possible number of codons. The best choices are Met and Trp, which are each encoded by only a single codon. Unfortunately, it is highly unlikely that a region of 5 or 6 amino acids would be composed solely of Met and Trp. The next best choices are Phe, Tyr, Cys, His, Gln, Asn, Lys, Asp, or Glu, which are each coded for by 2 codons. The worst choices would be Leu, Arg, and Ser (6 codons). If you had a 5 amino acid region composed only of these three amino acids, then the number of different molecules in the degenerate probe would be $6^5 = 7776$.

9-19.
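The degeneracy arithmetic in Problem 9-18 is just a product of codon counts, one factor per amino acid. The sketch below uses the codon counts of the standard genetic code; the dictionary and function names are invented for illustration, and the example peptides are hypothetical.

```python
from math import prod

# Number of codons per amino acid in the standard genetic code (61 sense codons in total).
CODONS_PER_AA = {
    "Met": 1, "Trp": 1,
    "Phe": 2, "Tyr": 2, "Cys": 2, "His": 2, "Gln": 2,
    "Asn": 2, "Lys": 2, "Asp": 2, "Glu": 2,
    "Ile": 3,
    "Val": 4, "Pro": 4, "Thr": 4, "Ala": 4, "Gly": 4,
    "Leu": 6, "Arg": 6, "Ser": 6,
}


def probe_degeneracy(peptide):
    """Number of distinct oligonucleotides needed to cover every way of encoding the peptide."""
    return prod(CODONS_PER_AA[aa] for aa in peptide)


worst_case = probe_degeneracy(["Leu", "Arg", "Ser", "Leu", "Arg"])  # hypothetical peptide
good_case = probe_degeneracy(["Met", "Trp", "Phe", "Tyr", "Cys"])   # hypothetical peptide

print(worst_case)  # 7776
print(good_case)   # 8
```

Scanning a protein sequence with a sliding window of 5 or 6 residues and keeping the window with the smallest product is one way to find the low-degeneracy region the answer recommends.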
c, j, f (although f could be performed before c and j). These steps must be performed before the rest. The order for the rest of the steps is d, a, k, l, g, b, e, h.

Section 9.4 PCR

9-20.
a. The human genome sequence shows the sequence of the normal allele of the PKU gene. You wish to know whether the PKU syndrome in this patient is caused by a mutation in the phenylalanine hydroxylase gene. You suspect that there might be such a mutation in this particular exon, so you will sequence the PCR product. If there is a mutation in this 1 kb exon, you want to know exactly what it is, how it affects the enzyme, and perhaps something about the history of this mutation in human populations. For example, if you compare the sequence in many patients and track where the patients are from, you might get an idea of where this mutation arose in time and geographical space. If you do not find a mutation in this 1 kb exon that changes the amino acid sequence of the enzyme, there might still be a mutation in a different exon.
b. One haploid human genome contains $3 \times 10^9$ bp. Therefore ($3 \times 10^9$ bp/haploid genome) × ($6.6 \times 10^2$ g/mole) × (mole/$6.02 \times 10^{23}$ bp) = $3.3 \times 10^{-12}$ g/haploid genome. In other words, one haploid genome weighs $3.3 \times 10^{-12}$ g or 3.3 picograms. Each haploid genome will contain only one phenylalanine hydroxylase gene to be used as the template for the PCR reaction. You start the PCR reaction with 1 ng ($1 \times 10^{-9}$ g) of human DNA. Therefore ($1 \times 10^{-9}$ g DNA) × (1 haploid genome/$3.3 \times 10^{-12}$ g) × (1 template molecule/1 haploid genome) = $0.3 \times 10^3$ template molecules = 300 template molecules in 1 ng of DNA.
c. You begin the PCR with 300 template molecules. If the PCR runs for 25 cycles, then this number of molecules doubles 25 times. Therefore you will end up with 300 molecules × $2^{25}$ = $10^{10}$ or about 10 billion molecules. This result explains the power of PCR: you started with only 300 template molecules and end up with 10 billion copies of the region you are amplifying. In practice the yields are not quite as high because not all potential template molecules get amplified each cycle. However, the amplification is still substantial. The PCR product is 1 kb long, so ($10^{10}$ molecules of PCR product) × ($10^3$ bp/molecule of PCR product) × (mole/$6.02 \times 10^{23}$ bp) × ($6.6 \times 10^2$ g/mole) = $1.1 \times 10^{-8}$ g = 11 ng. You started with 1 ng of the whole genome and ended up with about 11 ng of a 1 kb section of the genome after the PCR.

9-21. Primers have to be written 5' to 3' and have the 3' end toward the center so DNA polymerase can extend into the sequence being amplified. Only set b satisfies these criteria.

9-22.
a. Both of the primers in set b in problem 9-21 are 18 nucleotides long. If (i) human DNA is assumed to be a random sequence of equal proportions of A, G, C, and T (this is not entirely accurate, but it is close enough for this discussion), and (ii) no mismatches are allowed between the primer and the genomic template (again, this is not entirely accurate, as seen in parts b and c below, but it is close enough), then the chance that one of the two primers will anneal to a random region of DNA that is not the targeted CFTR exon would be $(1/4)^{18}$, or about 1 chance in $7 \times 10^{10}$. In other words, an 18 base sequence will be present once in every 70 billion nucleotides. Since the human genome is 3 billion nucleotides long, it is extremely unlikely that even one of the primers will anneal anywhere other than the desired target. The probability is much lower that both of the primers will anneal to other stretches of DNA that happen to be close enough together to allow the formation of a PCR product. This latter number is hard to calculate exactly because of the variation in the possible distance between the primers.
b. (i) The lower limit on the size of the primers is governed by two main factors. First, the PCR
amplification must be specific, so the primers should be long enough to guarantee this specificity. As in part a, the probability of a 16 base sequence in random DNA is (1/4)^16, or 1 chance out of 4 × 10^9. Therefore, two 16 base primers allow a comfortable margin for specificity. More importantly, the primers must anneal to the genomic DNA to be amplified. As discussed in Chapter 9, hydrogen bonding between 15 or 16 nucleotides of contiguous base pairs is required to allow DNA to remain double stranded.
(ii) If the primers are too long, several potential problems arise. First, the longer the primers, the more expensive they are to synthesize. Second, the longer the primers, the more likely they are to anneal with each other, or for a single primer to anneal to itself and form a hairpin loop, and the less likely the primers are to anneal with the template. Third, and most importantly, if the primer is too long it can hybridize with DNA with which it is not perfectly matched. Internal mismatches are tolerated and hybridization can occur as long as there are enough surrounding base paired nucleotides, especially at the 3' end of the primer. Thus, longer primers might anneal to regions of the genome other than the region you actually want to amplify.
c. You would be more likely to obtain a PCR product if the mismatch were at the 5'-end. The 3'-end of a primer is its business end: that is where DNA polymerase adds additional nucleotides to the chain. Mismatches at the 3'-end would prevent DNA polymerase from adding any new nucleotides to the chain. (You might remember that some DNA polymerases have a 3'-to-5' exonuclease that could potentially remove the mismatch, now allowing further polymerization. This is true of E. coli DNA polymerase, but many of the DNA polymerases used in PCR
come from thermophilic bacteria and these DNA polymerases do not have this exonuclease activity.) A mismatch at the 5'-end of the primer does not matter as long as there is enough base-pairing between the primer and genomic template to allow annealing.

9-23.
a. The EcoRI and the SalI restriction sites are both found in the pMore vector sequence shown in the problem. The EcoRI site is nearer the 5' end and the SalI site is nearer the 3' end of the pMore sequence shown. This region of pMore is at the C-terminal end of the maltose binding protein (MBP). Therefore your cloning will insert the CFTR DNA sequence into the DNA sequence that codes for the C-terminal end of the MBP protein. In other words, the N-terminus of the fusion protein contains most of the MBP protein sequence. The MBP sequence ends at the 8th amino acid from the C-terminus of MBP, where the EcoRI site cuts the MBP DNA. The next part of the fusion protein contains the CFTR protein encoded by the PCR product. Note that the PCR amplifies the last protein coding exon of the CFTR gene. Therefore the C-terminal end of the fusion protein will contain the C-terminal end of CFTR. Remember that the N-to-C orientation of the CFTR protein must be the same as that of the fusion protein as a whole. Further details of the fusion protein will be discussed in part c below.
b. When you use two different restriction enzymes, the CFTR gene can only be inserted into the vector with the desired orientation, yielding the fusion protein you described in part a. Thus the N-to-C orientation of the CFTR protein will be the same as the MBP protein. If the vector was only cut with EcoRI and the PCR product had EcoRI sites at both ends, then the PCR
product could be inserted into the vector in two equally likely orientations, only one of which is the one you desire. A second advantage is that cutting with two enzymes minimizes unwanted products of the ligation in which ends of the same molecule come together (see problem 9-17 a and b).
c. There are many things to take into consideration here. First, you can use the set b PCR primers from problem 9-21 in order to amplify the entire CFTR exon. Second, the CFTR exon does not have sites for EcoRI and SalI, so you need to add nucleotides to the 5'-ends of the two primers that will contain appropriate sites for the two restriction enzymes. These sites cannot be exactly at the 5'-ends of the PCR primers; you must also add 5 more nucleotides beyond the restriction sites to enable the restriction enzymes to bind to their recognition sequences and digest the DNA. The sequence of these 5 nucleotides is not important. Third, the two parts of the fusion protein must end up being in frame. Because the PCR product encodes the C terminus of the fusion protein, there are fewer constraints on the identity of the additional nucleotides added to the second (backwards) primer. The answer below is just one of many possible solutions.

The sequence of the critical part of the pMore vector is reproduced here. The dots at the left and right ends of this sequence represent the continuity of the DNA; this was a circular plasmid before the digestion.

5'AGGATTTCAGAATTCGGATCCTCTAGAGTCGACCTGTAGGGCAA3'
3'TCCTAAAGTCTTAAGCCTAGGAGATCTCAGCTGGACATCCCGTT5'

The vector is digested with EcoRI and SalI to generate these sticky ends:

  ArgIleSerGluPh
5'AGGATTTCAG                    TCGACCTGTAGGGCAA3'
3'TCCTAAAGTCTTAA                    GGACATCCCGTT5'

The PCR product using the set b primers (problem 9-21) is shown below. Remember that this PCR product contains the last protein coding exon of the CFTR
gene. The left hand primer only has one open reading frame, with the amino acid sequence shown below. The right hand primer contains the DNA sequence coding for the last four amino acids at the C-terminal end of the CFTR protein, as shown in the problem. The stop codon (STP) is the TAG that follows the final Met codon. Therefore the amino acids are:

     LeuArgSerGluPheSerGlu...TrpAlaIleMet
5' GGCTAAGATCTGAATTTTCCGAGTTGGGCAATAATGTAGCGC 3'
3' CCGATTCTAGACTTAAAAGGCTCAACCCGTTATTACATCGCG 5'

Now you need to add an EcoRI site (GAATTC) to the 5' end of the left primer and a SalI site (GTCGAC) to the 5' end of the right primer. These sites cannot be directly at the ends of the DNA sequence, so you need 5 random nucleotides added to each of the primers. Furthermore, you must maintain the continuity of the ORF (open reading frame) between the MBP and the CFTR proteins after the vector and insert are digested and ligated. Therefore two more nucleotides (note the two G:C pairs) were added to the left primer between the restriction site and the beginning of the CFTR ORF. Also, the region between the vector and the insert cannot have any in-frame stop codons. The PCR product using these primers is:

                LeuArgSerGluPheSerGlu...TrpAlaIleMet
5' CCCCCGAATTCGGGCTAAGATCTGAATTTTCCGAGTTGGGCAATAATGTAGCGCGTCGACCCCCC 3'
3' GGGGGCTTAAGCCCGATTCTAGACTTAAAAGGCTCAACCCGTTATTACATCGCGCAGCTGGGGGG 5'

Upon digestion of the PCR product with EcoRI and SalI, you will get:

           LeuArgSerGluPheSerGlu...TrpAlaIleMet
5' AATTCGGGCTAAGATCTGAATTTTCCGAGTTGGGCAATAATGTAGCGCG 3'
3'     GCCCGATTCTAGACTTAAAAGGCTCAACCCGTTATTACATCGCGCAGCT 5'

Now you can ligate the vector and the PCR product, yielding:

  ArgIleSerGluPheGlyLeuArgSerGluPheSerGlu...TrpAlaIleMetSTP
5'AGGATTTCAGAATTCGGGCTAAGATCTGAATTTTCCGAG...TTGGGCAATAATGTAGCGCGTCGACCTGTAGGGCAA3'
3'TCCTAAAGTCTTAAGCCCGATTCTAGACTTAAAAGGCTC...AACCCGTTATTACATCGCGCAGCTGGACATCCCGTT5'

The Gly is the result of the adjustment to the PCR
primer to ensure that the N-terminal part of the CFTR region was in frame with MBP. So in summary, the two PCR primers needed are:

5' CCCCCGAATTCGGGCTAAGATCTGAATTTTC 3' and
3' ACCCGTTATTACATCGCGCAGCTGGGGGG 5'

Again, there are many possible answers that have minor variations, but you must still go through all of these steps to make sure your PCR primers will work properly.
d. The fusion protein contains almost all of MBP, so it should also bind to the amylose resin. The cloning described in part b removes only the last 7 amino acids from MBP. Make extracts of bacterial cells expressing the fusion protein and add these extracts to amylose resin. The fusion protein should stick to the resin while all the other bacterial proteins in the extract should not. You can wash the other bacterial proteins away, leaving the fusion protein bound to the resin. To get the fusion protein off the resin you can add the sugar maltose. Maltose and amylose will compete for binding sites on the fusion protein. If maltose is in excess then it will "disconnect" the fusion protein from the resin, leaving a solution with purified fusion protein.

Section 9.5 DNA Sequence Analysis

9-24. In well studied organisms such as C. elegans, D.
melanogaster, yeast and mice, the entire DNA sequence of the genomes is now available. All you need to do in order to study any region in these genomes is to design PCR primers based on the genomic sequence that will amplify the region of interest. If necessary you can then determine the DNA sequence of the amplified region using automated methods. You might do this, for example, if you wanted to know if an individual's gene carried a mutation. These techniques require much less effort on the part of the investigator. Thus having the genome sequence of an organism increases the importance of PCR. Restriction mapping is becoming a rarity even when studying unusual organisms: if you have cloned a gene from your organism you can sequence the DNA. Once you know the DNA sequence you can automatically find the location of the sites for all known restriction enzymes. However, you still need to use restriction enzymes to construct libraries and specific recombinant DNA molecules. Restriction digestions remain the basis for many important applications of DNA cloning and also for understanding in the next chapter how scientists were actually able to determine the DNA sequences of entire genomes.

9-25. Notice how many of these processes require the use of DNA polymerase, underlining why it is so important to learn how this enzyme works.
a. Enzyme-based: DNA ligase
b. Enzyme-based: restriction enzymes
c. Non-enzymatic: hybridization relies on complementary base pairing
d. Enzyme-based: DNA polymerase
e. Enzyme-based: reverse transcriptase for the first strand of cDNA and DNA polymerase for the complementary strand
f. Enzyme-based: DNA polymerases from thermophilic bacteria. E. coli DNA polymerase would not be very effective for PCR because at each cycle heat is applied to denature the DNA, and this heat would inactivate the E. coli enzyme. This is not true of DNA polymerases from bacteria that live in high temperature conditions.

9-26.
a. The newly synthesized strand is read from the gel beginning with the smallest band, which corresponds to the 5' end of this strand. This newly synthesized strand is complementary to the template strand. Reading the sequence from the gel:
newly synthesized strand: 5' TAGCTAGGCTAGCCCTTTATCG 3'
template strand:          3' ATCGATCCGATCGGGAAATAGC 5'
b. The sequencing template is the mRNA-like strand, so the sequence of the mRNA is: 5' CGAUAAAGGGCUAGCCUAGCUA 3'.
c. Any mRNA has 3 possible reading frames, which begin at the 5' end with the first nucleotide, the second nucleotide and the third nucleotide. There are stop codons in each frame (there are no open reading frames or ORFs), so it is unlikely that this is an exon sequence of a coding region.

9-27.
a. Synthesis occurs in the 5' to 3' direction, so the smallest fragment would contain the 5' T added to the primer and the next sized product would incorporate the C.
b. First write out the sequence of both strands and scan each strand for stop codons. The newly synthesized strand has stop codons in all three frames (TAA in the first frame, TGA in the second and TAG in the third) and therefore would not be the coding (exon) sequence. On the DNA sequencing template strand, the reading frame that starts with the first nucleotide does not contain a stop codon and therefore is the ORF in this RNA-like strand.
Synthesized strand:      5' TCTAGCCTGAACTAATGC 3'
DNA sequencing template: 3' AGATCGGACTTGATTACG 5'
c. The peptide sequence begins with the amino terminal end, which corresponds to the 5' end of the mRNA-like DNA sequence (the DNA sequencing template): N-Ala-Leu-Val-Gln-Ala-Arg

9-28.
a. In Figure 9.14a, you can see that the fragments of DNA get successively larger by adding nucleotides onto the 3'-end. DNA polymerase synthesizes growing strands in the 5'-to-3' direction. The trace shows a portion of a synthesized single stranded DNA. The green peak at the left end of the trace means that there is a fragment of DNA of a specific length (see part c) that was terminated when a dideoxy-A (ddA) was incorporated into the DNA strand being synthesized. This terminal ddA, which is linked to a green fluorescent label, therefore becomes the 3' end of this molecule.
b. 5'...ACCTATTTTACAGGAATT...3'
c. "Residue Position" indicates a peak at a specific location in the scan. Most probably, nucleotide position 1 corresponds to the first nucleotide at the 5'-end of the newly synthesized fragments. You should note that all of the fragments will start at their 5'-end with the same short oligonucleotide primer, since DNA polymerase requires a primer. Thus, nucleotide position 1 is also the 5'-end of the primer used to generate the nested array of fragments. Therefore the size of the single-stranded DNA fragment is represented by the residue position.
d. There are two different peaks showing up at the same position. One is a T, the other is a G. The double peak at position 370 is most likely caused by the fact that the original DNA actually had two different DNA sequences. This pattern would be seen if the person whose DNA was amplified was actually a heterozygote, with one chromosome carrying a T-A base pair at this location while the homologue had a G-C base pair. This is in fact the way that PCR amplification and DNA sequencing can be used together to look for heterozygosity anywhere in the genome. Of course, this result could also be due to an error either in DNA sequencing or in PCR amplification.

Section 9.6 Bioinformatics: Information Technology and Genomes

9-29.
a. It indicates that there are regions of the chromosome where genes are clustered.
b. The largest gene desert is from approximately 58000000 to 62000000.
c. The centromere corresponds to the largest gene desert.
d. The CFTR gene is on the long arm of the chromosome.
e. The CFTR gene is transcribed in the direction of the green arrow, which is pointing away from the centromere.
f. There are approximately 24 exons in the CFTR gene. This is an approximation, as the exons are predicted by computer analysis and not by a comparison to actual protein sequence.

9-30. The simplest method to try to determine potential proteins in this organism is to compare the sequences to organisms that have also had their genomes sequenced. Those sequences that are most highly conserved would be expected to be open reading frames from genes. To determine alternative splicing in various tissues, the cDNA sequences from those tissues can be compared to each other and to the genomic sequences.
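
The reading-frame checks used in problems 9-26 and 9-27, and the search for open reading frames mentioned in 9-30, can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not a method from the chapter: the function names are hypothetical, only the three standard stop codons (TAA, TAG, TGA) are considered, and the example input is the newly synthesized strand from problem 9-27.

```python
# Minimal sketch: scan the three forward reading frames of a DNA strand
# for stop codons. Function names are hypothetical, chosen for illustration.

STOP_CODONS = {"TAA", "TAG", "TGA"}          # standard genetic code only
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the complementary strand, written 5' to 3'."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def stop_codons_by_frame(seq):
    """Map each frame offset (0, 1, 2) to the start positions of its stop codons."""
    seq = seq.upper()
    stops = {}
    for frame in range(3):
        stops[frame] = [
            i for i in range(frame, len(seq) - 2, 3)
            if seq[i:i + 3] in STOP_CODONS
        ]
    return stops


def open_frames(seq):
    """Return the frame offsets that contain no stop codon at all."""
    return [frame for frame, hits in stop_codons_by_frame(seq).items() if not hits]


# Newly synthesized strand from problem 9-27, written 5' to 3'.
synthesized = "TCTAGCCTGAACTAATGC"
# The sequencing template is its complement; read it 5' to 3' as well.
template = reverse_complement(synthesized)

print(stop_codons_by_frame(synthesized))  # every frame has a stop codon
print(open_frames(synthesized))           # []
print(template, open_frames(template))    # GCATTAGTTCAGGCTAGA [0]
```

Frame offset 0 corresponds to "the reading frame that starts with the first nucleotide" in the answer to 9-27b. A real gene-finding program does considerably more (start codons, splice sites, conservation between species), so this check only rules frames out; it does not prove that an open frame is a gene.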