DNA sequencing

Extended Elective Studies DNA analysis, Proteomics and Metabolomics ASM013 - 2009/10 Dr Giovanna Bermano – g.bermano@rgu.ac.uk - Room A35a

DNA Sequencing
• Important to know the precise order of nucleotides in DNA as:
 - gives clues about function of gene product responsible for disease;
 - allows comparison of gene sequence with sequences of genes in databanks and possible discovery of cause of disease;
 - allows comparison of a newly discovered gene to previously discovered genes in other organisms which may reflect common functions for the protein products.
• Two methods:
 - Enzymatic method of Sanger-Coulson (commonly used)
 - Chemical degradation method of Maxam and Gilbert

DNA Sequencing: Sanger-Coulson method
• ssDNA is used as template.
• Sequencing primer anneals to ssDNA. Primer is complementary to the region near the vector-insert junction.
• Taq or T7 DNA polymerase extends the primer using dNTPs.
• Based on premature termination of DNA synthesis.
• Extension reaction is split into 4; each reaction uses a specific ddNTP (dideoxynucleotide) to terminate the reaction enzymatically.

DNA Sequencing
• The reactions are run in the presence of ddNTPs.
• A ddNTP is just like a regular nucleotide, except it has no 3' hydroxyl group; once it is added to the end of a DNA strand, the strand can no longer be elongated.
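The chain-termination idea can be sketched as a small simulation. This is an illustrative model only: the sequence, the incorporation probability `p_dd` and the helper name `extend` are assumptions, not values from the lecture.

```python
import random

def extend(full_product: str, dd_base: str, p_dd: float, rng: random.Random) -> str:
    """Extend one primer and return the newly synthesised strand.

    full_product is the strand the polymerase would make with dNTPs only.
    """
    made = []
    for base in full_product:
        made.append(base)
        # A ddNTP can only be incorporated where its base is called for.
        if base == dd_base and rng.random() < p_dd:
            break  # no 3' hydroxyl group: the strand cannot be elongated further
    return "".join(made)

rng = random.Random(0)        # fixed seed so the run is repeatable
full_product = "TGCGTCCA"     # hypothetical example sequence
fragments = [extend(full_product, "C", 0.3, rng) for _ in range(1000)]

# Every distinct fragment either ends in C or is the full-length product.
print(sorted(set(fragments), key=len))
```

Because termination is random at each C, a population of molecules yields one fragment length per C in the sequence, which is what makes the later size separation informative.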

DNA Sequencing
The reaction mix includes
• the template DNA,
• free nucleotides,
• an enzyme (usually a variant of Taq polymerase),
• a 'primer': a small piece of single-stranded DNA about 20-30 nt long that can hybridize to one strand of the template DNA.
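As a minimal sketch of what "can hybridize to one strand of the template" means computationally: a primer pairs with the template where the template contains the primer's reverse complement. The template and primer below are made-up illustrative sequences, and `find_annealing_site` is a hypothetical helper, not a library function. Mismatches and melting temperature are ignored here.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence written 5'->3'."""
    return seq.translate(COMPLEMENT)[::-1]

def find_annealing_site(template: str, primer: str) -> int:
    """0-based index on the template where the primer pairs perfectly, or -1."""
    return template.find(reverse_complement(primer))

# Hypothetical 40 nt template and 20 nt primer, both written 5'->3'.
template = "ACGTTGCAAGCTTGGCACTGGCCGTCGTTTTACAACGTCG"
primer = "AAACGACGGCCAGTGCCAAG"

site = find_annealing_site(template, primer)
print(site, template[site:site + len(primer)])
```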


DNA Sequencing
• This diagram shows the results of a sequencing reaction run in the presence of dideoxyCytidine (ddCTP).
• This can be repeated for all four ddNTPs.
• Gel electrophoresis can be used to separate the fragments by size.
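A rough sketch of which fragment sizes each of the four reactions would contribute to the gel. The example strand is an assumption chosen for illustration, lengths are counted from the end of the primer, and `terminated_lengths` is a hypothetical helper name.

```python
def terminated_lengths(new_strand: str, dd_base: str) -> list[int]:
    """Lengths of all fragments that can end in the given ddNTP."""
    return [i + 1 for i, base in enumerate(new_strand) if base == dd_base]

new_strand = "TGCGTCCA"   # hypothetical newly synthesised strand (5'->3')

# One lane per ddNTP reaction; each lane holds the fragment sizes found there.
lanes = {dd: terminated_lengths(new_strand, dd) for dd in "ACGT"}
for dd, lengths in lanes.items():
    print(f"dd{dd}TP lane: fragment lengths {lengths}")
```

Each length from 1 to 8 appears in exactly one lane, so the four lanes together cover every position of the strand.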

DNA Sequencing
• All four reactions can be run in a single tube with “all four” of the ddNTPs (A, G, C and T) present, and with “different” fluorescent colours on each.
• The sequence of the DNA is determined by knowing the colour codes.
• The gel is read from bottom to top: TGCGTCCA-(etc).
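Reading the gel from bottom to top amounts to sorting the bands by fragment size and translating each colour into a base. The sketch below assumes a made-up colour code and a hand-written list of bands; real instruments use their own dye sets.

```python
# Hypothetical colour code; actual dye/base assignments depend on the chemistry used.
COLOUR_TO_BASE = {"green": "A", "blue": "C", "yellow": "G", "red": "T"}

# (fragment length, colour) pairs in arbitrary order, as if detected in one lane.
bands = [
    (3, "blue"), (8, "green"), (1, "red"), (6, "blue"),
    (2, "yellow"), (5, "red"), (7, "blue"), (4, "yellow"),
]

def read_gel(bands, colour_to_base):
    """Read bases from the smallest fragment (bottom of the gel) to the largest."""
    ordered = sorted(bands, key=lambda band: band[0])
    return "".join(colour_to_base[colour] for _, colour in ordered)

print(read_gel(bands, COLOUR_TO_BASE))  # prints TGCGTCCA
```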

DNA Sequencing
• The computer reads the sequence from the gel by scanning the colours in one lane of a gel (one sample), from smallest fragment to largest.
• The computer also produces a text file containing the nucleotide sequence.
• An average of 500 nucleotides are read in each reaction.
• DNA sequencing http://uk.youtube.com/watch?v=Mz4LSfecM4&feature=PlayList&p=BADA17575EBD7A76&playnext=1&index=8


DNA Sequencing: Maxam and Gilbert method
• Based on introduction of strand breaks at specific nucleotides by chemical degradation, followed by high resolution acrylamide gel electrophoresis.
• This method involves base specific cleavages:
 - the base is first modified using specific chemicals;
 - the sugar phosphate backbone of the DNA is then cleaved by piperidine at that site.
• Not commonly used as it is slower than Sanger method.
• Used for sequencing of particular genes whose sequence is GC rich.


Genome Analysis
• To know the sequence of the entire genome of an organism is useful for several reasons and it provides a better understanding of:
 - gene interactions and the regulation of gene expression;
 - protein function and cellular pathways;
 - the evolution of gene/protein families, and hence, organisms;
 - pathogen/host relationships.

Research challenges in genetics
Deriving meaningful knowledge from DNA sequences will require the expertise and creativity of teams of biologists, chemists, engineers, and computational scientists, among others.
What we still do not know:
– Gene number, exact locations, and functions
– Gene regulation
– DNA sequence organization
– Chromosomal structure and organization
– Noncoding DNA types, amount, distribution, information content, and functions
– Coordination of gene expression, protein synthesis, and posttranslational events
– Interaction of proteins in complex molecular machines

– Predicted vs experimentally determined gene function
– Evolutionary conservation among organisms
– Protein conservation (structure and function)
– Proteomes (total protein content and function) in organisms
– Correlation of SNPs (single-base DNA variations among individuals) with health and disease
– Disease-susceptibility prediction based on gene sequence variation
– Genes involved in complex traits and multigene diseases
– Complex systems biology, including microbial consortia useful for environmental restoration
– Developmental genetics, genomics

Genome Analysis

Year    Organism               Size (bp)
1975    Bacteriophage ФX174    5386
1977    SV40                   5243
1978    pBR322                 4363
1981    Human mitochondria     16600
1982    Bacteriophage λ        49000
1996    S. cerevisiae          12500000
1997    E. coli                4600000
2001    Human genome
2002    Mouse genome
2003    Dog genome
2004    Rat genome
2006    Human genome

Size entries for the 2001-2006 rows survive only as unaligned fragments: "3." / "8 x 10^9" / "3 billion" / "3 x 10^9" / "High quality draft" / "Partially sequenced" / "2."

What Does the Human Genome Sequence Tell Us?
By the Numbers
• The human genome contains 3164.7 million chemical nucleotide bases (A, C, T, and G).
• The average gene consists of 3000 bases, but sizes vary greatly.
• The total number of genes is estimated at 30,000, much lower than previous estimates of 80,000 to 140,000.
• Almost all (99.9%) nucleotide bases are exactly the same in all people.
• The functions are unknown for over 50% of discovered genes.
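Two back-of-envelope figures follow from these numbers; the sketch below works them out. The 500-nucleotide read length is the per-reaction average quoted in the sequencing slides. The read count is only a lower bound for a single pass with no overlap between reads, which is an idealising assumption; real projects need overlapping reads.

```python
GENOME_BASES = 3_164_700_000   # 3164.7 million bases
READ_LENGTH = 500              # average nucleotides read per sequencing reaction

# 99.9% of bases are identical between people, so 0.1% (1 in 1000) may differ.
differing_bases = GENOME_BASES // 1000

# Ceiling division: the fewest reads that could cover every base once.
min_reads = -(-GENOME_BASES // READ_LENGTH)

print(f"Bases that may differ between two people: {differing_bases:,}")
print(f"Minimum number of {READ_LENGTH}-nt reads for one pass: {min_reads:,}")
```

This gives roughly 3.2 million variable positions and about 6.3 million reads.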

• Less than 2% of the genome codes for proteins.
• Repeated sequences that do not code for proteins ("junk DNA") make up at least 50% of the human genome.
• Repetitive sequences are thought to have no direct functions, but they shed light on chromosome structure and dynamics.
• During the past 50 million years, a dramatic decrease seems to have occurred in the rate of accumulation of repeats in the human genome.

How the Human Compares with Other Organisms
• Humans have on average three times as many kinds of proteins as the fly or worm because of mRNA transcript "alternative splicing" and chemical modifications to the proteins.
• Humans share most of the same protein families with worms, flies, and plants, but the number of gene family members has expanded in humans.

Variations and Mutations
• Scientists have identified about 1.4 million locations where single-base DNA differences (SNPs) occur in humans.
• This information promises to revolutionize the processes of finding disease-associated sequences.
• The ratio of germline (sperm or egg cell) mutations is 2:1 in males vs females. Researchers point to several reasons for the higher mutation rate in the male germline, including the greater number of cell divisions required for sperm formation than for eggs.
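A minimal sketch of what locating single-base differences looks like, assuming the two sequences are already aligned and of equal length (no insertions or deletions). Both sequences and the helper name `snp_positions` are invented for illustration.

```python
def snp_positions(seq_a: str, seq_b: str) -> list[tuple[int, str, str]]:
    """Return (1-based position, base in A, base in B) for every mismatch."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    return [
        (i + 1, a, b)
        for i, (a, b) in enumerate(zip(seq_a, seq_b))
        if a != b
    ]

# Hypothetical aligned sequences from two individuals.
person_1 = "ATGCGTCCATTGACCGTA"
person_2 = "ATGCGTCCGTTGACCATA"

print(snp_positions(person_1, person_2))  # [(9, 'A', 'G'), (16, 'G', 'A')]
```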

Applications, Future Challenges
• The genome sequence will help in finding genes associated with disease.
• A number of genes have been associated with breast cancer, muscle disease, deafness, and blindness.
• Finding the DNA sequences underlying such common diseases as cardiovascular disease, diabetes, arthritis, and cancers will provide focused targets for the development of effective new therapies.
• It will be possible to study all the genes in a genome, for example, or all the transcripts in a particular tissue or organ.

The Next Step: Functional Genomics
• To use this vast reservoir of data to explore how DNA and proteins work with each other and the environment to create complex, dynamic living systems.
• These explorations will encompass studies in transcriptomics, proteomics, structural genomics, new experimental methodologies, and comparative genomics.
• Transcriptomics involves large-scale analysis of messenger RNAs transcribed from active genes to follow when, where, and under what conditions genes are expressed.

• Proteomics involves the study of protein expression and function.
• Structural genomics involves the generation of the 3-D structures of proteins, offering clues to function and biological targets for drug design.
• Knock out studies will be used to understand the function of DNA sequences and the proteins they encode.
• Comparative genomics involves the analysis of DNA sequence patterns of humans and well-studied model organisms for identifying human genes and interpreting their function.

Human Genome Project Information http://www.ornl.gov/sci/techresources/Human_Genome/faq/seqfacts.shtml#whatis
