
Sequencing the present, building the future.

Javier Aguirre Rivera, Uppsala Universitet, March 2014

In the early 90s, science fiction made some predictions that are now clearly becoming part of day-to-day science. Sequencing the genomes of samples obtained from fossils was presented by sci-fi as part of a successful method to bring back extinct species, years before the Human Genome Project was completed. Although the DNA of a mammoth has already been sequenced (Poinar et al., 2006), it is still difficult for paleogeneticists to work at the same level with older samples, dinosaurs for example, because even DNA preserved in permafrost survives for only about one million years. At the rate sequencing is moving forward, new applications are clearly in sight, such as personalized medicine, real-time monitoring of mutations, and purely synthetic organisms capable of producing or degrading any needed chemical, among many other applications that are likely to improve the quality of human life to a degree not yet imagined.

Genome Sequencing

The sequence of any genome, which codes for the expression of proteins and other regulatory elements, is composed of Adenine, Cytosine, Guanine, and Thymine (A, C, G, and T). The order of these four bases within the genome determines the element to be expressed. It is therefore of great interest to unravel the base sequence of a genome in order to carry out further studies and try to understand the biology of living organisms. There are currently two approaches, which come from a convergent evolution of distinct methodologies that have improved in efficiency, accuracy, and economy: Whole Genome Sequencing (WGS) and Deep Exon Sequencing (DES). In deep exon sequencing, a specific region within the genome is selected and sequenced; in most cases this region is an exon. In whole genome sequencing, the objective is to sequence a complete genome. This task is easy for small genomes such as bacterial ones, but more difficult for complex organisms.
Some parameters used to evaluate the obtained sequence are depth (also called coverage), genome length, read length, number of reads, and assay time. Sequencing depth can be seen as the average number of times each base in the original molecule is sequenced (Dunham, 2005), and it can be calculated with the following equation:

$$C = \frac{L \cdot N}{G}$$

Where C represents the coverage, L the read length, N the number of reads, and G the haploid genome length (Illumina Inc., 2011).

Whole genome sequencing vs. deep exon sequencing

With the advent of the well-known Human Genome Project, the scientific community became aware of the importance of knowing the sequence of human DNA. Since its completion, the sequence has become increasingly significant as more studies relate variations in the genome to specific diseases. Analyzing this relationship is not an easy task, because extremely accurate output is required to establish causality for rare single nucleotide variants (SNV). Moreover, clarifying these effects with statistical significance requires a larger number of samples (Tennessen et al., 2012), which can be extremely costly with whole genome sequencing, considering that most procedures still cost around or above USD 1,000 per genome. In this situation other techniques, such as deep sequencing, are used. Deep sequencing has clear advantages, such as its sensitivity in detecting single nucleotide variants within a population, also known as single nucleotide polymorphisms (SNP), and in detecting copy number variable regions (CNVR), owing to the high throughput of the procedures (Kahvejian, Quackenbush, & Thompson, 2008). Furthermore, recent findings show that at a depth of less than 10x there is a 22.6% probability of missing a heterozygous position (Ajay, Parker, Abaan, Fajardo, & Margulies, 2010), another point in favor of deep sequencing. The importance of detecting CNVR and SNP lies in their relationship with the expression of complex phenotypic traits or health conditions (Kahvejian et al., 2008). Deep sequencing has also been used to monitor mutations across exomes in thousands of genes, revealing multiple variations in multiple regions and taking the understanding of oncogenesis to another level (Kahvejian et al., 2008).
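
The coverage relation described above (read length times number of reads, divided by haploid genome length) can be made concrete with a short Python sketch. The function names, the round genome length of 3 × 10^9 bases, and the example read counts are illustrative assumptions, not values taken from the cited sources. The last function is a deliberately simplified binomial model of why low depth makes heterozygous positions easy to miss: it assumes error-free reads, equal sampling of both alleles, and a hypothetical rule that at least two reads must carry the second allele. It is not the method of Ajay et al. (2010) and is not expected to reproduce their 22.6% figure.

```python
from math import ceil, comb


def coverage(read_length: int, num_reads: int, genome_length: int) -> float:
    """Average depth C = L * N / G."""
    if genome_length <= 0:
        raise ValueError("genome_length must be positive")
    return read_length * num_reads / genome_length


def reads_needed(target_coverage: float, read_length: int, genome_length: int) -> int:
    """Smallest number of reads N such that L * N / G >= target_coverage."""
    if read_length <= 0:
        raise ValueError("read_length must be positive")
    return ceil(target_coverage * genome_length / read_length)


def prob_het_missed(depth: int, min_alt_reads: int = 2) -> float:
    """Toy model: chance that fewer than `min_alt_reads` of `depth` reads
    carry the second allele at a heterozygous site.

    Assumes each read samples either allele with probability 0.5 and that
    reads are error-free. `min_alt_reads` is a hypothetical calling rule.
    """
    favourable = sum(comb(depth, k) for k in range(min_alt_reads))
    return favourable / 2 ** depth


if __name__ == "__main__":
    G = 3_000_000_000  # assumed round value for a human-sized haploid genome
    L = 100            # assumed read length in bases
    print("coverage:", coverage(L, 900_000_000, G))
    print("reads for 30x:", reads_needed(30, L, G))
    for d in (4, 10, 30):
        print(f"depth {d}: P(het missed) = {prob_het_missed(d):.4f}")
```

Under these assumptions the miss probability falls quickly as the depth at a site grows, which is the intuition behind preferring deeper sequencing when rare variants must be called reliably. Real variant callers also account for sequencing errors and uneven coverage, so the numbers from this sketch are only indicative.
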
It would therefore be correct to state that deep sequencing is the best option, not as we know it today, with short read lengths and limited genome coverage, but as whole genome deep sequencing. Deep sequencing is currently associated mostly with exome sequencing because of the short read length, but output from the ENCODE project, which strives to understand all the functional elements within the human genome, makes us question the validity of such assays. If 75% of the genome is transcribed during the cell cycle and there are overlapping transcripts coming from both DNA strands, then the gene boundaries we believed we knew so well turn blurry and imprecise (Djebali et al., 2012). It thus becomes more relevant to keep a wider view that allows a more precise analysis of the relationship between all elements of the system (coding elements and non-coding elements such as enhancers, promoters, and other communication elements within the chromosome). Furthermore, it is becoming accepted that the relationships drawn between genetic variation and certain phenotypes or diseases go beyond a linear analysis of the genome, as the network of regulatory elements interacting at the structural level is being unraveled (Sanyal, Lajoie, Jain, & Dekker, 2012). Moreover, the variations in the genetic sequence that are usually related to disease have been identified in non-coding regions of the genome, so their exact mechanism of action remains a mystery (Sanyal et al., 2012).

Figure 1. Graphic representation of some advantages of each approach, with the suggestion that the further development of an optimal technique should include the best features of both methods.

Even with the low price and high resolution of deep sequencing, it would be extremely ambitious and somewhat inaccurate to try to correlate a particular gene with a specific phenotype or disease using the information obtained from these methodologies alone. With deep sequencing we are playing with a single piece on the chessboard, and nobody can say it is the queen. The best option would be to develop a whole genome sequencing technique with all the advantages of deep sequencing, to fulfill the need for fast, cheap, and accurate genetic information covering the full genome. Sequencing is definitely today's trend, and a tool that in due time will allow real-time analysis of all the parameters involved in gene expression and regulation, so that the mechanisms involved can really be understood. Only then can we start to think about personalized medicine or next-level synthetic biology.

References

Ajay, S. S., Parker, S. C. J., Abaan, H. O., Fajardo, K. V. F., & Margulies, E. H. (2010). Accurate and comprehensive sequencing of personal genomes, 1498-1505. doi:10.1101/gr.123638.111
Djebali, S., Davis, C. A., Merkel, A., Dobin, A., Lassmann, T., Mortazavi, A., ... Gingeras, T. R. (2012). Landscape of transcription in human cells. Nature, 489(7414), 101-108. doi:10.1038/nature11233
Dunham, I. (2005). Genome Sequencing, 1-6. doi:10.1038/npg.els.0005378
Illumina Inc. (2011). Estimating Sequencing Coverage. Technical Note.
Kahvejian, A., Quackenbush, J., & Thompson, J. F. (2008). What would you do if you could sequence everything? Nature Biotechnology, 26(10), 1125-1133. doi:10.1038/nbt1494
Poinar, H. N., Schwarz, C., Qi, J., Shapiro, B., Macphee, R. D. E., Buigues, B., ... Schuster, S. C. (2006). Metagenomics to paleogenomics: large-scale sequencing of mammoth DNA. Science (New York, N.Y.), 311(5759), 392-394. doi:10.1126/science.1123360
Sanyal, A., Lajoie, B. R., Jain, G., & Dekker, J. (2012). The long-range interaction landscape of gene promoters. Nature, 489(7414), 109-113. doi:10.1038/nature11279
Tennessen, J. A., Bigham, A. W., O'Connor, T. D., Fu, W., Kenny, E. E., Gravel, S., ... Akey, J. M. (2012). Evolution and functional impact of rare coding variation from deep sequencing of human exomes. Science (New York, N.Y.), 337(6090), 64-69. doi:10.1126/science.1219240
