CBG.

02: Genomes

ENCODE PROJECT WRITES EULOGY FOR JUNK DNA

- human DNA has 3 billion bases
- instead of the expected 100,000 genes, the initial analysis found about 35,000, and that number has since been whittled down to about 21,000
- 80% of the human genome serves some purpose, biochemically speaking:
  o landing spots for proteins that influence gene activity
  o strands of RNA with myriad roles
  o places where chemical modifications serve to silence stretches of our chromosomes
- a gene’s regulation is far more complex than previously thought, being influenced by multiple stretches of regulatory DNA located both near and far from the gene itself and by strands of RNA not translated into proteins (= noncoding RNA)
- 11,224 DNA stretches are classified as pseudogenes, “dead” genes now known to be active in some cell types or individuals
- there are many “genes” out there in which DNA codes for RNA, not a protein, as the end product
- various RNAs home in on different cell compartments, as if they have fixed addresses where they operate: some go to the nucleus, some to the nucleolus and some to the cytoplasm
- GINGERAS: the fundamental unit of the genome and the basic unit of heredity should be the transcript (the piece of RNA decoded from DNA) and not the gene
- 5% of the human genome is conserved across mammals
- DNA’s bases function in gene regulation through their interactions with transcription factors and other proteins
- 8% of the genome falls within a transcription factor binding site, a percentage that is expected to double once more transcription factors have been tested

ON THE IMMORALITY OF TELEVISION SETS: “FUNCTION” IN THE HUMAN GENOME ACCORDING TO THE EVOLUTION-FREE GOSPEL OF ENCODE (Dan Graur)

- less than 10% of the genome is evolutionarily conserved through purifying selection
- according to ENCODE, a biological function can be maintained indefinitely without selection, which implies that at least 80 - 10 = 70% of the genome is perfectly invulnerable to deleterious mutations, either because no mutation can ever occur in these “functional” regions or because no mutation in these regions can be deleterious
- problems in ENCODE logic:
  o using the seldom-used “causal role” definition of biological function and then applying it inconsistently to different biochemical properties
  o the logical fallacy of “affirming the consequent”
  o failing to appreciate the crucial difference between “junk DNA” and “garbage DNA”
  o using analytical methods that yield biased errors and inflate estimates of functionality
  o favouring statistical sensitivity over specificity
  o emphasizing statistical significance rather than the magnitude of the effect
- in biology, there are 2 main concepts of function:
  o “selected effect”: the function of a trait is the effect for which it was selected, or by which it is maintained
  o “causal role”: ahistorical and non-evolutionary: for a trait Q to have a “causal role” function, G, it is necessary and sufficient that Q performs G
- (ex:) TATAAA is maintained by natural selection to bind a transcription factor; a mutated sequence, resembling this one, also binds the transcription factor, but does not result in transcription (no adaptive or maladaptive consequence); hence, the second sequence has no selected effect function, but its causal role function is to bind a transcription factor
- from an evolutionary viewpoint, a function can be assigned to a DNA sequence if and only if it is possible to destroy it; unless a genomic functionality is actively protected by selection, it will accumulate deleterious mutations and will cease to be functional
- the fact that sometimes it is difficult to identify selection should never be used as a justification to ignore selection altogether in assigning functionality to parts of the human genome
- the surest indicator of the existence of a genomic function is that losing it has some phenotypic consequence for the organism
- functional regions of the genome should evolve more slowly and be more conserved among species than non-functional ones
- Ward and Kellis confirmed that approx. 5% of the genome is interspecifically conserved and that an additional 4% of the human genome is under selection
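The expectation that functional regions evolve more slowly and stay more conserved between species can be illustrated with a minimal sketch that compares percent identity over two aligned regions. The sequences are invented for illustration only, and `percent_identity` is a hypothetical helper, not a method taken from Graur or from Ward and Kellis.

```python
def percent_identity(seq_a: str, seq_b: str) -> float:
    """Return the percentage of aligned positions at which two sequences match."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    if not seq_a:
        raise ValueError("sequences must not be empty")
    matches = sum(1 for a, b in zip(seq_a.upper(), seq_b.upper()) if a == b)
    return 100.0 * matches / len(seq_a)


# Toy aligned regions (assumption: made-up sequences, not real orthologs).
constrained_human = "TATAAAGGCT"
constrained_mouse = "TATAAAGGCA"  # 1 mismatch in 10 sites
neutral_human = "ACGTTGCAAG"
neutral_mouse = "ATGTCGGAAC"      # 4 mismatches in 10 sites

constrained_id = percent_identity(constrained_human, constrained_mouse)
neutral_id = percent_identity(neutral_human, neutral_mouse)
print(f"constrained region: {constrained_id:.0f}% identical")
print(f"neutral region:     {neutral_id:.0f}% identical")
```

In this toy case the region assumed to be under purifying selection keeps a higher identity than the region assumed to evolve neutrally. Real conservation estimates rely on multi-species alignments and explicit neutral models rather than a raw identity count.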
- According to ENCODE:
  o 74.7% of the genome is transcribed
  o 56.1% is associated with modified histones
  o 15.2% is found in open-chromatin areas
  o 8.5% binds transcription factors
  o 4.6% consists of methylated CpG dinucleotides
- transcription is fundamentally a stochastic process
- classes of sequences that are known to be abundantly transcribed, but are typically devoid of function:
  o pseudogenes (up to 1/10 are transcribed; they lack coding potential due to the presence of disruptive mutations, evolve very rapidly and are mostly subject to no functional constraint)
  o introns (some human introns harbour regulatory sequences (TISHKOFF, 2006) as well as sequences that produce small RNA molecules (HIROSE, 2003; ZHOU, 2004))
  o mobile elements
- less than 2% of the histone modifications may have something to do with function
- misconceptions in common objections to “junk DNA”:
  o a lack of knowledge of the original and correct sense of the term
  o the belief that evolution can always get rid of the nonfunctional DNA
  o the belief that “future potential” constitutes “a function”
- in the majority of known bacterial species, selection against excess genome is extremely efficient due to the enormous effective population sizes, and the fact that replication time and, hence, generation time are correlated with genome size
- BRENNER: differentiated between “junk DNA” and “garbage DNA”: the excess DNA in our genome is junk and it is there because it is harmless, as well as being useless, and because the molecular processes generating extra DNA outpace those getting rid of it
- indifferent DNA refers to DNA sites that are functional, but show no evidence of selection against point mutations; deletion of these sites, however, is deleterious, and is subject to purifying selection
- “PURPOSE IS THE ONLY THING EVOLUTION CANNOT PROVIDE”

A SLIGHTLY DIFFERENT RESPONSE TO TODAY’S ENCODE HYPE

- the nematode Caenorhabditis elegans has 20,517 protein-coding genes

ENCODE: THE ROUGH GUIDE TO THE HUMAN GENOME

For the last decade, geneticists have run a seemingly endless stream of “genome-wide association studies” (GWAS), attempting to understand the genetic basis of disease. They have thrown up a long list of SNPs (variants at specific DNA letters) that correlate with the risk of different conditions. The ENCODE team have mapped all of these to their data. They found that just 12 percent of the SNPs lie within protein-coding areas. They also showed that, compared to random SNPs, the disease-associated ones are 60 percent more likely to lie within functional, non-coding regions, especially in promoters and enhancers. This suggests that many of these variants are controlling the activity of different genes, and provides many fresh leads for understanding how they affect our risk of disease. “It was one of those too good to be true moments,” says Birney. “Literally, I was in the room [when they got the result] and I went: Yes!”

Imagine a massive table. Down the left side are all the diseases that people have done GWAS studies for. Across the top are all the possible cell types and transcription factors (proteins that control how genes are activated) in the ENCODE study. Are there hotspots? Are there SNPs that correspond to both? Yes. Lots, and many of them are new.

Take Crohn’s disease, a type of bowel disorder. The team found five SNPs that increase the risk of Crohn’s, and that are recognised by a group of transcription factors called GATA2. “That wasn’t something that the Crohn’s disease biologists had on their radar,” says Birney. “Suddenly we’ve made an unbiased association between a disease and a piece of basic biology.” In other words, it’s a new lead to follow up on.

“We’re now working with lots of different disease biologists looking at their data sets,” says Birney. “In some sense, ENCODE is working from the genome out, while GWAS studies are working from disease in.” Where they meet, there is interest. So far, the team have identified 400 such hotspots that are worth looking into. Of these, between 50 and 100 were predictable. Some of the rest make intuitive sense. Others are head-scratchers.

EVIDENCE OF ABUNDANT PURIFYING SELECTION IN HUMANS FOR RECENTLY ACQUIRED REGULATORY FUNCTIONS

- although only 5% of the human genome is conserved across mammals, a substantially larger portion is biochemically active, raising the question of whether the additional elements evolve neutrally or confer a lineage-specific fitness advantage
- mammalian conservation suggests that approx. 5% of the human genome is conserved due to noncoding and regulatory roles, but more than 80% is transcribed, bound by a regulator, or associated with chromatin states suggestive of regulatory functions
- human constraint correlates with mammalian conservation, suggesting that similar selective pressures act in humans and across mammals
- mammalian conserved regions lacking ENCODE activity show reduced human constraint relative to active regions, suggesting recent loss in function and activity
- these also show higher primate divergence relative to active regions, suggesting that some loss of constraint predates human-macaque divergence
- regions that do not overlap with active ENCODE elements and inactive chromatin states show lower constraint than ancestral repeats, suggesting that they may provide a more accurate neutral reference than repeats that can have exapted functions
- a substantial fraction of human constraint lies outside mammalian-conserved regions, even though the strength of human constraint is higher in conserved elements
- almost half of human constraint lies outside mammalian-conserved regions
- protein-coding constraint occurs primarily in conserved regions, whereas regulatory constraint is primarily lineage-specific, as proposed during mammalian radiation
- genome-wide association studies suggest that 85% of disease-associated variants are noncoding, a fraction similar to the proportion of human constraint that we estimate lies outside protein-coding regions; this suggests that mutations outside conserved elements play important roles in both human evolution and disease

MEGABASE DELETIONS OF GENE DESERTS RESULT IN VIABLE MICE

- the functional importance of the roughly 98% of mammalian genomes not corresponding to protein-coding sequences remains largely undetermined
- some large-scale deletions of the non-coding DNA (= gene deserts) can be well tolerated by an organism
- deletion of a gene desert mapping to mouse chromosome 3 and of one in chromosome 19 (with no evidence of transcription) → contain 1,243 human-mouse conserved non-coding elements
- the boundaries for the deletions permitted proximate regulatory elements nearby the flanking genes to remain intact
- the homozygous deletion mice for both deletions were viable
- the deletions weren’t lethal in embryos, as shown by the approx. 1:2:1 ratio (wild-type : heterozygous : mutant homozygous)
- the heterozygous mice appeared phenotypically normal
- phenotypic parameters measured in the homozygous deletion mice, compared with controls:
  o post-natal survival rates for 25 weeks
  o measurable growth retardation
  o clinical chemistry tests (general and specific plasma parameters)
  o morphological abnormalities
  o abnormal growth
  o tissue degeneration
  o organ mass was similar in both groups of deletion mice and their wild-type littermates
- molecular level impact: only 2 out of the 108 quantitative assays revealed detectable alterations in levels of expression: Prkacb reduced in the heart and Rpp30 reduced in the brain
- in MMU3 desert: beta-galactosidase expression ... promoter
- although gene inactivation can sometimes fail to result in detectable phenotype, this is usually related to the removal of genes with redundancy elsewhere in the genome

NUCLEOSOME

- each of our cells (or more correctly, nearly all of our cells) contains a copy of this genome, encoded in 3 billion bp of DNA
- a collection of repair enzymes corrects chemical changes inflicted on the strands by environmental insults
- nucleosomes protect the delicate strands from physical damage
- the job of the nucleosome is paradoxical, requiring it to perform 2 opposite functions simultaneously:
  o on one hand, the nucleosome must be stable, forming tight, sheltering structures that compact the DNA and keep it from harm
  o on the other hand, the nucleosome must be labile enough to allow the information in the DNA to be used; polymerases must be allowed access to the DNA, both to transcribe mRNA for building new proteins and to replicate the DNA when the cell divides
- the method by which nucleosomes solve these opposed needs is not well understood, but may involve a partial unfolding of the DNA from around the nucleosome, one loop at a time, as the information in the DNA is read
- nucleosomes also modify the activity of the genes that they store: each nucleosome is composed of 8 “histone” proteins bundled tightly together at the centre, encircled by 2 loops of DNA
- the histone proteins are perfectly designed for their jobs, so much so that histones are nearly identical in all non-bacterial organisms; even slight modifications can be lethal
- the surface of the octamer is decorated with positively charged AA that interact strongly with the negatively charged phosphate groups of the DNA; this serves to glue the DNA strand to the protein core
- the histone proteins are not completely globular like most other proteins → they have long tails, which comprise nearly a quarter of their length; the tails extend outward from the compact nucleosome, reaching out to neighbouring nucleosomes and binding them tightly together
- the nucleus contains regulatory enzymes that chemically modify these tails to weaken their interactions; in this way, the cell makes particular genes more accessible to polymerases, allowing their particular information to be copied and used to build new proteins

HUMAN GENOME PROJECT (TO KNOW OURSELVES)

- each chromosome is a physically separated molecule of DNA that ranges in length from about 50 million to 250 million bp
- the average gene is 3,000 bp; the largest known human gene is dystrophin, 2.4 mil. bp
- chromosome 1 has the most genes (3,168) and the Y chromosome has the fewest (344)
- repeat sequences are thought to have no direct functions, but they shed light on chromosome structure and dynamics
- single nucleotide polymorphisms (SNPs) are sites in a genome where individuals differ in their DNA sequence, often by a single base
- scientists believe that the human genome has at least 10 mil. SNPs
- the human genome is not so very different from that of chimpanzees or mice, and it even shares many common elements with the genome of the lowly fruit fly
- genetic linkage map: based on careful analyses of human inheritance patterns, it indicates for each chromosome the whereabouts of genes or other “heritable markers”, with distances measured in centimorgans (= measure of recombination frequency) → the closer 2 genes are on a single chromosome, the less likely they are to get split up during genetic recombination → when they are close enough that the chances of being separated are only 1/100, they are said to be separated by a distance of 1 centimorgan
- physical maps: distances between features are measured not in genetic terms, but in “real” physical units (base pairs)
- restriction enzymes: cleave dsDNA molecules at specific recognition sites, usually 4 or 6 nucleotides long
- when digested with a particular restriction enzyme, identical segments of human DNA yield identical sets of restriction fragments; on the other hand, DNA from the same genomic region of 2 different people, with their subtly different genomic sequences, can yield dissimilar sets of fragments, which then produce different patterns when sorted according to size

A third necessary tool is some means of DNA “amplification.” The classic example is the cloning vector, which may be circular DNA molecules derived from bacteria, or from bacteriophages (viruslike parasites of bacteria), or artificial chromosomes constructed from yeast or bacterial genomic DNA. The characteristic all these vectors share is that fragments of “foreign” DNA can be inserted into them, whereby the inserted DNA is replicated along with the rest of the vector as the host reproduces itself. A yeast artificial chromosome, or YAC, for instance, is constructed by assembling the essential functional parts of a natural yeast chromosome (DNA sequences that initiate replication, sequences that mark the ends of the chromosomes, and sequences required for chromosome separation during cell division), then splicing in a fragment of human DNA. This engineered chromosome is then reinserted into a yeast cell, which reproduces the YAC during cell division, as if it were part of the yeast’s normal complement of chromosomes. The result is a colony of yeast cells, each containing a copy, or clone, of the same fragment of human DNA.

- however, some regions of the genome resist cloning in YACs and others are prone to rearrangements (PRIMER ON MOLECULAR GENETICS)
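The restriction-fragment notes (identical DNA yields identical fragment sets, while subtly different sequences can yield dissimilar sets) can be sketched as a toy digest. This is a minimal illustration under stated assumptions: the two "people" are invented 20-base sequences, `digest` and `fragment_lengths` are hypothetical helpers, and the molecule is treated as linear. The only real-world fact used is that EcoRI recognises GAATTC and cuts after the first G.

```python
from typing import List

ECORI_SITE = "GAATTC"   # EcoRI recognition sequence
ECORI_CUT_OFFSET = 1    # EcoRI cuts between G and AATTC


def digest(sequence: str, site: str, cut_offset: int) -> List[str]:
    """Cut a linear sequence at every occurrence of `site`.

    The cut falls `cut_offset` bases after the start of each site.
    """
    fragments = []
    start = 0
    pos = sequence.find(site)
    while pos != -1:
        cut = pos + cut_offset
        fragments.append(sequence[start:cut])
        start = cut
        pos = sequence.find(site, pos + 1)
    fragments.append(sequence[start:])
    return [f for f in fragments if f]  # drop empty end fragments


def fragment_lengths(sequence: str) -> List[int]:
    """Fragment sizes after an EcoRI digest, sorted as on a sizing gel."""
    return sorted(len(f) for f in digest(sequence, ECORI_SITE, ECORI_CUT_OFFSET))


# Invented sequences: person B carries a single-base change (A -> G)
# that destroys the second EcoRI site.
person_a = "AAGAATTCCCCCGAATTCTT"
person_b = "AAGAATTCCCCCGAGTTCTT"

print(fragment_lengths(person_a))
print(fragment_lengths(person_b))
```

A single-base difference removes one cut site, so the two samples give different fragment-size patterns, which is the idea behind comparing restriction fragments between individuals.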

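The centimorgan definition in the genetic linkage map notes (a 1/100 chance of two markers being separated corresponds to 1 centimorgan) can be written as a short calculation. This is a sketch under assumptions: the offspring counts are invented, `map_distance_cm` is a hypothetical helper, and the simple proportionality is only a reasonable approximation for closely linked markers, since multiple crossovers make it underestimate larger distances.

```python
def map_distance_cm(recombinant_offspring: int, total_offspring: int) -> float:
    """Approximate genetic distance in centimorgans from offspring counts.

    Assumes closely linked markers, where 1% recombinants is taken as 1 cM.
    """
    if total_offspring <= 0:
        raise ValueError("total_offspring must be positive")
    if not 0 <= recombinant_offspring <= total_offspring:
        raise ValueError("recombinant_offspring must be between 0 and total_offspring")
    return 100.0 * recombinant_offspring / total_offspring


# Invented counts: 1 recombinant among 100 offspring, and 12 among 400.
print(map_distance_cm(1, 100))
print(map_distance_cm(12, 400))
```

Markers that are separated in 1 of 100 offspring come out at 1 centimorgan, matching the definition in the notes.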