Journal Club 2009

Science. 2009 April 17;324(5925):389-92

Local DNA Topography Correlates with Functional Noncoding Regions of the Human Genome
Stephen C. J. Parker, Loren Hansen, Hatice Ozel Abaan, Thomas D. Tullius, Elliott H. Margulies

João Carneiro

Is structural information important when analysing the DNA molecule?

Human genome

≈ 2% code for proteins

≈ 98% non-coding

Regions not well understood

Functional regions

Associated with conserved regions because of their importance to the organism

The analysis is based on the primary nucleotide sequence

Evolutionary sequence-constraint algorithms

DNA is a double-helical molecule with a three-dimensional structure that varies according to the nucleotide sequence.

Shape of grooves

Shape of backbone

Previously described method to predict changes in the three-dimensional structure of DNA

Hydroxyl radical cleavage

Substitutions of a few bases can affect the cleavage pattern

Similar sequences often adopt similar structures

Very different sequences at primary level can have identical structure

Similar sequences can adopt very different structures (few)

Hydroxyl radical footprinting of RNA

Gel pattern of unfolded RNA: the backbone is cleaved uniformly

Gel pattern of folded RNA: the backbone is cleaved only where the solvent can access it

High similarity in cleavage patterns for two sequences with low sequence identity. Plotted are the hydroxyl radical cleavage patterns of two 10-mer sequences that share no common nucleotides (sequence identity = 0%). Note the significant correlation (R = 0.94) of the two patterns.
Construction of a genome-scale structural map at single-nucleotide resolution
Jason A. Greenbaum, Bo Pang and Thomas D. Tullius. Genome Res. 2007 17: 947-953
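The correlation quoted in the figure caption can be illustrated with a plain Pearson coefficient between two cleavage patterns. This is only a sketch: the function name `pearson_r` and the two 10-value patterns are hypothetical placeholders, not data from the paper.

```python
import math


def pearson_r(pattern_a, pattern_b):
    """Pearson correlation between two equal-length cleavage patterns."""
    if len(pattern_a) != len(pattern_b) or len(pattern_a) < 2:
        raise ValueError("patterns must have the same length (at least 2)")
    mean_a = sum(pattern_a) / len(pattern_a)
    mean_b = sum(pattern_b) / len(pattern_b)
    cov = sum((a - mean_a) * (b - mean_b) for a, b in zip(pattern_a, pattern_b))
    var_a = sum((a - mean_a) ** 2 for a in pattern_a)
    var_b = sum((b - mean_b) ** 2 for b in pattern_b)
    if var_a == 0 or var_b == 0:
        raise ValueError("a constant pattern has no defined correlation")
    return cov / math.sqrt(var_a * var_b)


# Hypothetical cleavage intensities for two 10-mers (made-up numbers).
pattern_1 = [0.9, 0.7, 0.4, 0.3, 0.5, 0.8, 1.0, 0.9, 0.6, 0.5]
pattern_2 = [1.0, 0.8, 0.5, 0.2, 0.4, 0.7, 1.1, 0.8, 0.6, 0.4]
print(f"R = {pearson_r(pattern_1, pattern_2):.2f}")
```

Because the coefficient only looks at the shape of the two patterns, two sequences with no bases in common can still score close to 1.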

ORChID database

Library of experimentally determined hydroxyl radical cleavage patterns for 4-bp DNA sequences

Prediction of the shape of the backbone and grooves of genomic DNA for all possible 11-bp sequences

In silico library of DNA structural profiles

Euclidean distance is used to detect conserved structure, which possibly represents important biological functions
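A minimal sketch of comparing two structural profiles by Euclidean distance in a sliding window. The function name `windowed_distance`, the default window of 11 positions and the example numbers are assumptions for illustration; the slides do not give the exact procedure used by the authors.

```python
import math


def windowed_distance(profile_a, profile_b, window=11):
    """Euclidean distance between two aligned profiles for every window start."""
    if len(profile_a) != len(profile_b):
        raise ValueError("profiles must be aligned and of equal length")
    if window < 1 or window > len(profile_a):
        raise ValueError("window must fit inside the profiles")
    return [
        math.dist(profile_a[i:i + window], profile_b[i:i + window])
        for i in range(len(profile_a) - window + 1)
    ]


# Hypothetical per-nucleotide cleavage values for two aligned species.
species_1 = [0.5, 0.6, 0.8, 0.7, 0.4, 0.3, 0.5, 0.9, 1.0, 0.8, 0.6, 0.5]
species_2 = [0.5, 0.7, 0.8, 0.6, 0.4, 0.2, 0.5, 0.9, 0.9, 0.8, 0.6, 0.9]
for start, d in enumerate(windowed_distance(species_1, species_2)):
    print(start, round(d, 3))
```

Small distances mark windows where the predicted shape is similar even if the underlying bases differ.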

Quantification of the effect of single-base substitutions on DNA structure

The distribution of the average structural change observed for all 11-bp sequences
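The idea of scoring a substitution by the change it causes in the structural profile can be sketched as follows. Everything here is a placeholder: `toy_profile` and its per-base values are invented stand-ins for the ORChID predictions, and `mean_substitution_change` is a hypothetical helper, not the authors' code.

```python
import math

# Made-up per-base values; real profiles would come from ORChID predictions.
TOY_VALUE = {"A": 0.2, "C": 0.5, "G": 0.7, "T": 0.3}


def toy_profile(seq):
    """Stand-in structural profile: each position averaged with its neighbours."""
    values = [TOY_VALUE[base] for base in seq]
    profile = []
    for i in range(len(values)):
        neighbours = values[max(0, i - 1):i + 2]
        profile.append(sum(neighbours) / len(neighbours))
    return profile


def mean_substitution_change(seq, position=None):
    """Average Euclidean change in profile over the three possible substitutions."""
    if position is None:
        position = len(seq) // 2  # centre of the window
    reference = toy_profile(seq)
    changes = []
    for base in "ACGT":
        if base == seq[position]:
            continue
        mutant = seq[:position] + base + seq[position + 1:]
        changes.append(math.dist(reference, toy_profile(mutant)))
    return sum(changes) / len(changes)


print(mean_substitution_change("ACGTACGTACG"))
```

Repeating this over every 11-bp sequence would give a distribution of average changes like the one described on this slide.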

Alignments of 30 Mb of high quality comparative sequence data for 36 different species

Chai: structural information + DNA sequence-based binomial conservation (binCons)

binCons: only the DNA primary sequence is analysed

Incorporation of a false discovery rate (FDR)

Random permutation of alignment columns to generate a new (null) alignment; the total number of constrained bases is compared with that of the original alignment and a confidence value is determined

Re-run of Chai and binCons algorithm over null alignments
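The permutation scheme above can be sketched as follows. This is a toy under stated assumptions: `constrained_bases` is a simple stand-in detector (runs of identical columns), not Chai or binCons, and the alignment, the run length and the number of null alignments are made up.

```python
import random


def shuffle_columns(alignment, rng):
    """Null alignment: the same columns, in random order."""
    columns = list(zip(*alignment))
    rng.shuffle(columns)
    return ["".join(row) for row in zip(*columns)]


def constrained_bases(alignment, min_run=3):
    """Toy detector: count columns inside runs of >= min_run identical columns."""
    total = 0
    run = 0
    for column in zip(*alignment):
        if len(set(column)) == 1:
            run += 1
            continue
        if run >= min_run:
            total += run
        run = 0
    if run >= min_run:
        total += run
    return total


def estimated_fdr(alignment, n_null=100, seed=0, min_run=3):
    """Mean constrained bases in null alignments divided by the observed count."""
    rng = random.Random(seed)
    observed = constrained_bases(alignment, min_run)
    if observed == 0:
        return None  # nothing detected, so no rate to estimate
    null_total = sum(
        constrained_bases(shuffle_columns(alignment, rng), min_run)
        for _ in range(n_null)
    )
    return min(1.0, (null_total / n_null) / observed)


# Hypothetical three-species alignment.
alignment = ["ACGTTTGACA", "ACGTATGACC", "ACGTCTGACG"]
print(estimated_fdr(alignment))
```

Shuffling columns keeps the per-column conservation but destroys the order of the columns, so any detector that relies on neighbouring positions should find fewer constrained bases in the null alignments.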

At every statistical confidence value, Chai identified more constrained regions than binCons

Identification of the regions retrieved by Chai that harbor functional elements

DNase I hypersensitive sites

Predicted transcriptional enhancers identified using chromatin modification patterns

Ancestral repeats

Structural characteristics and nucleotide sequence are important in noncoding regions

Analysis of the consequences of nucleotide substitutions for the structural profile of the binding sites of DNA-binding proteins

Mammalian transcription factor Zif268 and archaeal transcriptional regulator Ss-LrpB

Results revealed that low-affinity binding sites differed dramatically in structure from high-affinity sites
DNA structural profiles correlate with motif-binding affinity for the mutant REDV Zif268 protein.

Collection of 734 noncoding single-nucleotide variants in the human genome associated with a phenotype (PhenCode Project)

For comparison, a distribution of baseline variation in DNA topography was computed for 16,832 neutrally evolving single-nucleotide polymorphisms (SNPs)

Analysis of the 11-bp window centered on each nucleotide variant using Chai

Large changes in DNA structure are significantly associated with noncoding phenotype-associated variants
P < 3 × 10⁻⁴, Wilcoxon rank sum test

Distribution of DNA structural changes for phenotype associated variants (red line) and neutral variants (black line)
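A sketch of the kind of comparison summarised in this figure, using a two-sided Wilcoxon rank sum test with a normal approximation (no tie or continuity correction in the variance). The function name `rank_sum_test` and both samples are hypothetical; they are not the PhenCode or SNP data.

```python
import math


def rank_sum_test(x, y):
    """Return (W, z, p): rank sum of x, z score, two-sided p (normal approx.)."""
    pooled = sorted(
        (value, group) for group, sample in enumerate((x, y)) for value in sample
    )
    ranks = [0.0] * len(pooled)
    i = 0
    while i < len(pooled):
        j = i
        while j < len(pooled) and pooled[j][0] == pooled[i][0]:
            j += 1
        average_rank = (i + 1 + j) / 2  # mean of ranks i+1 .. j for tied values
        for k in range(i, j):
            ranks[k] = average_rank
        i = j
    w = sum(rank for rank, (_, group) in zip(ranks, pooled) if group == 0)
    n1, n2 = len(x), len(y)
    mean_w = n1 * (n1 + n2 + 1) / 2
    sd_w = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    z = (w - mean_w) / sd_w
    p = math.erfc(abs(z) / math.sqrt(2))
    return w, z, p


# Made-up structural-change scores for two groups of variants.
phenotype_variants = [0.42, 0.51, 0.38, 0.60, 0.47, 0.55]
neutral_variants = [0.21, 0.35, 0.18, 0.30, 0.40, 0.25, 0.33]
w, z, p = rank_sum_test(phenotype_variants, neutral_variants)
print(f"W = {w}, z = {z:.2f}, p = {p:.4f}")
```

For samples as small as these an exact test would be preferable; the normal approximation is reasonable for group sizes like the 734 and 16,832 variants on the slide.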

Finally, they analysed 12 predicted enhancer-containing regions

7 regions overlap a combination of Chai- and binCons-detected elements

5 regions overlap elements detected only by the Chai algorithm

They used luciferase reporter constructs that included these regions and transfected them into 293T cells

Luciferase-based reporter activity of 12 regions containing Chai-detected elements
P ≤ 0.05 Wilcoxon rank sum test


Local DNA structural architecture (DNA topography) is widely conserved in the analysed regions

These regions overlap the majority of known non-coding functional sites

Conserved regions that escape detection by DNA primary sequence analyses can be detected by this method

