
Leading Edge

In This Issue
Residing in the physical heart of the cell, the nucleus has now fully shed its once one-dimensional reputation as the repository for genetic information and steady supplier of messages to the cytoplasm. This sea change toward a more dynamic view of nuclear biology has been led by a revolution in our understanding of genomes, regulatory elements, epigenetic modifiers, and nuclear spatial organization, which has enabled the examination of the interplay of these factors both genome wide and at individual loci. From this vast and rapidly expanding body of work, we can now see that, in the cellular orchestra, the nucleus is not simply the sheet music from which the notes are played but is instead the conductor and composer, both writing the songs and leading the performance. This special issue, The Dynamic Nucleus, explores the meaning behind the maestro's motions, both dramatic and nuanced, in directing the cell symphony.

Please Put on Your 3D Glasses


We open the issue by posing the question of what has been the most surprising revelation about the nucleus in recent years. The responses of Job Dekker, Joanna Wysocka, Iain Mattaj, Erez Lieberman Aiden, and Craig Pikaard range from the pervasive roles of RNA to mapping genomic connectomes and appreciating the marvel of nucleocytoplasmic transport (Voices, page 1207). Many of the major themes of The Dynamic Nucleus are then framed in an Essay from Tom Misteli (page 1209), who places genome function in the context of cell biology, discussing the impact of stochasticity, epigenetics, and nuclear organization while juxtaposing this emerging complexity with the simple elegance of the structure of DNA on the 60th anniversary of its discovery. If the intervening years have taught us anything, it is that knowing the sequence of DNA is just the starting point for understanding gene regulation, with the critical importance of epigenetic modification and higher-order chromatin structure becoming ever more apparent. In their Review, Wendy Bickmore and Bas van Steensel (page 1270) provide an introduction to how local and long-range contacts between genes shape the three-dimensional organization of interphase chromosomes. A major driver of nuclear architecture, the formation of chromatin loops, is examined in depth by Duncan Odom and Matthias Merkenschlager (Review, page 1285), who focus on two proteins, cohesin and CTCF, which are frequently involved in bringing together gene regulatory elements with their targets. Pedro Batista and Howard Chang grapple with an emerging facet of nuclear organization in their Review (page 1298), proposing that long noncoding RNAs act as address codes directing protein complexes, genes, and chromosomes to their appropriate locations. The formation of chromatin loops and other modes of protein and RNA-mediated gene regulation are further featured in this issue's Select, penned by Molecular Cell's Brian Plosky (page 1203).

Message Control
Of course, the main nuclear preoccupation is transcription. Tong Ihn Lee and Richard Young (Review, page 1237) take on the challenge of synthesizing the vast literature on transcriptional control, highlighting recent advances in understanding the impact of transcription factors, cofactors, and chromatin regulators and delving into the misregulation of transcriptional control in disease states. Bernadett Papp and Kathrin Plath (Review, page 1323) similarly delve into the evidence that chromatin regulators serve as gatekeepers of cellular reprogramming to induced pluripotency. Initiating transcription is just the first of many steps in the making of a messenger RNA fated for cytoplasmic ribosomes. In their Review, Blencowe and colleagues (page 1252) place RNA-splicing events within the complex spatial and temporal dynamics of the nucleus and examine the layers of crosstalk between splicing and chromatin context and transcriptional regulation. Gene regulation by RNA also lies at the heart of the allelic control of large gene clusters, as observed in imprinting and X chromosome inactivation, the processes highlighted in a Review by Jeannie Lee and Marisa Bartolomei (page 1308).
Cell 152, March 14, 2013 ©2013 Elsevier Inc. 1199

When Bad Things Happen to Good Nuclei


Like transcription, maintaining genome integrity is a particular nuclear obsession. How the chromatin landscape changes in response to DNA damage and how this then impacts DNA repair processes is the focus of the Review by Brendan Price and Alan D'Andrea (page 1343). On a different spatial scale, Vincent Dion and Susan Gasser (Review, page 1354) discuss how the chromatin environment affects the long-range mobility of DNA, with double-strand breaks increasing movement and thus impacting genome stability and the efficiency of repair. Yet despite a cell's best efforts, things sometimes go wrong, with a dramatic example being chromothripsis, in which many chromosomal rearrangements occur in a single catastrophic event. In their Primer, Jan Korbel and Peter Campbell (page 1226) enumerate the criteria for inferring the occurrence of chromothripsis from cancer genome sequencing data. Luckily, however, DNA repair and replication proceed most of the time with exquisite precision, and for a view on the elegant spatial and temporal dance of DNA replication, you should check out the SnapShot by David Gilbert and colleagues (page 1390).

The Outer Limits and Beyond


The boundary of the nucleoplasm is the nuclear lamina, which is a unique and complex nexus for gene regulation. Recent studies reveal that associations between chromatin and the nuclear lamina differ between cell types, a topic tackled by Kevin Van Bortle and Victor Corces (Minireview, page 1213), who argue that these interactions are critical determinants of cell fate and identity. The function of the lamin proteins at the nuclear envelope likely reaches far beyond chromatin organization, as mutations in these proteins give rise to a group of diverse syndromes with symptoms ranging from muscular dystrophy to progeria. Katherine Schreiber and Brian Kennedy review these laminopathies (Review, page 1364), examining the connections between nuclear envelope dysfunction and altered nuclear activity and commenting on how emerging molecular insights might be translated into new treatments. In vertebrate cells, the nuclear envelope retracts into the endoplasmic reticulum at the onset of mitosis and subsequently reforms in daughter cells. Cornelia Wandke and Ulrike Kutay (Minireview, page 1222) parse the mechanisms behind this dramatic breakdown and reassembly. However, the nuclear envelope is more than just a barrier that holds in nuclear contents. It also serves as a highly selective yet very rapid and efficient filter for the trafficking of RNA and protein via nuclear pores that are essential for getting cellular components where they need to be, thereby maintaining the distinct properties of the nucleoplasm and cytoplasm. How this is accomplished is examined by Rebecca Adams and Susan Wente (Minireview, page 1218), who guide us through this remarkable feat of selective transport, opening our eyes to nuclear pore complexity as revealed by high-resolution structural analysis and imaging. Although often depicted as the center of a cell, the position of the nucleus can in fact vary extensively, with important functional consequences for cell signaling and migration. Gregg Gundersen and Howard Worman (Review, page 1375) describe the cytoskeletal forces that position the nucleus and present evidence linking disruptions in these forces to numerous diseases.

Creating this special issue has relied upon involvement from many dedicated authors and reviewers, and we would like to thank them for their time, effort, and insights. We hope that, in reading The Dynamic Nucleus, you will come away with an appreciation for the processes that contribute to nuclear complexity, inspiration from how much has been learned in recent years, and motivation to pursue answers to the many remaining mysteries of nuclear biology.
Robert P. Kruger



Select
The Nucleus: Express Yourself
As the home for the bulk of eukaryotic genomes, the nucleus is a perfect venue for observing intricate regulatory mechanisms involving interactions between DNA, RNA, and protein. These interactions have evolved beyond the original central dogma concept, and the papers featured in this Select highlight some recent and exciting examples of these mechanisms.

Why Does Polycomb Like to Hang Out in the Islands?


Methylation of cytosine in CpG sequences is an epigenetic modification of DNA that is associated with a repressive chromatin state. Large clusters of CpG sequences known as CpG islands (CGIs) tend to be unmethylated and are found near active transcription start sites. However, a subset of CGIs corresponds to binding regions for the repressive polycomb complex in mammals. In the past couple of years, it has become clear that certain chromatin-modifying activities are directed to nonmethylated CGIs. So far, each of these activities has been associated with active transcription. The Set1 histone H3 lysine 4 methyltransferase complex subunit CFP1 and KDM2A, a histone H3 lysine 36 demethylase, both contain a zinc finger CxxC DNA-binding domain (ZF-CxxC) that preferentially binds unmethylated CGIs. In two separate studies, Robert Klose and Kristian Helin and their colleagues looked at the binding and function of KDM2B (a paralog of KDM2A also known as FBXL10) to see whether it binds CGIs like KDM2A. The ChIP-seq results from both groups show that, like KDM2A, KDM2B binds to CGIs across the genome, and Klose and colleagues see that KDM2B is more robustly represented at CGIs that are bound by the polycomb-repressive complex, PRC1, which ubiquitinates histone H2A at lysine 119. It was previously shown that KDM2B can be part of an alternative PRC1 complex, and both groups find that KDM2B binds a specific variant of PRC1 via a subunit called NSPc1 (or PCGF1) and that the ZF-CxxC motif of KDM2B is responsible for targeting these complexes to unmethylated CGIs. The ZF-CxxC motif of KDM2B is important for histone H2A ubiquitination, and loss of KDM2B reactivates a subset of polycomb targets. Helin and colleagues go on to show that KDM2B, like other PRC1 components, is important for proper differentiation of mouse embryonic stem cells. These findings add to the complexity of polycomb targeting in mammals, which lack a defined polycomb response element, and support an emerging concept that PRC1 targeting is not always dependent on the histone H3 lysine 27 trimethylation activity of PRC2.

[Figure: The negative regulation of gene expression by CpG sequences can be through cytosine methylation (left) or through the recruitment of a polycomb-repressive complex to unmethylated CGIs. Image courtesy of X. Wu and K. Helin.]

Farcas, A.M., et al. (2012). eLife. Published online December 18, 2012. http://dx.doi.org/10.7554/eLife.00205.
Wu, X., et al. (2013). Mol. Cell. Published online February 7, 2013. http://dx.doi.org/10.1016/j.molcel.2013.01.016.

How Does Human X Inactivation Work? I'm Not eXACTly Sure


In addition to the protein complexes that regulate gene expression, we know that noncoding RNAs can be involved in targeting regulatory complexes to chromatin. While analyzing RNA-seq data from a female-derived human embryonic stem cell line (hESC), Claire Rougeulle and colleagues found a previously unannotated region of the X chromosome that was actively transcribed. The resulting 251.8 kb, unspliced, polyadenylated transcript is primarily nuclear, bearing the hallmarks of a long noncoding RNA (lncRNA). The X chromosome is not a stranger to lncRNAs, as two such RNAs, XIST and TSIX, are well known to have a role in the mammalian version of sex chromosome dosage compensation, X chromosome inactivation (XCI). So what might be special about another lncRNA expressed from the X chromosome? Could it have a role in XCI? XIST is known to coat the inactive X and to maintain it in an inactive state. However, this newly identified lncRNA is found to be associated with the active X chromosome, which led to the name of XACT (X active coating transcript). Interestingly, XACT is downregulated upon differentiation of hESCs (nearly completely silent after 10 days) and is either weakly expressed or not detectable in adult tissues. However, upon conversion of mesenchymal stem cells to induced pluripotent stem cells, there is strong re-expression of XACT. Also, XIST appears to prevent XACT expression from the inactive X. So what does this lncRNA do? That remains to be determined. It is not the first case of a lncRNA being associated with an active chromosome in a dosage compensation process. In fruit flies (where females do not inactivate either X chromosome), the lncRNAs roX1 and roX2 are critical components of a histone acetyltransferase complex that is responsible for upregulating the single X chromosome in males. Whether XACT is responsible for targeting a chromatin-modifying activity or has some other function remains to be seen. If you are thinking about checking this out in some mouse embryonic stem cells, it might not be worth the effort because it looks like expression of XACT may be a recent evolutionary event and is seen only in human cells.

[Figure: In female human embryonic stem cells, XIST RNA (green) covers the inactive X and XACT (red) coats the active one. Image courtesy of C. Vallot.]

Vallot, C., et al. (2013). Nat. Genetics 45, 239–241.

A Remodeler Gets Drawn into a Loop


In addition to the interactions of regulatory molecules with arrays of nucleosomal DNA, it is important to consider the impact of higher-order organization such as looping. Typically, ATP-dependent chromatin-remodeling complexes are targeted through either specific interactions with modified histone tails or interactions with specific transcription factors (TFs). When asking whether the targeting of yeast remodeler Isw2 is transcription factor dependent, Toshio Tsukiyama and colleagues stumbled upon something unusual. Although they did see that Isw2 targeting overlapped and depended on transcription factors, curiously, many of these sites seemed to lack annotated TF-binding sites. In the case of the yeast transcription factor Ume6, nearly 90% of the 563 Isw2 binding sites that are Ume6 dependent were devoid of Ume6-specific binding sites. To examine the mechanism by which this targeting could happen, they turned back to their Isw2 ChIP-chip data and noticed that the remodeler often bound at 5′ and 3′ ends of the same gene, reminiscent of the binding pattern of TFIIB when it forms gene loops. Using chromosome conformation capture (3C) and yeast genetics, Tsukiyama and colleagues were able to show that DNA looping targets Isw2 to specific loci and that Ume6 and TFIIB are necessary for formation of these loops and also for the repression of Ume6 target genes. Interestingly, all previously known examples of gene looping in yeast are associated with transcriptional activators. This is the first time that a nucleosome remodeler has been shown to be targeted by looping, and it will be interesting to see whether other related enzymes might be targeted similarly.

[Figure: Isw2 is targeted to canonical sites via physical interactions with Ume6. In contrast, Isw2 is targeted to ectopic sites via Ume6- and TFIIB-dependent DNA looping. A light-blue oval represents the transcription preinitiation complex. Image courtesy of T. Tsukiyama.]

Yadon, A.N., et al. (2013). Mol. Cell. Published online March 7, 2013. http://dx.doi.org/10.1016/j.molcel.2013.02.005.

Two Kinds of Loops for Proper T Cell Receptor Locus Rearrangements


Looping and higher-order structures have implications in genomic stability as well. The recombinases RAG1 and RAG2 generate diversity in immunoglobulin (Ig) loci in B cells and in the T cell receptor loci in T cells through a process known as V(D)J recombination. The process rearranges the variable (V), diversity (D), and joining (J) segments within each locus to create a broad repertoire of receptors to recognize foreign antigens. The rearrangement of the T cell receptor α locus, Tcra, is one of the last steps in the complex process of diversifying T cell receptors. There are a large number of possible outcomes for each recombination event, and tight control is essential. One level of control is to limit recombination to a single allele. Jane Skok and colleagues have found that, like other loci, Tcra is cleaved on only one allele at a time, and they have added some key insights into how this happens. First, they show that higher-order intrachromosomal looping is linked to RAG-mediated cleavage of each allele. The association of RAG proteins coincides with transcription from the soon-to-be-cleaved alleles along with decondensation of the chromatin. So what prevents the cleavage of the other allele? Previous work had shown that a key DNA-damage-signaling kinase, ATM, was involved in repositioning the second allele at Ig loci to heterochromatin. For Tcra, they find that ATM not only controls the positioning to heterochromatin, but also prevents looping. This suggests a negative feedback loop that acts in trans, wherein ATM is recruited to a break on the first allele and acts to suppress higher-order looping on the second allele.

[Figure: Looping, kissing, and recombining. On the left, a balloon-based model of the image on the right, which shows monoallelic looping and pairing that facilitate targeted cleavage on one Tcra allele (red) in recombination centers away from the rest of chromosome 14 (green). Image courtesy of J. Skok.]

Chaumeil, J., et al. (2013). Cell Rep. 3, 359–370.

Brian Plosky
Senior Editor, Molecular Cell



Voices
Nuclear Biology: What's Been Most Surprising?
Restricting Genomic Partners

Job Dekker
University of Massachusetts Medical School

The regulatory potential of the human genome is much richer than some had anticipated. With greatly refined annotations, we now realize that each gene finds itself surrounded by a huge number of potentially regulatory elements in a very crowded nucleus. Given that many regulatory elements control genes through direct physical interaction, one can imagine that this could create a potentially risky situation in which genes get misregulated by chance encounters with inappropriate elements. So, a major question in the field of nuclear organization is how do cells ensure that genes only respond to the right regulatory elements while ignoring the hundreds of thousands of others? Recent work has revealed a surprisingly simple strategy for matching genes to only some regulatory elements, which involves the spatial organization and folding of chromosomes inside the nucleus. In Drosophila, mouse, and human nuclei, chromosomes are spatially compartmentalized. Using 5C and Hi-C technologies, it has been shown that chromosomes form strings of topologically associating domains (TADs) that are each hundreds of kb in size but are spatially insulated from neighboring TADs. As a result, a given gene lives in a relatively small neighborhood where it encounters only a small section of the genome and thus can partner with only a small number of regulatory elements. Future studies will no doubt unveil how TADs are established and how they insulate genes from the wrong crowd.

Plasticity of Interpretation

Joanna Wysocka
Stanford University

Advances in genomic profiling technologies combined with the realization that certain chromatin features can be effectively used to annotate cis-regulatory elements enabled a large number of recent epigenome mapping efforts across a myriad of cell types and organisms. The picture that emerges from these studies elucidates the astounding degree to which our genome, including the repetitive regions derived from transposon elements, appears to be dynamically utilized for the purposes of gene regulation. The human ENCODE project alone mapped nearly 400,000 distinct transcriptional enhancers, most of which showed high cell type specificity of the chromatin-marking patterns. Other studies have demonstrated that thousands of regulatory regions undergo activation or decommissioning even during transitions between closely developmentally related cell types. It seems highly likely that the information content within regulatory parts of the genome substantially exceeds that of protein-coding regions, suggesting the enormous potential for combinatorial complexity of gene expression regulation during embryogenesis. Dynamic changes between distinct chromatin states have proven to be remarkably commonplace during differentiation. Moreover, discoveries of enzymatic activities that are responsible for removal or alteration of chromatin modifications previously thought of as relatively stable, such as methylation of histone proteins and DNA, contribute to the mechanistic explanation of the observed chromatin dynamics. Taken together, emerging views change our thinking about both the content of our genome and the plasticity of its interpretation through chromatin-mediated mechanisms.

Surprises at the Membrane

Iain Mattaj
European Molecular Biology Laboratory

Nuclear biology is full of surprises because, like all biology, the underlying mechanisms result from evolution and have been selected to work, independent of how. In studying nucleocytoplasmic transport, an early eye-popping moment came from calculating the flux of macromolecules transported between cytoplasmic and nuclear compartments. Who would guess that transport through nuclear pore complexes (NPCs) affects millions of macromolecules every minute in mammalian cells? Transport substrates vary, but some are huge, like viruses or RNPs of up to 50 MDa, whereas passive macromolecules of >60 kDa are excluded from NPC transit. It is therefore remarkable that NPC passage per se occurs independent of energy input. Instead, import or export is driven indirectly via energy-dependent assembly or disassembly of transport-competent complexes on one side or the other of the NPC. NPC selectivity results from the properties of intrinsically disordered segments of the NPC proteins that line and occupy the transport channel and the transport receptors with which they interact. But the most dynamic aspect of nuclear biology is the complete disassembly and reassembly of the nucleus during each metazoan cell division. Here, the finding is so old that it is no longer a surprise, but we have no idea why this should happen, especially as many single-celled eukaryotes undergo mitosis with an intact nucleus.


A Blueprint for Spatial Sequencing

Erez Lieberman Aiden
Harvard University

From an information theoretic standpoint, the throughput of today's sequencers dwarfs almost any other means of interrogating a biological sample. This makes it increasingly tempting to try to translate far-flung biological questions into the language of DNA sequence. But how well can this sort of experimental shoehorning work? If the recent experience of nuclear biology is any guide, the answer is: better than we might have guessed. New proximity ligation methods based on the nuclear ligation assay and its intellectual descendants have made DNA sequencers the platform of choice for rapidly estimating the physical distance between genomic loci in the nucleus of a cell. As a result, three-dimensional DNA sequencing has begun to have a marked impact on our understanding of chromatin structure, playing a role that is highly complementary to microscopy. Because ligation-based methods can be used to probe the distance between other cellular actors, such as RNAs and proteins, this development suggests a broader template for translating cell biology's spatial puzzles. And why limit ourselves to the cell's interior? Recent proposals have suggested mapping connectomes by tagging individual neurons with DNA barcodes and then ligating the tags. Today's nuclear biology might prove to be tomorrow's neuroscience.

RNA Rewrites Central Dogma

Craig Pikaard
Indiana University

The pervasive role of RNA in nearly all aspects of nuclear biology is a continuing revelation. The eukaryotic nucleus is commonly perceived to be a realm in which DNA reigns supreme. Elucidation of the genetic code showed that messenger RNAs, transfer RNAs, and ribosomal RNAs transcribed in the nucleus are exported to the cytoplasm for protein synthesis. These early studies suggested that DNA gave the orders and RNA carried out the mission elsewhere. Fast forward to today and compelling evidence that an RNA-based biology predated the evolution of DNA for information storage, and one sees the nucleus in a new light: as a hotbed of RNA-mediated information management. DNA replication is initiated by RNA primers. Chromosome ends are maintained by RNA-templated telomere addition. Multiple classes of small regulatory RNAs (e.g., snRNAs, snoRNAs, and scaRNAs) are critical for messenger RNA splicing, transfer RNA maturation, ribosomal RNA processing, and RNA chemical modification by methylation or pseudouridylation. More recently, long noncoding RNAs and short RNAs (siRNAs, miRNAs, and piRNAs) have been shown to act within the nucleus to regulate cytosine methylation (e.g., plants and mammals) or histone modification (most eukaryotes). These epigenetic modifications regulate genes during development, silence transposons and retroviruses, and contribute to centromere function and accurate chromosome segregation. An emerging RNA-centric view of the nucleus represents a major paradigm shift.



Essay

The Cell Biology of Genomes: Bringing the Double Helix to Life


Tom Misteli1,*
1National Cancer Institute, National Institutes of Health, Bethesda, MD 20892, USA
*Correspondence: mistelit@mail.nih.gov
http://dx.doi.org/10.1016/j.cell.2013.02.048

The recent ability to routinely probe genome function at a global scale has revolutionized our view of genomes. One of the most important realizations from these approaches is that the functional output of genomes is affected by the nuclear environment in which they exist. Integration of sequence information with molecular and cellular features of the genome promises a fuller understanding of genome function.
It was a moment of scientific amazement in 1953 when Watson and Crick revealed the structure of DNA. The magnificence of the double helix and its elegant simplicity were awe inspiring. But more than just being beautiful, the double helix immediately paved the way forward; its structure implied fundamental biological processes such as semiconservative replication and the notion that chemical changes in its composition may alter heritable traits. The linear structure of DNA laid the foundation for the concept that a string of chemical entities could encode the information that determines the very essence of every living organism. The beauty of the double helix was the promise that, if the sequence of bases in the genome could be mapped and decoded, the genetic information that underlies all living organisms would be revealed and the secret of biological systems would be unlocked.

The idea of linearly encoded genetic information has been spectacularly successful, culminating in the recent development of powerful high-throughput sequencing methods that now allow the routine reading of entire genomes. The conceptual elegance of the genome is that the information contained in the DNA sequence is absolute. The order of bases can be determined by sequencing, and the result is always unequivocal. The ability to decipher and accurately predict the behavior of genome sequences was appealing to the early molecular biologists, has given rise to the discipline of molecular genetics, and has catalyzed the reductionist thinking that has driven and dominated the field of molecular biology since its inception.

But the apparent simplicity and deterministic nature of genomes can be deceptive. One of the most important lessons learned from our ability to exhaustively sequence DNA and to probe genome behavior at a global scale by mapping chromatin properties and expression profiling is that the sequence is only the first step in genome function. In intact living cells and organisms, the functional output of genomes is modulated, and the hard-wired information contained in the sequence is often amplified or suppressed. While mutations are an extreme case of genome modulation, most commonly occurring changes in genome function are more subtle and consist of fluctuations in gene expression, temporary silencing, or temporary activation of genes. Although not caused by mutations, these genome activity changes are functionally important.

Several mechanisms modulate genome function (Figure 1). At the transcription level, the limited availability of components of the transcription machinery at specific sites in the genome influences the short-term behavior of genes and may make their expression stochastic. Epigenetic modifications are capable of overriding genetically encoded information via chemical modification of chromatin. Similarly, changes in higher-order chromatin organization and gene positioning within the nucleus alter functional properties of genome regions. The existence of mechanisms that modulate the output of genomes makes it clear that a true understanding of genome function requires integration of what we have learned about genome sequence with what we are still discovering about how genomes are modified and how they are organized in vivo in the cell nucleus.

The Stochastic Genome

The genome is what defines an organism and an individual cell. It is therefore tempting to assume that identical genomes behave identically in a population of cells. We now know that this is not the case. Individual, genetically identical cells can behave very differently even in the same physiological environment. It is rare to find a truly homogeneous population of cells even under controlled laboratory conditions, as anyone who has tried to make a cell line stably expressing a transgene knows. Much of the variability in biological behavior between individual cells comes from stochastic activity of genes (Raj and van Oudenaarden, 2008).
Genes are by denition low-copynumber entities, as each typically only exists in two copies in the cell. Similarly, many transcription factors are present in relatively low numbers in the cell nucleus. The low copy number of genes and transcription factors makes gene expression inherently prone to stochastic effects (Raj and van Oudenaarden, 2008). Numerous observations make it clear that gene expression is stochastic in vivo. For example, dose-dependent increases in gene expression after treatment of cell populations with stimulating

Cell 152, March 14, 2013 2013 Elsevier Inc. 1209

ligands, such as hormones, and are often brought about by high expression of target genes in a relatively small number of cells in the population rather than by a uniform increase in activity in all cells. Stochastic gene behavior is most evident in single-cell imaging approaches, and mapping by fluorescence in situ hybridization of multiple genes, which according to population-based PCR analysis are active in a given cell population, shows that only a few cells transcribe all constitutively active genes at any given time. Most cells only express a subset of genes, and the combinations vary considerably between individual cells. These observations suggest that many genes blink on and off and are expressed in bursts rather than in a continuous fashion (Larson et al., 2009).

Figure 1. From Primary Sequence to Genome Output
The hard-wired primary information contained in the genome sequence is modulated at short or long timescales by several molecular and cellular events. Modulation may lead to activation (green) or silencing (red) of genome regions.

The molecular basis for stochastic gene expression is unknown. There are several candidate mechanisms, all of which are related to genome or nuclear organization. Most genes require some degree of chromatin remodeling for activity, which is thought to make regulatory regions accessible to the transcription machinery. Several observations suggest that chromatin remodeling contributes to the stochastic bursting of gene expression. Perhaps most compelling is the finding that genes located near each other on the same chromosome show correlated blinking behavior, indicating that a local chromosome property, such as chromatin structure, drives stochastic behavior (Becskei et al., 2005). Furthermore, altering chromatin, for example by deletion of chromatin remodeling machinery, affects stochastic variability in yeast. It can be envisioned that the stochastic behavior of genes is caused by the requirement for cyclical opening of chromatin regions. Open chromatin has a limited persistence time, and maintaining chromatin in an open state requires the cyclical action of chromatin remodelers. Whether an active gene is transcribed at any given time may thus depend on the transient condensation status of its chromatin at a particular moment.

A second mechanism to impose non-uniform stochastic genome activity may be the local availability of the transcription machinery at a gene. Although transcription factors are able to diffuse relatively freely through the nuclear space, and in this way effectively scan the genome for binding sites, their availability and functionality at a given local site may undergo significant temporal fluctuations (Misteli, 2001). The local availability of transcription complexes may affect transcription frequency positively or negatively. On the one hand, it is possible that relatively stable preinitiation complexes persist on a given gene, where they may support multiple rounds of transcription and in this way boost initiation frequency. On the other hand, assembly of the full polymerase is itself a stochastic and relatively inefficient event. In order for a functional polymerase complex to assemble, individual transcription machinery components associate with chromatin in a step-wise fashion, and formation of the mature polymerase complex involves multiple partially assembled intermediates, many of which are unstable and disintegrate before a functionally competent complex is formed (Misteli, 2001). The inefficiency of polymerase assembly may create stochasticity at an individual locus.

A further contributor to stochastic gene expression may be the organization of transcription events in transcription factories. These hubs of transcription consist of accumulations of transcription factors to which multiple genes, often located on distinct chromosomes, are recruited (Edelman and Fraser, 2012). Typically only a few hundred such transcription factories are observed in a mammalian cell nucleus. It is possible that some genes need to physically relocate from nucleoplasmic locations to transcription factories. A nominally active gene locus that is not associated with a transcription factory may thus be stochastically silent. The relatively low number of transcription sites makes them a limiting factor in the transcription process and thus a potential mediator of stochastic gene expression.

Epigenetics: And When Epigenetics Is Not Epigenetics

Stochastic effects modulate genome output on short timescales. A mechanism to modulate the hardwired information of genomes on longer timescales is via epigenetics. The Greek-derived "Epi" means over or above, and epigenetic effects are defined as heritable changes in genome activity caused by mechanisms other than changes in DNA sequence. Epigenetic events are mediated by chemical modifications of DNA or core histones in complex patterns by methylation, acetylation, ubiquitination, phosphorylation, etc. These modifications alter gene expression by changing the chromatin surface and in this way affect the binding of regulatory factors. Well-established examples of such effects include the DNA-methylation-dependent binding of the MeCP2 protein and the binding of PHD-domain-containing proteins to trimethylated histone H3 tails. Prominent biological effects based on epigenetic regulation are phenotypic differences between monozygotic twins and imprinted genes that are expressed from only one allele in a diploid organism.

A central tenet in the definition of epigenetic regulation is that its effects are heritable, i.e., transmittable over generations. In fact, the concept of epigenetics was inspired by epidemiological findings that nutrient availability in preadolescents during the 19th century Swedish famine determined the life expectancy of their grandchildren. The epidemiological studies have recently been complemented by controlled laboratory studies in mice (Rando, 2012), and they have been extended to the molecular level by the finding that loss of histone H3K4 trimethylation prolongs lifespan in C. elegans in a heritable fashion for several generations (Greer et al., 2011).

A complicating aspect of epigenetics is that the same modifications that mediate heritable epigenetic regulation may also bring about nonheritable transient modulations of the genome. In fact, the term epigenetic is nowadays often used in a very cavalier manner to refer to any biological effect, heritable or not, that is affected by histone modifications. Even if they are not heritable, histone modifications are biologically relevant modulators of genome function. The system of histone modifications is in many ways akin to the mechanisms by which signal transduction pathways work (Schreiber and Bernstein, 2002).
Just as in signal transduction pathways, posttranslational modifications on histone tails create binding sites that are then recognized by adaptor or reader proteins, which in turn elicit downstream effects such as activation of kinases in the case of signaling cascades or recruitment of transcription factors in the case of histone modifications. In further analogy to the reversible events in signaling pathways, histone modifications can be altered or erased by modifying enzymes. Such transient and reversible modulatory effects of histone modifications have been implicated in every step of gene expression, starting from chromatin remodeling to recruitment of transcription machinery and even to downstream events that were thought to be chromatin independent, such as alternative pre-mRNA splicing (Luco et al., 2011). It is often difficult to determine heritability of these histone modification effects, and it therefore remains unclear how many of them are truly epigenetic. Regardless, DNA and histone modifications are an obvious source of modulation of the information contained in the genome sequence.

Genome Organization as a Modulator of Genome Function

Genomes of course do not exist as linear, naked DNA in the cell nucleus but are organized into higher-order chromatin fibers, chromatin domains, and chromosomes. Many correlations between genome organization and activity have been made, most prominently the findings that transcriptionally active genes are generally located in decondensed chromatin and that transcriptionally repressed genome regions are often found at the nuclear periphery. These observations point to the possibility that the spatial organization of the genome modulates its functional output. But in considering the relationship of genome structure with its function, we are faced with a perpetual chicken-and-egg problem. Does structure drive function, or is structure merely a reflection of function?

Much of the thinking on this topic has been guided by observations on individual genes. How representative these were for the genome as a whole has been a confounding concern. Recent unbiased genome-wide analysis of structure/function relationships has validated the tight link between structure and function.
Large-scale analysis of chromatin structure, histone modifications, and expression profiles shows that genomes are partitioned into well-defined domains that closely correlate with their activity status and the presence of active or repressive histone marks (Sexton et al., 2012). The domains are separated by sharp boundaries marked by particular histone modification patterns and binding sites for chromatin insulator proteins such as CTCF. Even stronger evidence comes from the analysis of physical interactions between chromatin domains. At least in fruit flies, functionally equivalent domains tend to preferentially interact; that is, domains containing silent regions cluster in three-dimensional space, as do domains containing active regions (Sexton et al., 2012).

But can genome structure drive its function? The best example for structure-mediated gene expression effects is the silencing of genes when they become juxtaposed to heterochromatin domains, be it in the nuclear interior or at the nuclear periphery (Beisel and Paro, 2011). Gene activity has also frequently been linked to the position of a gene within the cell nucleus. The strongest evidence for such a relationship comes from experiments in which genes are transplanted from the nuclear interior to the lamina, leading to their repression or making them refractory to activation (Geyer et al., 2011). Based on these and similar experiments, it is often quite categorically stated that active genome regions are found in the interior of the nucleus and inactive ones at the periphery. This is a somewhat misleading oversimplification. Although lamina-associated genome regions are generally gene poor and are not transcribed, transcription labeling experiments reveal numerous active transcription sites at the periphery, and genes that are near the periphery, but not physically associated with it, are often active. On the other hand, inactive genes are frequently found in the interior. As far as we can tell, nuclear position per se does not determine activity, but association with repressive regions of the nucleus, be it at the periphery or the interior, does.

So, how then should we think about the chicken-and-egg problem of nuclear structure and function?
How can it be that clear evidence exists for both function-driving-structure as well as for structure-driving-function? The likely answer is that both effects are at play and are part of an overarching principle in which the mutual interplay of structure and function at multiple levels influences gene expression. The fact that there are very few known heterochromatic active genes suggests that a structural change in the form of chromatin decondensation is a crucial early step in gene activation. However, because chromatin states are generally unstable, mechanisms that reinforce a decondensed chromatin state must be in force for a gene to remain active. Such reinforcing mechanisms are dependent on gene activity and represent the function-drives-structure aspect of gene expression. Reinforcement mechanisms might be mediated by what we consider active histone modifications, some of which are known to be deposited during transcription as the polymerases traverse genes. On the flip side, a chromatin domain may also impose its effect on neighboring regions, either in cis on the same chromosome by spreading or in trans on distinct chromosomes. This effect represents the structure-drives-function aspect of genome function. Such a bidirectional, self-enforcing function-structure-function model accounts for most experimental observations on structure-function relationships in gene expression.

Facing the Complexity

Since the discovery of the double helix, we have come to realize that understanding genomes requires more than reading their sequence and that the information contained in the sequence is modulated by the cellular environment. How then do we gain full knowledge of the functional information encoded in genomes? To get a comprehensive picture of the functional output of genomes, the sequence information needs to be integrated with other information parameters such as epigenetic patterns, higher-order chromatin landscapes, and noncoding RNA profiles. The technology to do this is now available, and intense efforts are currently underway to comprehensively gather these data sets in various biological systems. The first examples of such multilevel mapping analyses are emerging, such as the recent flurry of reports from the ENCODE consortium, which has systematically mapped genome properties ranging from histone modification profiles to regulatory elements and chromatin structure (Ecker et al., 2012). Given the scale and complexity of the generated data, not to mention the technical difficulties in gathering it, this is a challenging undertaking that will require a series of progressively larger studies. Ideally, future studies should be designed to systematically map multiple genome properties for focused biological systems such as specific human diseases.

Large-scale mapping of genome-related parameters and their comparison is a logical and necessary next step in the exploration of genomes and their function. These efforts will create invaluable catalogs of genome properties, and the hope is that, by cross-comparing data sets, insight into the rules that govern genome regulation will be gleaned. One can go one step further and advocate for an even more comprehensive approach in which genome expression data are then compared to other cellular characteristics such as proteomic, metabolomic, morphological, and physiological data to systematically link genome activity to biological behavior. The ultimate version of such an approach was recently described in a report by the US National Academies of Sciences entitled "Toward Precision Medicine," which envisioned a fully minable biomedical data repository that would include information ranging from genomic and epigenetic parameters to physiological features and clinical symptoms.

The elegant simplicity of the DNA structure revealed by Watson and Crick is still stunning. True to its promise when it was first discovered, it opened up the floodgates to understanding heredity. But one of the most profound lessons from the ensuing decades of genome exploration must be that the linear arrangement of bases in the DNA is not an absolute set of instructions but is malleable by the cellular environment.
We are just beginning to uncover some of the mechanisms that are responsible for these effects. As is the rule in biology, wherein the whole is often greater than the sum of its parts, we are realizing that the genome is far more complex than the sequence of its DNA.
ACKNOWLEDGMENTS

Due to space limitations, mostly review articles were cited. Work in the author's laboratory is supported by the Intramural Research Program of the National Institutes of Health (NIH), NCI, Center for Cancer Research.

REFERENCES

Becskei, A., Kaufmann, B.B., and van Oudenaarden, A. (2005). Contributions of low molecule number and chromosomal positioning to stochastic gene expression. Nat. Genet. 37, 937–944.

Beisel, C., and Paro, R. (2011). Silencing chromatin: comparing modes and mechanisms. Nat. Rev. Genet. 12, 123–135.

Ecker, J.R., Bickmore, W.A., Barroso, I., Pritchard, J.K., Gilad, Y., and Segal, E. (2012). Genomics: ENCODE explained. Nature 489, 52–55.

Edelman, L.B., and Fraser, P. (2012). Transcription factories: genetic programming in three dimensions. Curr. Opin. Genet. Dev. 22, 110–114.

Geyer, P.K., Vitalini, M.W., and Wallrath, L.L. (2011). Nuclear organization: taking a position on gene expression. Curr. Opin. Cell Biol. 23, 354–359.

Greer, E.L., Maures, T.J., Ucar, D., Hauswirth, A.G., Mancini, E., Lim, J.P., Benayoun, B.A., Shi, Y., and Brunet, A. (2011). Transgenerational epigenetic inheritance of longevity in Caenorhabditis elegans. Nature 479, 365–371.

Larson, D.R., Singer, R.H., and Zenklusen, D. (2009). A single molecule view of gene expression. Trends Cell Biol. 19, 630–637.

Luco, R.F., Allo, M., Schor, I.E., Kornblihtt, A.R., and Misteli, T. (2011). Epigenetics in alternative pre-mRNA splicing. Cell 144, 16–26.

Misteli, T. (2001). Protein dynamics: implications for nuclear architecture and gene expression. Science 291, 843–847.

Raj, A., and van Oudenaarden, A. (2008). Nature, nurture, or chance: stochastic gene expression and its consequences. Cell 135, 216–226.

Rando, O.J. (2012). Daddy issues: paternal effects on phenotype. Cell 151, 702–708.

Schreiber, S.L., and Bernstein, B.E. (2002). Signaling network model of chromatin. Cell 111, 771–778.

Sexton, T., Yaffe, E., Kenigsberg, E., Bantignies, F., Leblanc, B., Hoichman, M., Parrinello, H., Tanay, A., and Cavalli, G. (2012). Three-dimensional folding and functional organization principles of the Drosophila genome. Cell 148, 458–472.


Leading Edge

Minireview
Spinning the Web of Cell Fate
Kevin Van Bortle1 and Victor G. Corces1,*
1Department of Biology, Emory University, Atlanta, GA 30322, USA
*Correspondence: vcorces@emory.edu
http://dx.doi.org/10.1016/j.cell.2013.02.052

Spatiotemporal changes in nuclear lamina composition underlie cell-type-specific chromatin organization and cell fate, suggesting that the lamina forms a dynamic framework critical for genome function, cellular identity, and developmental potential.
Introduction

The incredible complexity and plasticity of eukaryotic genome organization underlie the transformational ability of stem cells to become an array of diverse tissues and differentiated cell types. Interphase chromosomes are spatially arranged into dynamic structures and subcompartments that significantly influence gene activity. The nuclear lamina (NL), for example, preferentially interacts with transcriptionally silent chromatin characterized by low gene density and the absence of active histone modifications. Lamina-associated chromatin domains (LADs) are sharply defined and vary between cell types, suggesting interactions between chromatin and the NL are actively established and dynamically modified during cellular differentiation and development. Nevertheless, to what degree this nonchromatin nuclear structure actively participates in gene regulation and differentiation remains an active area of research. Recent studies by Clowney et al. (2012), Kohwi et al. (2013), and Solovei et al. (2013) provide evidence that spatiotemporal differences in lamina composition and genome architecture underlie developmental competence and differentiation, suggesting the nuclear lamina is directly involved in spinning the web of cell fate.

Chromatin at the Nuclear Lamina

The nuclear lamina is a thin proteinaceous layer of highly conserved intermediate filament proteins, called lamins, which lie at the interface between interphase chromatin and the inner nuclear membrane. Lamins maintain the mechanical integrity and shape of the nucleus and serve as a platform for chromatin organization and gene regulation. Lamins are encoded by three genes in mammals and categorized as either A type (lamin A/C) or B type (lamins B1 and B2). Whereas B type lamins are expressed in essentially all mammalian cell types, A type lamins appear only in a subset of differentiated cell types and at low levels in embryonic stem cells (ESCs) (Eckersley-Maslin et al., 2013). The requirement of A type and/or B type lamins for appropriate nuclear architecture is often cell type specific, suggesting the function of lamins can be differentially utilized in a cell-type- and tissue-specific manner. The flexibility in lamin dependency may also be contingent on the presence of lamin-associated proteins, such as the Lamin-B Receptor (LBR), a nuclear envelope protein that can also anchor heterochromatin to the nuclear periphery (Solovei et al., 2013). Nevertheless, B type lamins are essential for tissue differentiation and organ development, and mutations in A type lamins and lamin-associated proteins cause a wide range of human diseases referred to as laminopathies.

Nuclear Periphery and Gene Repression

In most cell types, the nuclear periphery is associated with transcriptionally silent and late replicating chromatin, a feature that appears to be conserved from yeast to humans. Movement of genes to the nuclear periphery often coincides with gene repression, yet artificial tethering experiments suggest that perinuclear localization is sufficient for downregulation of some, but not all, genes (Burke and Stewart, 2013). The mechanisms responsible for perinuclear gene silencing and the role of lamins remain poorly defined. However, mapping of interactions between chromatin and lamins in vivo using a microarray-based approach indicates that the NL associates with large, sharply defined domains characterized by low gene expression levels (Guelen et al., 2008; Pickersgill et al., 2006). LADs identified in both Drosophila melanogaster and human fibroblasts contain widely spaced, coordinately expressed gene clusters, confirming earlier microscopy-based evidence that the nuclear periphery preferentially interacts with gene-poor regions.

LADs are also partially enriched for repressive H3K9 and H3K27 methylation, and recent genetic screens in Caenorhabditis elegans demonstrate that enzymes involved in H3K9 methylation are essential for sequestering heterochromatin at the nuclear periphery (Towbin et al., 2012). Whether the formation of heterochromatin itself is sufficient to drive perinuclear anchoring is unknown. However, many genes are devoid of repressive histone modifications in human LADs (Guelen et al., 2008), suggesting that mammalian chromatin-lamina interactions are not solely dependent on H3K9 methylation. LAD organization also requires the transcriptional repressor HDAC3, a histone deacetylase targeted to the nuclear periphery by the lamin-associated protein Emerin (Demmerle et al., 2012; Zullo et al., 2012), suggesting the removal and absence of active histone marks are the defining features of peripheral localization.

Understanding the mechanisms by which LADs are established and the molecular link between perinuclear localization and heterochromatin remains a priority for future research. Nevertheless, mapping of LADs in both Drosophila and humans provides preliminary evidence that chromatin insulators, which correlate with physical domain borders and mediate long-range interactions, are involved in peripheral compartmentalization. For example, insulator protein CTCF delineates sharply defined lamina domain borders in human fibroblasts and mouse embryonic stem cells (Guelen et al., 2008; Handoko et al., 2011) and is essential for perinuclear positioning of the cystic fibrosis-relevant CFTR gene (Muck et al., 2012). Functional analysis of LAD-derived DNA sequences in murine fibroblasts further reveals an enrichment for GAGA sequences, which are bound by a transcriptional repressor, cKrox, in complex with HDAC3 and the lamina-interacting protein Lap2β (Zullo et al., 2012). cKrox mediates chromatin-lamina interactions in a cell-type-specific manner, suggesting LADs may be developmentally regulated by differential recruitment of cKrox and other factors. In addition to insulator proteins, a subset of lamina-associated domain borders are enriched for H3K4me3 in the absence of CTCF (Zullo et al., 2012) and delineated by promoters oriented away from LADs (Guelen et al., 2008), suggesting genome-NL domain organization is likely specified by a complex combination of nuclear factors.

Genome-wide mapping studies of chromatin-lamina interactions cannot discriminate between perinuclear associations and interactions that occur within the nucleoplasm, which are also involved in cell proliferation and differentiation (Burke and Stewart, 2013). Interactions between chromatin and nuclear pore proteins, which are often associated with gene activation, similarly take place both in and away from the nuclear periphery, suggesting dynamic movement of perinuclear lamins and pore proteins is an important process in gene regulation. Targeting of lamins to the nucleoplasm appears cell type specific and depends on the expression of different lamin-interacting proteins, yet the dynamics of chromatin-lamina interactions in the nucleoplasm remain ill-defined.
Genome-NL Dynamics through Differentiation and Disease

Gene expression patterns underlying cellular identity must be reprogrammed in order for pluripotent stem cells to give rise to a complex system of tissues and differentiated cell types, a feat accomplished collectively by transcription factors, chromatin and DNA modifications, and 3D rearrangement of chromatin organization. A series of genome-wide mapping experiments carried out in mouse ESCs, sequentially derived neural precursor cells (NPCs), and differentiated astrocytes (ACs) reveal how genome-NL interactions are reorganized during lineage commitment and terminal differentiation (Peric-Hupkes et al., 2010). LADs are surprisingly congruent across cell types, with overlap ranging between 73% and 87%. In a follow-up study, cell-type-independent LADs are shown to be highly conserved between mouse and human ESCs and characterized by high A/T content (Meuleman et al., 2013), suggesting constitutive LADs are specified by interactions between A/T sequence elements and the nuclear lamina. Cell-type invariant NL-interacting sequences are also A/T rich in ESCs, further suggesting that conserved LADs represent an inherited backbone structure for peripheral chromatin contacts.

Nevertheless, localized, cell-type-specific differences in chromatin interactions indicate that some degree of LAD reorganization occurs concomitant with differentiation (Peric-Hupkes et al., 2010). Reorganization of NL interactions from ESCs to NPCs to ACs is largely cumulative, i.e., gene relocation during lineage commitment is maintained during subsequent cell-type transitions. Genes that undergo repositioning are substantially different across differentiation lineages and important for cellular identity, suggesting genome-NL dynamics reflect a progressive, lineage-specific process in which factors important for maintaining pluripotency or involved in cell fate decisions are regulated by locking, or unlocking, genes at the nuclear periphery.

Kohwi et al. (2013) provide supporting evidence that lamins indeed contribute to cell fate decisions through gene repositioning and repression at the nuclear periphery. In Drosophila embryonic neuroblasts, progenitor competence is lost over time, wherein sequential expression of temporal identity genes determines the cell fate of neuronal progeny. The first transcription factor expressed, Hunchback (Hb), specifies early-born U1/U2 neuronal identity within a limited early competence window. Tracking of the hunchback (hb) genomic locus in vivo reveals that the hb gene is gradually and synchronously repositioned to the nuclear lamina, coinciding with the end of the neuroblast early competence window (Kohwi et al., 2013). Depletion of lamin extends neuroblast competence by reducing both hb repositioning and gene silencing, suggesting peripheral compartmentalization and repression of hb is an important determinant of neuronal fate specification and progenitor competence. To what extent lamins are required for developmentally regulated reorganization of other competence-relevant loci will require future exploration. However, disruption of Drosophila lamin also prevents peripheral compartmentalization and repression of testis-specific gene clusters in somatic cells (Shevelyov et al., 2009), supporting a general model in which the nuclear lamina imprisons developmental loci for tissue-specific gene repression.
Independent studies of laminopathies also provide insight into the function of chromatin interactions at the nuclear lamina. For example, Hutchinson-Gilford progeria syndrome (HGPS) is a premature-aging disease caused by progerin, an incompletely processed mutant form of lamin A that promotes abnormal chromatin structure and increased DNA damage. Expression of a GFP-progerin transgene in human mesenchymal stem cells (hMSCs) causes aberrant expression of general and tissue-specific differentiation markers and disrupts the cellular identity, function, and differentiation potential of hMSCs in a manner consistent with phenotypes of HGPS patients (Scaffidi and Misteli, 2008). Examination of cells from HGPS patients further revealed that SKIP, a downstream coactivator of Notch target genes normally sequestered and repressed by the nuclear lamina, loses association with this structure, suggesting aberrant Notch signaling and differentiation abnormalities result from disrupted genome-NL interactions. Indeed, recent sequencing-based mapping of lamin A/C associations reveals that heterochromatin interactions at the NL are reduced genome-wide in HGPS cells, in accordance with microscopy-based evidence (McCord et al., 2013). By integrating profiles for lamin A/C and H3K27me3 with 3D organization changes, McCord et al. (2013) also demonstrate global changes in chromatin compartmentalization in HGPS cells. Changes observed in spatial genome organization correlate with and are preceded by changes in lamin A/C and heterochromatin, providing additional evidence that reduction of H3K27me3 and loss of heterochromatin-lamina interactions underlie changes in chromatin structure and genome function.

Figure 1. Retinal Cell Differentiation and Chromatin Organization
Spatiotemporal differences in the nuclear lamina composition of differentiating retinal cells (left to right) underlie tissue-specific chromatin organization and genome function. Comparison of lamina composition and genome-NL interactions in the nuclei of embryonic stem cells (ESC), progenitor cells, and differentiated retinal rod cells, bipolar neurons, and ganglion cells. Peripheral compartmentalization of heterochromatin is mediated by lamin proteins; euchromatin is largely nucleoplasmic. Restructuring of chromatin-lamina interactions is a gradual and cumulative process, wherein changes that occur during lineage commitment are often maintained in terminally differentiated cells (Peric-Hupkes et al., 2010). In most cell types, lamin A/C and the inner nuclear membrane protein LBR are consecutively transcribed, with LBR expressed early and A type lamins developmentally regulated for expression in differentiated cells (see bipolar neurons and ganglion cells; Solovei et al., 2013). However, neither LBR nor lamin A/C is transcribed in the differentiated rod photoreceptor cells of nocturnal mammals, causing inversion of nuclear architecture with implications for night vision (Solovei et al., 2009).

Additional disease-related mutations in lamins and lamin-associated proteins provide insight into the functional relevance of dynamic genome-NL interactions for tissue differentiation. Emery-Dreifuss muscular dystrophy (EDMD) is a slowly progressing degenerative muscle disease caused by autosomal-dominant or X-linked mutations in LMNA or in the lamin-interacting protein Emerin, respectively. Recapitulation of a severe late-onset EDMD-linked lamin mutation in C. elegans leads to muscle-specific perinuclear retention and repression of transgene-generated heterochromatin carrying a strong muscle-specific promoter (Mattout et al., 2011). The dominant single point mutation in lamin also disrupts tissue-specific expression patterns and leads to defective muscle organization. Together, integration of basic and clinical research suggests that genome-NL interactions are an important regulatory mechanism for controlling cellular identity, differentiation potential, and maintenance of tissue integrity.

Inverted Nuclear Architecture: Learning from Inside Out

In an extreme twist on the relationship between nuclear organization and genome function, specific cell types exhibit an inside-out architecture in which genes and markers of active chromatin are found exclusively at the nuclear periphery and heterochromatin is centrally positioned. Nuclear inversion occurs in the nuclei of mouse retinal rod cells (Solovei et al., 2009), wherein rearrangement of chromatin takes place during terminal differentiation of rod nuclei (Figure 1) and affects the optical properties of the retina by reducing light scattering in the outer nuclear layer. This unusual pattern of nuclear inversion also develops in the rod nuclei of several other nocturnal mammals, suggesting rearrangement of chromatin represents an adaptation for night vision. Nuclear inversion is gradually established over several weeks, and a recent follow-up study (Solovei et al., 2013) suggests that changes in NL composition underlie the dynamic arrangement and maintenance of chromatin organization in the differentiating rod cells.

In most cells, lamin A/C and the inner nuclear membrane protein LBR are consecutively transcribed, with LBR expressed early and A type lamins developmentally regulated for expression in differentiated cell types. Sequential expression of LBR and lamin A/C is common across diverse cell types, and differentiated cells that do not express A type lamins often persistently express LBR. Strikingly, inverted rod nuclei express neither lamin A/C nor LBR, and transgenic expression of LBR preserves establishment of the conventional nuclear architecture in differentiated rod cells (Solovei et al., 2013). Moreover, nonrod cells that do not express lamin A/C undergo inversion in LBR null mice, and all postmitotic cells undergo inversion in double-null LBR-/- LMNA-/- mice, indicating that nuclear inversion is caused by the loss of both LBR and lamin A/C. Transgenic expression of lamin C alone does not prevent inversion in rod nuclei, suggesting that in contrast to LBR, A type lamins require additional lamin-associated factors for establishing heterochromatin tethers. In myoblasts, deletion of A type lamins reduced expression of muscle-related genes, whereas deletion of LBR had a slight opposite effect, indicating that lamin A/C and LBR inversely regulate tissue-specific transcription patterns. Loss of lamin A/C or LBR had comparatively smaller effects on muscle-specific transcription in differentiated muscle, suggesting lamin dynamics are most critical during the early stages of myotube differentiation.
Similar evidence for the role of LBR and NL composition in development and tissue-specific gene expression comes from recent studies in mouse olfactory neurons. Murine olfactory sensory neurons (OSNs), each of which chooses and monoallelically expresses one out of 1,400 olfactory receptor (OR) genes, are organized into subregions of the olfactory epithelium called zones. OR genes in mice are highly similar, and previous findings suggest that OR choice might be determined by dynamic reversal of repressive H3K9 and H4K20 methylation marks along OR clusters (Magklara et al., 2011). Repressed OR loci colocalize with H3K9 and H4K20 marks in differentiation-dependent and OSN-specific nuclear aggregates, which may function to maintain silencing and conceal transcription factor binding sites that might otherwise disrupt transcription of the active OR allele (Clowney et al., 2012). Silenced OR foci are established near the center of OSN nuclei and require the downregulation and removal of LBR, reminiscent of heterochromatin remodeling in differentiating rod photoreceptor cells. Similarly, loss of LBR leads to OR aggregation in non-OSN cells, and OR foci in OSNs are disrupted by ectopic expression of LBR, which causes decompaction of OSN heterochromatin and coexpression of many OR genes. Dynamics in NL composition are therefore critical for remodeling and effective silencing of nonchosen OR alleles in olfactory neurons.

Implications for Reprogramming?

The dynamics of nuclear lamina composition during differentiation and the importance of lamins in human health have influenced our evolving view of the nuclear periphery, from a simple framework for nuclear structure to a complex system underlying genome function and development. The spatiotemporal differences in NL composition and nuclear organization also suggest that genome-NL interactions are likely to be an important and understudied feature of dedifferentiation.
Reprogramming of somatic cells into induced pluripotent stem cells remains an inefficient process, and cells that successfully acquire pluripotency do so gradually, through multiple waves of transcription and changes in chromatin and DNA modifications. The similarly slow progression in restructuring of nuclear architecture in differentiating tissues, including repositioning of the hb locus in differentiating neuroblasts (Kohwi et al., 2013), remodeling of heterochromatin compartmentalization in rod photoreceptor cells and olfactory sensory neurons (Clowney et al., 2012; Solovei et al., 2013), and the gradual loss of lamin A/C interactions and compartmentalization in HGPS cells (McCord et al., 2013), suggests that changes in genome-NL interactions may be the rate-limiting step for both cellular differentiation and reprogramming. The progressive, lineage-specific nature of remodeling also indicates that peripheral compartmentalization is altered in intermediate steps, perhaps relying on multiple rounds of cell division. The important role of NL composition and dynamic genome-NL interactions in early differentiation suggests that the nuclear architecture established in somatic cells may also represent a barrier to reprogramming, where factors important for maintaining pluripotency are locked away. It is therefore conceivable that understanding the step-wise progression of chromatin-lamina alterations and NL composition differences concomitant with lineage commitment and terminal differentiation might serve as a guide for how to find our way back.

ACKNOWLEDGMENTS

Work in the authors' laboratory is supported by NIH award R01GM035463. The content is solely the responsibility of the authors and does not necessarily represent the official views of the NIH.

REFERENCES

Burke, B., and Stewart, C.L. (2013). The nuclear lamins: flexibility in function. Nat. Rev. Mol. Cell Biol. 14, 13–24.
Clowney, E.J., LeGros, M.A., Mosley, C.P., Clowney, F.G., Markenscoff-Papadimitriou, E.C., Myllys, M., Barnea, G., Larabell, C.A., and Lomvardas, S. (2012). Nuclear aggregation of olfactory receptor genes governs their monogenic expression. Cell 151, 724–737.
Demmerle, J., Koch, A.J., and Holaska, J.M. (2012). The nuclear envelope protein emerin binds directly to histone deacetylase 3 (HDAC3) and activates HDAC3 activity. J. Biol. Chem. 287, 22080–22088.
Eckersley-Maslin, M.A., Bergmann, J.H., Lazar, Z., and Spector, D.L. (2013). Landes highlights. Nucleus 4, 12.
Guelen, L., Pagie, L., Brasset, E., Meuleman, W., Faza, M.B., Talhout, W., Eussen, B.H., de Klein, A., Wessels, L., de Laat, W., and van Steensel, B. (2008). Domain organization of human chromosomes revealed by mapping of nuclear lamina interactions. Nature 453, 948–951.
Handoko, L., Xu, H., Li, G., Ngan, C.Y., Chew, E., Schnapp, M., Lee, C.W., Ye, C., Ping, J.L., Mulawadi, F., et al. (2011). CTCF-mediated functional chromatin interactome in pluripotent cells. Nat. Genet. 43, 630–638.
Kohwi, M., Lupton, J.R., Lai, S.L., Miller, M.R., and Doe, C.Q. (2013). Developmentally regulated subnuclear genome reorganization restricts neural progenitor competence in Drosophila. Cell 152, 97–108.
Magklara, A., Yen, A., Colquitt, B.M., Clowney, E.J., Allen, W., Markenscoff-Papadimitriou, E., Evans, Z.A., Kheradpour, P., Mountoufaris, G., Carey, C., et al. (2011). An epigenetic signature for monoallelic olfactory receptor expression. Cell 145, 555–570.
Mattout, A., Pike, B.L., Towbin, B.D., Bank, E.M., Gonzalez-Sandoval, A., Stadler, M.B., Meister, P., Gruenbaum, Y., and Gasser, S.M. (2011). An EDMD mutation in C. elegans lamin blocks muscle-specific gene relocation and compromises muscle integrity. Curr. Biol. 21, 1603–1614.
McCord, R.P., Nazario-Toole, A., Zhang, H., Chines, P.S., Zhan, Y., Erdos, M.R., Collins, F.S., Dekker, J., and Cao, K. (2013). Correlated alterations in genome organization, histone methylation, and DNA-lamin A/C interactions in Hutchinson-Gilford progeria syndrome. Genome Res. 23, 260–269.
Meuleman, W., Peric-Hupkes, D., Kind, J., Beaudry, J.B., Pagie, L., Kellis, M., Reinders, M., Wessels, L., and van Steensel, B. (2013). Constitutive nuclear lamina-genome interactions are highly conserved and associated with A/T-rich sequence. Genome Res. 23, 270–280.
Muck, J.S., Kandasamy, K., Englmann, A., Günther, M., and Zink, D. (2012). Perinuclear positioning of the inactive human cystic fibrosis gene depends on CTCF, A-type lamins and an active histone deacetylase. J. Cell. Biochem. 113, 2607–2621.
Peric-Hupkes, D., Meuleman, W., Pagie, L., Bruggeman, S.W., Solovei, I., Brugman, W., Gräf, S., Flicek, P., Kerkhoven, R.M., van Lohuizen, M., et al. (2010). Molecular maps of the reorganization of genome-nuclear lamina interactions during differentiation. Mol. Cell 38, 603–613.
Pickersgill, H., Kalverda, B., de Wit, E., Talhout, W., Fornerod, M., and van Steensel, B. (2006). Characterization of the Drosophila melanogaster genome at the nuclear lamina. Nat. Genet. 38, 1005–1014.
Scaffidi, P., and Misteli, T. (2008). Lamin A-dependent misregulation of adult stem cells associated with accelerated ageing. Nat. Cell Biol. 10, 452–459.
Shevelyov, Y.Y., Lavrov, S.A., Mikhaylova, L.M., Nurminsky, I.D., Kulathinal, R.J., Egorova, K.S., Rozovsky, Y.M., and Nurminsky, D.I. (2009). The B-type lamin is required for somatic repression of testis-specific gene clusters. Proc. Natl. Acad. Sci. USA 106, 3282–3287.

Solovei, I., Kreysing, M., Lanctôt, C., Kösem, S., Peichl, L., Cremer, T., Guck, J., and Joffe, B. (2009). Nuclear architecture of rod photoreceptor cells adapts to vision in mammalian evolution. Cell 137, 356–368.
Solovei, I., Wang, A.S., Thanisch, K., Schmidt, C.S., Krebs, S., Zwerger, M., Cohen, T.V., Devys, D., Foisner, R., Peichl, L., et al. (2013). LBR and Lamin A/C Sequentially Tether Peripheral Heterochromatin and Inversely Regulate Differentiation. Cell 152, 584–598.

Towbin, B.D., González-Aguilera, C., Sack, R., Gaidatzis, D., Kalck, V., Meister, P., Askjaer, P., and Gasser, S.M. (2012). Step-wise methylation of histone H3K9 positions heterochromatin at the nuclear periphery. Cell 150, 934–947.
Zullo, J.M., Demarco, I.A., Piqué-Regi, R., Gaffney, D.J., Epstein, C.B., Spooner, C.J., Luperchio, T.R., Bernstein, B.E., Pritchard, J.K., Reddy, K.L., and Singh, H. (2012). DNA sequence-dependent compartmentalization and silencing of chromatin at the nuclear lamina. Cell 149, 1474–1487.


Minireview
Uncovering Nuclear Pore Complexity with Innovation
Rebecca L. Adams1 and Susan R. Wente1,*
1Department of Cell and Developmental Biology, Vanderbilt University School of Medicine, Nashville, TN 37232, USA
*Correspondence: susan.wente@vanderbilt.edu
http://dx.doi.org/10.1016/j.cell.2013.02.042


Advances in imaging and reductionist approaches have provided a high-resolution understanding of nuclear pore complex structure and transport, revealing unexpected mechanistic complexities based on nucleoporin functions and specialized import and export pathways.
First impressions can be misleading. Pioneering transmission electron microscopy (EM) approaches 60 years ago first revealed a structure within the eukaryotic nuclear envelope (NE): the nuclear pore complex (NPC) (Gall, 1954) (Figure 1A). The original view is striking yet deceptively simple, with the 100 MDa proteinaceous NPC assembly spanning the NE to provide a passageway between the nucleus and cytoplasm. Over time, insights into NPC structure and function have revealed unexpected complexities.

NPC pathways for nucleocytoplasmic transport are based on the type of cargo. Diffusion through NPCs is inhibited for molecules > 40 kDa; transport of larger macromolecules and/or accumulation against a concentration gradient requires facilitated transport (Aitchison and Rout, 2012). Nuclear RNAs are actively exported for function in the cytoplasm, whereas nuclear import is required for proteins made in the cytoplasm during interphase. Increased eukaryotic proteome and RNA repertoires have expanded the range and bulk of macromolecules that require facilitated transport through NPCs. Based on the plethora of physiological needs for proper gene expression, the NPC must be a robust and selective portal.

Do all NPCs in a given cell and all transport pathways in a given NPC function the same? Recent work uncovers unanticipated layers of complexity in NPC structure and function. High-resolution imaging has allowed dynamic visualization of NPC transport events, whereas reductionist approaches pinpoint how both complex and simple components contribute to transport pathway specialization. How such specialization might contribute to the transport mechanism and high cargo load capacity is intriguing. This also sets the stage for future studies taking into account possible heterogeneity between NPCs.

Insights Gained from High-Resolution NPC Structures

The original EM views of the NPC documented a simple structure with 8-fold rotational symmetry in the plane of the NE.
Details of cytoplasmic filaments and a nuclear basket structure were defined by scanning EM (Aitchison and Rout, 2012) (Figure 1C). Leaps in structural resolution come from a combination of X-ray crystallography studies of NPC proteins (Nups) (Bilokapic and Schwartz, 2012) and high-resolution cryoelectron tomography (cryo-ET) of NPCs in intact NEs, with cryo-ET work yielding a 6.6 nm resolution image of the human NPC (Maimon et al., 2012). Coupling these with strategies to individually pinpoint different Nups may allow crystal structures of components to be modeled into the entire NPC. Tour de force analysis of most yeast Saccharomyces cerevisiae (S. cerevisiae) Nups (NPC-wide) by parallel structural and biochemical approaches enabled in silico computational modeling, generating insights into NPC molecular architecture (Alber et al., 2007).

Importantly, whereas previous low-resolution studies show conservation of structure between humans and other eukaryotes, high-resolution cryo-ET unravels subtle differences in divergent NPCs. Variations in the cavities near the periphery of the central transport channel suggest functional divergence in this part of the NPC (Maimon et al., 2012). These may arise from the few protein composition differences across species.

Innovations in super-resolution light microscopy should allow Nup localization to be examined at an EM-level resolution. These methods have already permitted visualization of the 8-fold symmetry of Nups in fixed cells (Löschberger et al., 2012) (Figure 1B) and direct live cell observations of the asymmetric nuclear-cytoplasmic distribution of Nups in NPCs (Hayakawa et al., 2012). Further studies employed to map Nups in NPCs could establish how specific Nup subcomplexes are oriented in NPCs.

Functional Complexity Revealed by NPC-wide Analysis

Most of the S. cerevisiae and human NPC-constituting proteins were identified a decade ago. The 30 proteins are grouped into three functional classes (Terry and Wente, 2009): transmembrane Nups that anchor the NPC in the NE, also called pore membrane proteins (Poms); structural Nups that stabilize the NE curvature at nuclear pores and provide scaffolding for assembling other peripheral Nups; and FG Nups that contribute to the permeability barrier for nonspecific transport and facilitate movement as direct binding sites for transport receptors.
Nups adopt a limited variety of structural folds such as β-propeller, α-solenoid, or FG domains (Aitchison and Rout, 2012; Bilokapic and Schwartz, 2012). Parts of this simple structural assembly reflect the Nups' ancestral relationship with vesicle coat complexes. Thus, this complex machine derives its function through surprisingly simple structural elements.

Figure 1. NPC Structure and Transport


(A) Early EM image of the NPC cytoplasmic face in a salamander oocyte NE. Reprinted with permission from Gall (1954). Scale bar, 100 nm.
(B) 8-fold symmetry of the NPC in the NE plane resolved by dSTORM microscopy. Lumenal domain of the transmembrane Nup gp210 (magenta) and the FG Nups (green) in a Xenopus oocyte NE. Reprinted with permission from Löschberger et al., 2012. Scale bar, 100 nm.
(C) Schematic of NPC architecture. Measurements indicate dimensions for human NPC from cryo-ET (Maimon et al., 2012).
(D) Transport pathways through the NPC, with distinct FG Nup requirements for karyopherin transport versus mRNA export (Terry and Wente, 2009). Protein transport occurs in 10 ms (Yang and Musser, 2006), whereas mRNA export takes 180 ms (Grünwald and Singer, 2010). Transport cargo sizes to scale with NPC: protein cargo as 80 kDa globular shape, mRNP size proportional to the transcript length and shown covered with RNA-binding proteins, green circles. CBP: 5′ cap-binding protein complex.

The complexity in NPC function comes from several elements. First, different Nups are associated with NPCs for different time periods. Structural Nups are among the most stable proteins in a cell, persisting for months or years in a nondividing cell (Savas et al., 2012); moreover, these remain stably NPC associated once assembled into the NPC (Rabut et al., 2004). In contrast, FG Nups are highly dynamic (Rabut et al., 2004), with residence times in the NPC of seconds to minutes. It is unknown how this dichotomy in association times for different components might affect transport.

Second, NPC cargo load can alter the transport mechanism. Single-molecule microscopy studies show that increasing concentrations of the importin-β transport receptor alter the transport time of both its cargo and molecules that passively diffuse (Yang and Musser, 2006). It is intriguing to consider that the environment of a given transport channel might be temporally impacted due to either cargo load or the specific associated FG Nups.

Third, diversity in function among the FG Nups is illuminated by several key NPC-wide studies. FG Nups have been considered to be interchangeable and of uniform function due to their common attributes. FG Nups contain motifs enriched in phenylalanine (F) and glycine (G) repeats, such as FXFG and GLFG (L, leucine; X, any amino acid); the spacer sequences between FG repeats consist of 5–30 residues that are typically enriched in polar amino acids. Analyses to date indicate that FG domains are unstructured and occupy the central NPC channel (Terry and Wente, 2009; Yamada et al., 2010; Aitchison and Rout, 2012). Although these FG domains constitute 12% of the NPC mass, they are not resolved in high-resolution structures. EM analysis of anti-Nup immunogold-labeled NPCs indicates that a single FG domain type occupies multiple topologies (Fahrenkrog et al., 2002). Thus, all FG Nups may share an unexpected structural flexibility as a defining feature.

Several notable distinctions are also defined among the FG domains. NPC-wide analyses of biochemical and biophysical properties of individual FG domains or subdomains show differences in cohesive properties in terms of self- and inter-FG interactions and in levels of compaction (collapsed versus random coil) (Yamada et al., 2010).
In vivo evidence reveals distinct functions for FG domains. In an analysis of FG domain deletion mutants, S. cerevisiae viability required only specific combinations of FG domains; individual ones were dispensable, with only a few required in higher-order mutant combinations (Terry and Wente, 2009). Importantly, FG domain deletion mutants were defective in specific nuclear transport pathways. For example, an FG deletion mutant defective in Kap121 import was competent for mRNA export and vice versa (Terry and Wente, 2009). Recently, in a Xenopus in vitro system, Nup98 was shown to be necessary for generation of the permeability barrier that inhibits diffusion of macromolecules (Hülsmann et al., 2012). Without the Nup98 FG domain, only substitution with another cohesive FG domain restored the barrier. That the permeability barrier function could be attributed to one specific FG Nup provides further evidence that FG Nups are neither all the same nor interchangeable.

A final layer of complexity stems from Nup posttranslational modifications. It is known that vertebrate FG Nups are modified by O-linked glycosylation, and this may regulate the vertebrate NPC permeability barrier (Labokha et al., 2012). Nup98 phosphorylation is an initial step in the breakdown of the NPC during open mitosis (Laurell et al., 2011). Phosphorylation increases permeability of the NPC either through altering the conformation of the Nup98 GLFG domain or through inducing its dissociation from NPCs (Hülsmann et al., 2012). In an NPC-wide analysis of ubiquitylation carried out in S. cerevisiae (Hayakawa et al., 2012), this modification was discovered on almost all Nups. Interestingly, proper nuclear migration during mitosis requires Nup159 ubiquitylation. Future work should reveal how these layers of complexity impact nuclear transport function.

Dynamic and Diverse Transport Pathways Uncovered within NPCs

NPC translocation is defined by docking, translocation, and release steps for cargo complexes (Aitchison and Rout, 2012). Proteins typically display a nuclear localization sequence (NLS) for entry or nuclear export sequence (NES) for exit. These motifs provide binding sites for transport receptors (karyopherins, importins, exportins, and transportins). RNA transport receptors either recognize the RNA directly (tRNA and miRNA) or interact with an RNA-binding adaptor protein (in the mRNA ribonucleoprotein [mRNP] complex). In addition to cargo interactions, transport receptors also contain hydrophobic pockets that bind the phenylalanine residues of FG domains (Terry and Wente, 2009).

Alternative models for how transport receptor-FG interactions mediate NPC translocation are under investigation. However, the understanding of how transport directionality is dictated has reached better consensus. For karyopherins, accumulation of cargo against its concentration gradient and recycling of the transport receptor are based on localized control of Ran GTPase activity (GTP state in the nucleus and GDP in the cytoplasm). Specifically, the importin-cargo complex binding to Ran-GTP in the nucleus causes cargo release. In contrast, a RanGTP-exportin-cargo complex disassembles in the cytoplasm with GTP hydrolysis (Aitchison and Rout, 2012). An analogous non-RanGTP mechanism exists for mRNA export by the NXF1 receptors (S. cerevisiae Mex67), wherein ATP/ADP cycling of an RNA-dependent DEAD box ATPase (Dbp, or DDX) localized on the NPC cytoplasmic filaments drives directional transport (Folkmann et al., 2011). Overall, directional facilitated translocation is dictated by spatially controlled, nucleotide-dependent switches at exit sites.

The requirements of different FG Nups for specific transport receptors underscore the potential for multiple preferential pathways existing in an NPC (Figure 1D) (Terry and Wente, 2009). Whether the active and passive transport pathways are both functionally and spatially distinct in the NPC central channel has been debated. Recent microscopy technologies have documented real-time single translocation events (Yang and Musser, 2006; Grünwald and Singer, 2010; Lowe et al., 2010; Mor et al., 2010; Ma et al., 2012) based on both high spatial and temporal resolution coupled with single-molecule innovations for specific protein cargo labeling such as large quantum dots (Lowe et al., 2010). NPC interaction times during facilitated protein transport were measured as 10 ms, with a reported range of 2–34 ms (Yang and Musser, 2006), with RanGTP driving release of large cargo from the NPC (Lowe et al., 2010). These approaches have also allowed mapping of NPC transport pathways, and recent studies suggest that importin-β cargo moves more peripherally to the central NPC channel, as compared to diffusive cargo (Figure 1D) (Ma et al., 2012).

Single mRNAs have also been observed moving across the NPC by engineering sequence-specific RNA stem loops into endogenous or inducible transcripts and by coexpressing fluorescently tagged MS2 RNA stem-loop-binding proteins (Grünwald and Singer, 2010; Mor et al., 2010). Here, the observed time frame for mRNA transport through the pore is 180 ms (Grünwald and Singer, 2010) to 500 ms (Mor et al., 2010), with nuclear and cytosolic rate-limiting steps (Grünwald and Singer, 2010). The rate-limiting interval at the cytoplasmic face is likely due to mRNP remodeling to promote directionality. Although both fast and slow (>800 ms) transport rates are observed for a single mRNA type (Grünwald and Singer, 2010), mRNP translocation through the NPC occurred 15-fold faster than diffusion through the nucleus (Mor et al., 2010).

Comparing the transport of protein and mRNA reveals differences, with a longer duration for mRNA transport across the NPC that is possibly due to the size differences in the respective protein versus mRNP cargos (Figure 1D). mRNA export also has a rate-limiting step at the NPC entry site that might be attributed to the mRNA quality control and surveillance mechanisms prior to export. For protein and mRNA transport single-molecule experiments, a striking common conclusion is that cargo enters the NPC and explores the channel in a diffusive/subdiffusive manner with observed back and forth movements. This suggests the lack of a straight path through the NPC and that movement itself is not inherently directional. It is remarkable that the transport events are most often unsuccessful (Grünwald and Singer, 2010; Yang and Musser, 2006), raising the question of how the NPC accommodates not only a large number of successful transport events, but also an even larger number of unsuccessful events.
Models Impacted by Nuclear Pore Complexity and Heterogeneity

The NPC's inherent complexity has favored reductionist approaches to gain molecular insights into transport mechanisms. Innovations include the development of in vitro nanopores and hydrogels for testing the selective barrier properties with transport receptors and cargo. In a nanopore approach, recombinant FG domains were coupled to a small nanopore (30 nm holes) (Jovanovic-Talisman et al., 2009). In contrast, the hydrogels self-formed under experimentally determined conditions with recombinant FG domains (Labokha et al., 2012). These strategies demonstrated that FG domains are sufficient for allowing selective passage of transport receptors. A recent hydrogel study characterized individual FG domains of Xenopus laevis on an NPC-wide level, finding that resulting hydrogels had different capacities for selective transport (Labokha et al., 2012). To effectively mimic the heterogeneous and dynamic NPC environment, these systems will require constructing single nanopores and hydrogels with multiple different FG domains included. Because of the now known complexity, one FG domain type cannot be considered in isolation; nor are all FG domains the same.

Several different models have been proposed for the mechanism of NPC translocation. These differ in how the intermolecular interactions between FG domains contribute to facilitated transport and a selective barrier (Terry and Wente, 2009; Aitchison and Rout, 2012; Hülsmann et al., 2012). For example, the entropic barrier model suggests that unstructured FG domains function to exclude noninteracting molecules. Alternatively, the selective phase model proposes that interdomain hydrophobic interactions form a gel-like meshwork locally dissolved by transport receptor interactions. For both models, work is needed to account for the heterogeneity of FG domains in vivo and in vitro. A hybrid model is also quite appealing, wherein functions for cohesive (for permeability barrier) and noncohesive (for entropic bristles) interactions are considered (Yamada et al., 2010). These complexities provide an exciting challenge for further investigations.

Perspective

Currently, a single mechanism of nuclear transport across the NPC likely does not exist; rather, layers of complexity lead to multiple specialized pathways in a given NPC. Whether different transport pathways allow multiple transport events to take place within a single NPC is still unresolved. Classic EM experiments demonstrated that an individual NPC is capable of carrying out both import and export (Feldherr et al., 1984); however, whether import and export can be simultaneous has not been tested. Tracking single mRNA transcripts reveals transient association with multiple NPCs before exit (Grünwald and Singer, 2010), possibly due to the inherent properties of stochastic cargo movement with the NPC. Alternatively, this might reflect a full cargo load for a given NPC, inhibiting entry and new translocation events. This may also involve the absence of specific factors/Nups at a given NPC or quality control mechanisms detecting incomplete processing of the transcript. To directly address simultaneous transport, a future challenge will be to monitor single-molecule facilitated transport of different cargos at the same time within one cell/NPC.
Though specialized transport pathways exist within the heterogeneous environment of the NPC, it is unclear whether different NPCs in a cell are specialized for distinct types of transport. Distinctions might exist in each NPC as a result of dynamic Nup associations, posttranslational or conformational changes, or temporal changes in expression. There is evidence for differential NPC function in specific animal tissues at specific times in cellular differentiation. A recent study found that a transmembrane Nup (gp210) was absent in proliferating myoblasts but was required for differentiation into neuroprogenitors (D'Angelo et al., 2012). Genome-wide RNA sequencing showed that gp210 expression caused differential regulation of a subset of transcripts without globally affecting NPC transport. How a transmembrane Nup has these effects is unclear; however, NPC function is evidently altered by differential Nup association. Advances in imaging and NPC-wide, or genome-wide, approaches will be needed to further analyze NPC mechanisms of specialization on cellular and organism levels.

Finally, the complexity of Nups extends beyond the NPC, as independent functions have been uncovered for some Nups (Raices and D'Angelo, 2012). Thus, a full understanding of nuclear pore complexity is needed to position the field in evaluating the molecular mechanisms underlying nup mutants linked to human developmental diseases (Raices and D'Angelo, 2012). The wealth of innovations has unveiled NPC structure and function as much more complex than anticipated at first glance.
ACKNOWLEDGMENTS

We thank Joe Gall (Carnegie Institution for Science) and Markus Sauer (Julius-Maximilians-University Würzburg) for permission to reprint the images in Figures 1A and 1B, and we thank Wente laboratory members and Elizabeth Bowman for discussion. Due to space constraints, we regret not being able to cite all primary references. The authors were supported by grants from the National Institutes of Health (R37GM051219 [S.R.W.] and T32HD007502 [R.L.A.]).

REFERENCES

Aitchison, J.D., and Rout, M.P. (2012). Genetics 190, 855–883.
Alber, F., Dokudovskaya, S., Veenhoff, L.M., Zhang, W., Kipper, J., Devos, D., Suprapto, A., Karni-Schmidt, O., Williams, R., Chait, B.T., et al. (2007). Nature 450, 695–701.
Bilokapic, S., and Schwartz, T.U. (2012). Curr. Opin. Cell Biol. 24, 86–91.
D'Angelo, M.A., Gomez-Cavazos, J.S., Mei, A., Lackner, D.H., and Hetzer, M.W. (2012). Dev. Cell 22, 446–458.
Fahrenkrog, B., Maco, B., Fager, A.M., Köser, J., Sauder, U., Ullman, K.S., and Aebi, U. (2002). J. Struct. Biol. 140, 254–267.
Feldherr, C.M., Kallenbach, E., and Schultz, N. (1984). J. Cell Biol. 99, 2216–2222.
Folkmann, A.W., Noble, K.N., Cole, C.N., and Wente, S.R. (2011). Nucleus 2, 540–548.
Gall, J.G. (1954). Exp. Cell Res. 7, 197–200.
Grünwald, D., and Singer, R.H. (2010). Nature 467, 604–607.
Hayakawa, A., Babour, A., Sengmanivong, L., and Dargemont, C. (2012). J. Cell Biol. 196, 19–27.
Hülsmann, B.B., Labokha, A.A., and Görlich, D. (2012). Cell 150, 738–751.
Jovanovic-Talisman, T., Tetenbaum-Novatt, J., McKenney, A.S., Zilman, A., Peters, R., Rout, M.P., and Chait, B.T. (2009). Nature 457, 1023–1027.
Labokha, A.A., Gradmann, S., Frey, S., Hülsmann, B.B., Urlaub, H., Baldus, M., and Görlich, D. (2012). EMBO J. 32, 204–218.
Laurell, E., Beck, K., Krupina, K., Theerthagiri, G., Bodenmiller, B., Horvath, P., Aebersold, R., Antonin, W., and Kutay, U. (2011). Cell 144, 539–550.
Löschberger, A., van de Linde, S., Dabauvalle, M.C., Rieger, B., Heilemann, M., Krohne, G., and Sauer, M. (2012). J. Cell Sci. 125, 570–575.
Lowe, A.R., Siegel, J.J., Kalab, P., Siu, M., Weis, K., and Liphardt, J.T. (2010). Nature 467, 600–603.
Ma, J., Goryaynov, A., Sarma, A., and Yang, W. (2012). Proc. Natl. Acad. Sci. USA 109, 7326–7331.
Maimon, T., Elad, N., Dahan, I., and Medalia, O. (2012). Structure 20, 998–1006.
Mor, A., Suliman, S., Ben-Yishay, R., Yunger, S., Brody, Y., and Shav-Tal, Y. (2010). Nat. Cell Biol. 12, 543–552.
Rabut, G., Doye, V., and Ellenberg, J. (2004). Nat. Cell Biol. 6, 1114–1121.
Raices, M., and D'Angelo, M.A. (2012). Nat. Rev. Mol. Cell Biol. 13, 687–699.
Savas, J.N., Toyama, B.H., Xu, T., Yates, J.R., 3rd, and Hetzer, M.W. (2012). Science 335, 942.
Terry, L.J., and Wente, S.R. (2009). Eukaryot. Cell 8, 1814–1827.
Yamada, J., Phillips, J.L., Patel, S., Golden, G., Calestagne-Morelli, A., Huang, H., Reza, R., Acheson, J., Krishnan, V.V., Newsam, S., et al. (2010). Mol. Cell. Proteomics 9, 2205–2224.
Yang, W., and Musser, S.M. (2006). J. Cell Biol. 174, 951–961.


Minireview
Enclosing Chromatin: Reassembly of the Nucleus after Open Mitosis
Cornelia Wandke1 and Ulrike Kutay1,*
1Institute of Biochemistry, Department of Biology, ETH Zurich, Schafmattstrasse 18, 8093 Zurich, Switzerland
*Correspondence: ulrike.kutay@bc.biol.ethz.ch
http://dx.doi.org/10.1016/j.cell.2013.02.046

Leading Edge

During mitosis in vertebrate cells, the nuclear envelope undergoes extensive structural reorganization, starting with the retraction of nuclear membranes into the ER at mitotic onset and ending with the re-enclosure of chromatin by ER-derived membranes during mitotic exit. Here, we review our current understanding of postmitotic nuclear assembly.

Nuclear and cytoplasmic processes are separated from each other by the nuclear envelope (NE), a double membrane perforated by nuclear pore complexes (NPCs). Extensive reorganization of the NE accompanies many forms of cell division and is required for spindle assembly. To establish the mitotic spindle, microtubules (MTs) need to gain access to chromatin. In open mitosis, as employed by most metazoan cells, the spindle is built in the cytoplasm, and chromatin is exposed to cytoplasmic MTs by NE breakdown (NEBD). In extreme forms of open mitosis, e.g., in vertebrate cells, the nuclear compartment is completely taken apart, including the disintegration of NPCs and the dispersal of NE membranes into the ER. Consequently, to re-establish nucleocytoplasmic compartmentalization, the NE needs to be reassembled around the segregated mass of chromatin in the future daughter cells. This relies on the general spatiotemporal orchestration of mitotic exit and requires coordination of chromosome decondensation, membrane recruitment to chromatin, and NPC assembly. Here, we discuss models of mitotic NE/ER remodeling and nuclear assembly, focusing on vertebrate systems, and highlight the crucial role of phosphatases as spatial and temporal regulators of nuclear reformation.

NE Reformation
Temporal control of nuclear reassembly is exerted by the machinery governing mitotic exit and relies on the inactivation of mitotic kinases like CDK1, as well as on the action of protein phosphatases. These phosphatases revert phosphorylation events that have driven NEBD during mitotic entry. The catalytic phosphatase subunits are constitutively active but restricted in their intracellular localization, substrate specificity, and overall activity through association with a large range of regulatory subunits. These regulatory subunits thus contribute to both temporal and spatial control.

Another aspect of spatial control is conferred by RanGTP generation around chromatin through the action of the chromatin-bound RanGEF. Certain nucleoporins and membrane proteins are kept in a reassembly-incompetent state in the mitotic cytosol by association with RanGTP-binding import receptors. In the vicinity of chromatin, RanGTP triggers their release from inhibitory importins and helps to spatially confine formation of NPCs and the NE to the surface of chromatin (reviewed in Güttinger et al., 2009).

Nuclear Membrane Formation: From ER Sheets or Tubules?
Early experiments using Xenopus egg extracts had suggested that membranes utilized for postmitotic NE reformation originate from vesicles. In vitro, these vesicles bind to chromatin, then flatten and fuse, thereby forming the NE. However, it became evident later that these vesicles represent a peculiarity of the in vitro system, as they arise by fragmentation of the fragile ER network during fractionation of Xenopus eggs. It is now commonly accepted that NE reformation in vivo involves membranes derived from the mitotic ER, which start to be recruited back to chromatin in late anaphase. The rapid reattachment of ER membranes to chromatin is protein mediated and redundantly facilitated by several inner nuclear membrane (INM) proteins (Anderson et al., 2009). The best-studied examples are the lamin B receptor (LBR), which is recruited to chromatin by interaction with core histones H3/H4 and heterochromatin-binding protein 1 (HP1), as well as the LEM domain proteins Lap2β, Emerin, and MAN1, which bind the chromatin-associated barrier-to-autointegration factor (BAF) in a cell-cycle-regulated manner. A number of membrane proteins, including LBR and the nucleoporins POM121 and NDC1, may also directly bind to DNA as it becomes more exposed during chromatin decondensation (reviewed in Güttinger et al., 2009). The redundant involvement of many membrane proteins that reassociate with chromatin/DNA during NE formation ensures fast and robust nuclear reassembly.

Although the role of INM proteins in NE reformation is undisputed, controversy exists on whether ER membranes approach chromatin as tubules or sheet-like structures (Figure 1A). This dispute is more than a scholarly quarrel over different experimental observations because the mode of NE reformation may affect the mechanism of postmitotic NPC assembly (see below).

Confocal microscopy and electron tomographic analysis of chemically fixed samples had first indicated that the mitotic ER is entirely tubular (Puhka et al., 2007). In agreement with this description, Hetzer and coworkers observed tips of ER tubules in the vicinity of chromatin during late anaphase, suggesting

Figure 1. Nuclear Envelope Reformation


(A) The NE is reformed from ER membranes, which contact chromatin either as tubules or sheets.
(B) The enclosure and insertion models of postmitotic NPC assembly. For enclosure, pre-NPCs containing Nup107-160 complexes assemble on the chromatin surface, are engulfed by membranes, and mature into NPCs. Insertion relies on INM-ONM fusion in NE sheets. These holes are then occupied by Nups, which assemble stepwise into mature NPCs.
(C) Membranes start binding back to chromatin via future INM proteins during late anaphase. RanGTP releases soluble Nups and membrane proteins from inhibitory importins in the vicinity of chromatin, thereby contributing to spatial control of NE assembly. Phosphatases confer temporal and spatial control by reverting inhibitory phosphorylations on chromatin and NE proteins. The regulatory subunit Repo-Man targets PP1γ to chromatin; H3 becomes dephosphorylated, allowing for chromatin restructuring and binding of HP1. Both HP1 and H3 interact with the INM protein LBR, connecting chromatin to the NE. PP2A is recruited to membranes by LEM4, which inhibits the BAF kinase VRK-1 (not shown) and promotes dephosphorylation of BAF by PP2A. This drives the interaction of BAF with chromatin and other LEM domain proteins.

that ER tubules might serve as a source of NE membranes. Reinforcing this idea, a preformed, largely tubular ER network could efficiently support NE assembly in vitro (Anderson and Hetzer, 2007). To form the sheet-like structure of the NE, chromatin-bound tubules must at some point flatten, expand, and seal on the chromatin surface, involving chromatin/DNA interactions of membrane proteins. Accordingly, overexpression of ER tubule-forming proteins such as reticulons and DP1 delayed NE formation and nuclear expansion in mammalian cells, whereas their depletion accelerated the formation of a closed NE (Anderson and Hetzer, 2008). These experiments were taken as an indication that remodeling of the ER from tubules to sheets could present a rate-limiting step in nuclear assembly. Yet, the same results would be expected if a sheet-like morphology were required for NE reformation in the first place.

Recent studies have indeed questioned the existence of a primarily tubular mitotic ER network. Using spinning-disk confocal microscopy and EM tomography after high-pressure freezing, Kirchhausen and coworkers demonstrated in various cell types that the mitotic ER is almost entirely composed of extended sheets, or cisternae, which are continuous with the nascent NE in the chromatin periphery (Lu et al., 2009, 2011). Subsequent re-evaluation of EM fixation methods and analyses of additional cell lines by Puhka et al. (2012) revealed cell-type-specific variations of mitotic ER structure, softening some of the controversy. Taking these results together, cisternal organization of the mitotic ER predominates in the majority of studied cells, with some cell types displaying a transition to fenestrated ER sheets and more tubular networks in mitosis.

But how does the ER approach anaphase chromatin, in the form of sheets or tubules? Lu et al. (2011) indeed observed sheet-like structures on the surface of chromatin during NE reformation and suggest that ER cisternae attach to chromatin for reassembly of the NE. In contrast, chromatin-proximal ER tubules seemed incompetent in generating NE membranes. Still, the binding of an ER sheet to anaphase chromatin has not yet been visualized at sufficient temporal and spatial resolution to exclude a tubule-to-sheet transition, and additional evidence will be required to strengthen the sheet hypothesis.

NPC Assembly: Insertion or Enclosure?
Concomitantly with the attachment of membranes to chromatin, NPCs assemble into the growing NE. The recruitment of nucleoporins to chromatin is spatially controlled by RanGTP-dependent release of inhibitory importins in the vicinity of chromatin and timely dephosphorylation of nucleoporins by uncharacterized protein phosphatases. Postmitotic NPC formation is a stepwise process that has been visualized by live-cell imaging and recapitulated in vitro using Xenopus egg extracts and sperm chromatin. An initial event is the binding of the large Nup107-160 NPC scaffolding complex to chromatin, mediated by the nucleoporin ELYS, which directly interacts with AT-rich DNA sequences (reviewed in Güttinger et al., 2009). Immunodepletion of Nup107-160 complexes from in vitro nuclear assembly reactions resulted in nuclei with pore-free, closed NEs (Harel et al., 2003; Walther et al., 2003), highlighting the central role of this subcomplex in coordinating NPC assembly with NE reformation. Originally, it had been suggested that the Nup107-160 complex is seeded onto chromatin in the form of pre-NPCs (Walther et al., 2003). However, careful EM analysis of early NPC assembly intermediates in the absence of membranes failed to visualize ring-like pre-pores but detected smaller Nup107-containing seeds (Rotem et al., 2009), likely made up of single subcomplexes. Ring-shaped NPC structures were only formed when membranes were present, indicating a requirement for membrane components to induce subsequent steps in NPC assembly. These include recruitment of the membrane nucleoporins POM121 and NDC1, which facilitate integration of the central Nup53-93 scaffolding complex. At the same time, Nup98, essential for the transport and barrier properties of NPCs, associates, followed by other FG domain Nups, rendering the NPC competent for various transport pathways (reviewed in Güttinger et al., 2009).

Although an approximate order of nucleoporin recruitment has been established, the mechanism of postmitotic NPC assembly is a matter of debate. Two opposing models have been proposed: enclosure and insertion (Figure 1B). The insertion model suggests that NPCs are introduced into chromatin-associated NE sheets, necessitating a membrane fusion event between INM and ONM for pore formation (Macaulay and Forbes, 1996). Such an NPC insertion mechanism is employed during interphase, when a continuous NE surrounds chromatin; yet the molecular mechanism of INM-ONM fusion is enigmatic. For NPC assembly by insertion, it should not matter whether membranes initially contact chromatin as sheets or tubules, as long as a flattened double membrane is finally available. The enclosure model, in contrast, proposes that chromatin-associated, preassembled NPCs are engulfed by membranes, rendering membrane fusion between INM and ONM unnecessary for postmitotic NPC assembly (Anderson and Hetzer, 2007). Recruitment of large membrane sheets to chromatin could hinder NPC formation by enclosure on the surface of affected chromatin areas.

Forbes and coworkers have recently followed postmitotic nuclear assembly in vitro at low temperatures, which decelerates the reaction and allows for dissecting assembly steps (Fichtman et al., 2010). Under these conditions, chromatin-associated NPC assembly intermediates containing the Nup107-160 complex and POM121 were detected in a fully closed NE. Completion of NPC assembly indeed occurred upon longer incubation, indicating the requirement of INM-ONM fusion. Thus, in
principle, postmitotic assembly can occur by insertion, but it remains to be shown that this is relevant at normal assembly kinetics.

Postmitotic and interphase NPC biogenesis differ in several respects. After mitosis, there is a burst in NPC assembly (2,000 NPCs assemble from pre-existing building blocks within 10 min in cultured somatic cells). Compared to this fast, parallel formation of NPCs, interphase assembly is the culmination of many occasional events that double NPC number until the next mitosis. In interphase, newly synthesized nucleoporin (subcomplexes) must be generated and integrated into a continuous NE. Notably, kinetic measurements have revealed that single NPC assembly events are considerably slower in interphase than after mitosis (Dultz and Ellenberg, 2010). Whether this reflects mechanistic differences between the pathways, i.e., insertion versus enclosure, or simply the fact that de novo synthesis of nucleoporins is rate limiting for interphase NPC assembly is unclear.

Additional dissimilarities have been observed between postmitotic and interphase NPC biogenesis and have been interpreted as evidence for distinct mechanisms underlying both modes of NPC assembly, although none of these studies has directly addressed insertion or enclosure. First, ELYS is necessary for NPC assembly at the end of mitosis but appears dispensable during interphase assembly (Doucet et al., 2010). Second, there are differences in the order of nucleoporin recruitment, i.e., POM121 precedes the integration of Nup107-160 complexes into the NE during interphase, but not after mitosis (Doucet et al., 2010; Dultz and Ellenberg, 2010). Yet, both differences could simply reflect the need for seeding Nup107-160-containing NPC assembly sites on chromatin after open mitosis. Third, proteins or domains involved in sensing or generating membrane curvature are specifically required for interphase NPC formation, potentially indicating a different pore formation mechanism. These include the reticulons (Anderson and Hetzer, 2008), the ALPS motif of Nup133 (Doucet et al., 2010), and a C-terminal membrane-bending domain in Nup53 (Vollmer et al., 2012). It is, however, conceivable that these membrane curvature modules are critical for interphase assembly only because of its slower kinetics and the requirement to stabilize pore assembly intermediates over longer times. Fourth, RNA interference (RNAi) experiments indicated that POM121 and the LINC complex component SUN1 might only be important for interphase and not for postmitotic NPC assembly, perhaps via a direct role in NPC insertion (Doucet et al., 2010; Talamas and Hetzer, 2011). It should be noted, however, that the interpretation of the POM121 RNAi data is controversial, and a postmitotic role for POM121 is supported by depletion (Antonin et al., 2005) and dominant-negative experiments in vitro (Shaulov et al., 2011).

Taken together, progress in recent years has revealed differences between postmitotic and interphase NPC assembly. Still, the jury is out on whether these indeed reflect distinct mechanisms of pore generation, i.e., enclosure versus insertion. All dissimilarities described so far could also arise from differences in chromatin accessibility, NPC assembly kinetics, and availability of components (reservoir versus de novo synthesis). It will be possible to directly distinguish between the enclosure and insertion models for postmitotic assembly once the mechanism of ONM-INM fusion has been delineated.

Phosphatases in Charge of Spatial and Temporal Control
Protein phosphatases directly contribute to NE reformation by reverting phosphorylation of chromatin and NE components, thereby making both sides competent for reassociation (Figure 1C). An important player is protein phosphatase 1γ (PP1γ). During early anaphase, PP1γ is targeted to chromatin by its regulatory subunit Repo-Man, causing dephosphorylation of histone H3 at several sites (Vagnarelli et al., 2011). Likely by dephosphorylating H3 at Ser10, Repo-Man/PP1γ controls the association of HP1 with chromatin, contributing to heterochromatin formation during mitotic exit (Vagnarelli et al., 2011). Interestingly, HP1 has been suggested to promote NE reformation by assisting the recruitment of membranes to chromatin through interaction with the INM protein LBR (Ye et al., 1997). Repo-Man also has another function in nuclear assembly that is independent of PP1 activity, which is to support the recruitment of importin β and Nup153 to the periphery of anaphase chromosomes. Based on these findings, it has been speculated that importin β marks sites on chromatin for NPC assembly (Vagnarelli et al., 2011).

Like PP1, the phosphatase PP2A is required for timely dephosphorylation of CDK1 substrates and also functions as a key factor in postmitotic nuclear reassembly. One critical target of PP2A is BAF. During interphase, BAF provides the chromatin-binding site for LEM domain proteins of the INM. At the onset of mitosis, BAF is phosphorylated by VRK-1, which is proposed to contribute to NEBD both by releasing BAF from chromatin and by weakening the interaction with LEM domain proteins (Gorjánácz et al., 2007). Conversely, recruitment of LEM domain proteins from the ER into the reforming NE requires dephosphorylation of BAF and its reassociation with chromatin. Strikingly, one LEM family member, Lem4/ANKLE2 in human cells and LEM-4L in C. elegans, serves as a membrane-bound platform that coordinates the dephosphorylation of BAF by simultaneously inhibiting the BAF kinase VRK-1 and recruiting PP2A. In this scenario, Lem4 potentially functions as a regulatory PP2A subunit (Asencio et al., 2012). The interactions between Lem4, VRK-1, and PP2A must differ between mitotic entry and exit, suggesting that additional control mechanisms exist upstream of the VRK1-BAF-PP2A-Lem4 axis. It will be interesting to see whether the integrative role of the NE protein Lem4 in regulating kinase and phosphatase activity on a common substrate will emerge as a paradigm for spatial coordination during mitotic exit.

Outlook
Key mechanisms underlying the dynamic changes of the cell nucleus during mitosis have been revealed, yet many exciting questions remain. Nuclear membranes are retracted into the mitotic ER, from where they re-emerge in anaphase, but is the mitotic ER more than an inert spindle shell? Further, it is still controversial whether NPCs are inserted into or rather enclosed by the reforming NE. Similarly, it is debated whether the ER approaches chromatin as tubules or sheets. As the development of membrane probes for time-resolved superresolution microscopy progresses rapidly, these open questions should soon be resolvable. Finally, protein phosphatases coordinate nuclear reassembly in a spatiotemporal manner, but only a few phosphatase-substrate relationships causal for specific steps of nuclear reassembly have been established. Clearly, much remains to be learned in this interesting area, especially with respect to how changes in chromatin translate into competence for nuclear reformation.
ACKNOWLEDGMENTS
We apologize for the few citations owing to space limitations, and we thank Drs. M. Mayr and A. Rothballer for critical reading and the SNSF and ERC for funding.

REFERENCES
Anderson, D.J., and Hetzer, M.W. (2007). Nat. Cell Biol. 9, 1160-1166.
Anderson, D.J., and Hetzer, M.W. (2008). J. Cell Biol. 182, 911-924.
Anderson, D.J., Vargas, J.D., Hsiao, J.P., and Hetzer, M.W. (2009). J. Cell Biol. 186, 183-191.
Antonin, W., Franz, C., Haselmann, U., Antony, C., and Mattaj, I.W. (2005). Mol. Cell 17, 83-92.
Asencio, C., Davidson, I.F., Santarella-Mellwig, R., Ly-Hartig, T.B., Mall, M., Wallenfang, M.R., Mattaj, I.W., and Gorjánácz, M. (2012). Cell 150, 122-135.
Doucet, C.M., Talamas, J.A., and Hetzer, M.W. (2010). Cell 141, 1030-1041.
Dultz, E., and Ellenberg, J. (2010). J. Cell Biol. 191, 15-22.
Fichtman, B., Ramos, C., Rasala, B., Harel, A., and Forbes, D.J. (2010). Mol. Biol. Cell 21, 4197-4211.
Gorjánácz, M., Klerkx, E.P., Galy, V., Santarella, R., López-Iglesias, C., Askjaer, P., and Mattaj, I.W. (2007). EMBO J. 26, 132-143.
Güttinger, S., Laurell, E., and Kutay, U. (2009). Nat. Rev. Mol. Cell Biol. 10, 178-191.
Harel, A., Orjalo, A.V., Vincent, T., Lachish-Zalait, A., Vasu, S., Shah, S., Zimmerman, E., Elbaum, M., and Forbes, D.J. (2003). Mol. Cell 11, 853-864.
Lu, L., Ladinsky, M.S., and Kirchhausen, T. (2009). Mol. Biol. Cell 20, 3471-3480.
Lu, L., Ladinsky, M.S., and Kirchhausen, T. (2011). J. Cell Biol. 194, 425-440.
Macaulay, C., and Forbes, D.J. (1996). J. Cell Biol. 132, 5-20.
Puhka, M., Vihinen, H., Joensuu, M., and Jokitalo, E. (2007). J. Cell Biol. 179, 895-909.
Puhka, M., Joensuu, M., Vihinen, H., Belevich, I., and Jokitalo, E. (2012). Mol. Biol. Cell 23, 2424-2432.
Rotem, A., Gruber, R., Shorer, H., Shaulov, L., Klein, E., and Harel, A. (2009). Mol. Biol. Cell 20, 4031-4042.
Shaulov, L., Gruber, R., Cohen, I., and Harel, A. (2011). J. Cell Sci. 124, 3822-3834.
Talamas, J.A., and Hetzer, M.W. (2011). J. Cell Biol. 194, 27-37.
Vagnarelli, P., Ribeiro, S., Sennels, L., Sanchez-Pulido, L., de Lima Alves, F., Verheyen, T., Kelly, D.A., Ponting, C.P., Rappsilber, J., and Earnshaw, W.C. (2011). Dev. Cell 21, 328-342.
Vollmer, B., Schooley, A., Sachdev, R., Eisenhardt, N., Schneider, A.M., Sieverding, C., Madlung, J., Gerken, U., Macek, B., and Antonin, W. (2012). EMBO J. 31, 4072-4084.
Walther, T.C., Alves, A., Pickersgill, H., Loïodice, I., Hetzer, M., Galy, V., Hülsmann, B.B., Köcher, T., Wilm, M., Allen, T., et al. (2003). Cell 113, 195-206.
Ye, Q., Callebaut, I., Pezhman, A., Courvalin, J.C., and Worman, H.J. (1997). J. Biol. Chem. 272, 14983-14989.


Primer
Criteria for Inference of Chromothripsis in Cancer Genomes
Jan O. Korbel1,* and Peter J. Campbell2,3,4,*
1Genome Biology Unit, European Molecular Biology Laboratory, 69117 Heidelberg, Germany
2Cancer Genome Project, Wellcome Trust Sanger Institute, Hinxton CB10 1SA, UK
3Department of Haematology, Addenbrooke's Hospital, Cambridge CB2 0QQ, UK
4Department of Haematology, University of Cambridge, Cambridge CB2 2XY, UK
*Correspondence: jan.korbel@embl.de (J.O.K.), pc8@sanger.ac.uk (P.J.C.)
http://dx.doi.org/10.1016/j.cell.2013.02.023


Chromothripsis scars the genome when localized chromosome shattering and repair occur in a one-off catastrophe. Outcomes of this process are detectable as massive DNA rearrangements affecting one or a few chromosomes. Although recent findings suggest a crucial role of chromothripsis in cancer development, the reproducible inference of this process remains challenging, requiring that cataclysmic one-off rearrangements be distinguished from localized lesions that occur progressively. We describe conceptual criteria for the inference of chromothripsis, based on ruling out the alternative hypothesis that stepwise rearrangements occurred. Robust means of inference may facilitate in-depth studies on the impact of, and the mechanisms underlying, chromothripsis.
Introduction
Often described as a disease of the genome, cancer typically results from the acquisition of DNA alterations in somatic cells leading to activation of oncogenes and inactivation of tumor suppressor genes. As a result, cellular processes including cell-cycle control, apoptosis, and DNA repair are impaired, conferring a growth advantage to cells and fomenting tumorigenesis (Stratton et al., 2009). According to a long-standing presumption, a single genetic hit is typically insufficient for a cell to develop into cancer. Instead, several progressive (i.e., gradually acquired or stepwise) DNA alteration events are required, resulting in incremental development and progression of cancer (Knudson, 1971; Stratton et al., 2009). Recent cancer genome analyses, however, have revisited this presumption by suggesting an alternative process that involves massive de novo structural rearrangement formation in a one-step catastrophic genomic event termed chromothripsis (Stephens et al., 2011) ("chromo" from chromosome; "thripsis" for shattering into pieces; illustrated in Figures 1A and 1B).

A key feature of chromothripsis is the formation of tens to hundreds of locally clustered DNA rearrangements through a singular, cataclysmic (one-off) event, resulting in a large number of rearranged fragments (often tens to hundreds) interspersed with widespread losses of sequence fragments (Figure 1B). Occasionally, rearrangements resulting from chromothripsis can lead to the formation of small circular DNA molecules (double-minute chromosomes), which may subsequently become amplified if they harbor oncogenes (Rausch et al., 2012a; Stephens et al., 2011) (Figure 1B). As a result of the massive DNA alterations occurring, chromosomes affected by chromothripsis show a characteristic pattern of copy-number oscillations, whereby typically only two (or occasionally three) copy-number states are detectable along the chromosome in the context of a large number of rearrangements (Stephens et al., 2011). This pattern distinguishes chromothripsis from other punctuated equilibrium-like mechanisms in which one-off events precipitate multiple successive DNA rearrangements. An example of the latter is the breakage-fusion-bridge cycle (Figure 1C), in which one DNA double-strand break can result in further DNA alterations acquired with each subsequent cell cycle (Bignell et al., 2007; Rudolph et al., 2001). Such processes, although occurring in a short period of time, are conceptually different from chromothripsis because they are associated with DNA replication interspersed with progressive rearrangements, and thus copy-number states can vary extensively across the derivative chromosome.

Impact of Chromothripsis on Cancer Development and Progression
The DNA breakpoints resulting from chromothripsis frequently affect only one or a few chromosomes (Figure 2A). Spectral karyotyping and fluorescent in situ hybridization (FISH) experiments have further shown that only one of the two parental chromosomes (or haplotypes) is typically affected by chromothripsis (Stephens et al., 2011). DNA rearrangements arising through chromothripsis can lead to several simultaneous tumorigenic DNA alterations (Rausch et al., 2012a; Stephens et al., 2011) (illustrated in Figure 1A and Figure 2B). FISH experiments further showed rearrangement outcomes of chromothripsis to be detectable throughout practically all cells in a tumor and not solely in tumor subclones (e.g., Figure 2B), suggesting that chromothripsis occurs as a relatively early tumorigenic event (Rausch et al., 2012a; Stephens et al., 2011). Hence, chromothripsis is thought to contribute to, or even represent a driving force of, cancer development and progression.
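
The copy-number signature described in the Introduction (few copy-number states, many switches between them) can be illustrated with a minimal Python sketch. It is not a method taken from the studies cited here: the function names, the input format (one integer copy-number call per consecutive segment of a chromosome), and the cutoffs `max_states=3` and `min_switches=10` are assumptions chosen purely for illustration.

```python
from typing import Dict, Sequence


def summarize_copy_number(segments: Sequence[int]) -> Dict[str, int]:
    """Summarize copy-number calls for consecutive segments of one chromosome."""
    # Merge neighbouring segments that share the same copy number,
    # so that each remaining boundary is a true change of state.
    collapsed = [cn for i, cn in enumerate(segments)
                 if i == 0 or cn != segments[i - 1]]
    return {
        "n_states": len(set(collapsed)),            # distinct copy-number states
        "n_switches": max(len(collapsed) - 1, 0),   # changes between states
    }


def looks_oscillating(segments: Sequence[int],
                      max_states: int = 3,
                      min_switches: int = 10) -> bool:
    """Flag profiles that switch often between very few states.

    Both cutoffs are hypothetical illustration values, not published criteria.
    """
    summary = summarize_copy_number(segments)
    return (2 <= summary["n_states"] <= max_states
            and summary["n_switches"] >= min_switches)


# Toy profiles (invented data): oscillation between two states versus
# a profile whose copy number wanders across many states.
one_off = [2, 1] * 8 + [2]
stepwise = [2, 3, 5, 4, 7, 6, 9, 8, 3, 5, 2, 6]

print(summarize_copy_number(one_off))
print(looks_oscillating(one_off), looks_oscillating(stepwise))
```

A profile that passes such a check would only be a candidate: as the text notes, the oscillation pattern has to be read in the context of the number and clustering of rearrangements, which this sketch does not model.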

Figure 1. Cataclysmic DNA Rearrangement Processes


(A) Tumorigenesis is classically thought to involve the stepwise acquisition of somatic DNA driver alterations (dashed blue arrows). Cellular crises, such as chromothripsis, may accelerate this process by resulting in several DNA alterations at once (solid black arrows). The red color symbolizes the acquisition of malignant phenotypes in the cell (white = nonmalignant cell; red = aggressive/highly malignant cell).
(B) Chromothripsis, a cellular crisis altering chromosomes in a one-off burst thought to involve a single cell cycle (adapted from Stephens et al., 2011; Rausch et al., 2012a).
(C) The breakage-fusion-bridge cycle, a prototypic process (McClintock, 1941) involving chromosome end-to-end fusions that lead to clustered breakpoints but not to extensive copy-number state oscillations. This form of crisis typically involves several subsequent cell cycles. Though in the classical breakage-fusion-bridge cycle only a single DNA break is thought to occur in each cell-division cycle, it is hypothesized that chromosome end-to-end fusions may also lead to chromothripsis events (Stephens et al., 2011).

The characteristic signature of massive DNA rearrangements resulting from chromothripsis has been observed in 2%-3% of cancer samples (Stephens et al., 2011). Distinct malignancies display different rates of chromothripsis (reviewed in Jones and Jallepalli, 2012), and the outcomes of such one-off chromosomal crises have been reported in diverse cancer entities,

Figure 2. Appearances of Chromothripsis and Progressive DNA Rearrangements


(A) DNA rearrangement pattern of SNU-C1, a tetraploid colorectal cancer cell line, with >200 rearrangements on chromosome 15 associated with widespread DNA fragment loss (reproduced from Stephens et al., 2011). Oscillating copy-number profiles derived from SNP6 microarray data are depicted in the upper panel of points. Allelic ratios for each SNP, depicting segments with retained heterozygosity interspersed with LOH, are shown in the lower panel of dots. Homozygous SNPs cluster at allelic ratios near 0 or 1. Heterozygous SNPs cluster around 0.5. The structural rearrangement graph with intrachromosomal rearrangements of all four possible orientations is depicted as colored lines that connect DNA segments. The box to the right shows a zoomed-in version of the 15q region. Abundant regions with LOH indicate that chromothripsis preceded genome duplication in this cancer cell line.
(B) Chromothripsis in a primary Shh-driven pediatric medulloblastoma sample LFS-MB4 associated with the formation of a circular double-minute chromosome derived from chromosome 2 fragments (reproduced from Rausch et al., 2012a). The outermost rings in the illustrated circular plot depict chromosome coordinates and annotated genes with known oncogenes shown in red. FISH analysis verified the colocalization of the synchronously amplified MYCN (red) and GLI2 (green) oncogenes in the chromothripsis-associated, amplified double-minute chromosomes, and demonstrated their presence throughout virtually all tumor cells (reproduced from Rausch et al., 2012a).

including bone cancer, pediatric medulloblastoma, neuroblastoma, colorectal cancer, melanoma, and hematological malignancies (Hirsch et al., 2012; Kloosterman et al., 2011b; Magrangeas et al., 2011; Molenaar et al., 2012; Northcott et al., 2012;
1228 Cell 152, March 14, 2013 2013 Elsevier Inc.

Rausch et al., 2012a; Stephens et al., 2011). Furthermore, chromothripsis has been associated with poor patient survival in several cancers (Hirsch et al., 2012; Magrangeas et al., 2011; Molenaar et al., 2012; Rausch et al., 2012a), indicating its

potential relevance as a prognostic marker, and suggesting chromothripsis as a feature of some particularly aggressive forms of cancer. In sonic hedgehog (Shh)-driven medulloblastoma, chromothripsis has been linked with predisposing (germline) mutations in the gene encoding the p53 tumor suppressor (TP53) (Rausch et al., 2012a), and in group-3-subtype medulloblastoma and acute myeloid leukemia with somatic DNA alterations of TP53 (Northcott et al., 2012; Rausch et al., 2012a). Hence, chromothripsis appears to be prone to occur in specic contextsi.e., in conjunction with, or even instigated by, progressively acquired DNA alterations. Mechanisms Hypothesized to be Involved in Chromothripsis Although we can nd evidence of these cataclysmic events in genomes, the mechanisms that give rise to them are still being worked out. Computational analyses of breakpoint junction sequences performed at nucleotide resolution have provided initial clues on the mechanism for rejoining the shattered DNA fragments. Abundant 24 nt long repeating sequences (i.e., observed microhomology) at the respective rearrangement breakpoints (Stephens et al., 2011) are consistent with the repair of shattered DNA fragments by nonhomologous end-joining (NHEJ). Simulation-based computational analyses, described in more detail below, have further provided compelling evidence that the complex chromosome aberrations resulting from chromothripsis result from singular, catastrophic DNA rearrangement event (Rausch et al., 2012a; Stephens et al., 2011). Several hypothetical mechanisms have been proposed to lead to the massive DNA rearrangements observed in conjunction with chromothripsis (recently reviewed in Forment et al., 2012; Jones and Jallepalli, 2012; Maher and Wilson, 2012). 
Most proposed mechanisms assume that chromothripsis acts on condensed chromosomes in association with mitosis, which may explain the highly localized nature of DNA breakpoints on a single (or few) chromosomes (Stephens et al., 2011), although localized DNA shattering could also occur in the context of the regular spatial organization of interphase chromosomes (Lichter et al., 1988; Rausch et al., 2012a). In brief, the following mechanistic hypotheses have been presented and discussed: ionizing radiation acting upon condensed chromosomes (Stephens et al., 2011); critical telomere shortening followed by chromosome end-to-end fusions and subsequent massive DNA breakage (Stephens et al., 2011); abortive apoptosis events (Tubio and Estivill, 2011); premature chromosome compaction, in which chromosomes condense before completing DNA replication and may consequently shatter (Johnson and Rao, 1970; Meyerson and Pellman, 2011); and DNA damage associated with the packaging of mitotically delayed chromosomes into separate cellular compartments known as micronuclei (Crasta et al., 2012). In this regard, a particularly relevant observation made by Crasta and coworkers is that of DNA fragmentation affecting isolated chromosomes packaged into micronuclei, which addresses the conceptual problem of how highly localized DNA shattering, in the context of chromothripsis, might be achieved at the molecular level. Beyond reports of chromothripsis in many cancers, there is evidence that a similar (or perhaps identical) process may act

upon germline DNA, resulting in constitutional disorders (Chiang et al., 2012; Kloosterman et al., 2011a; Liu et al., 2011). Nucleotide resolution analyses of the DNA breakpoint junctions of constitutional chromothripsis events revealed the presence of microhomology compatible with NHEJ in some patients (Kloosterman et al., 2011a; Kloosterman et al., 2012). In others, sequence-based evidence for replication-associated structural rearrangements involving the proposed microhomology-mediated break-induced replication (MMBIR) mechanism was reported (Liu et al., 2011), with MMBIR thought to be frequently associated with duplication events and with the insertion of short DNA-template-derived sequences (i.e., templated insertions) at the respective breakpoint junctions. The frequent association of chromothripsis in cancer with sequence loss (Stephens et al., 2011), rather than with duplication, and the lack of template-derived insertions at the respective DNA breakpoints in medulloblastoma (Rausch et al., 2012a) suggest that chromothripsis events in cancer may differ mechanistically from constitutional chromothripsis events. As is the case for chromothripsis in cancer, the molecular mechanism driving constitutional chromothripsis has not yet been experimentally elucidated.

Challenges in the Assessment of Chromothripsis in Cancer Genomes
Accurate inference of chromothripsis is crucial for further characterization of the underlying molecular process. However, the genomic signature left by other processes can resemble that of chromothripsis, potentially resulting in misclassification of chromothripsis events that may hamper research on the mechanistic basis of chromothripsis and impede attempts to exploit chromothripsis as a biomarker for disease prognosis. To robustly and reproducibly identify DNA rearrangements arising from chromothripsis, those alterations underlying a one-off event must be distinguished from DNA alterations occurring in a stepwise manner.
Different operational definitions have been applied for inferring chromothripsis in microarray-based copy-number profiling data. These operational definitions have been geared toward recognizing oscillating copy-number profiles, by requiring the detection of at least 10, 20, or 50 copy-number alterations (i.e., identifiable shifts in the copy-number profile) on a particular chromosome, with these alterations oscillating between only two or three copy-number states (Hirsch et al., 2012; Jones et al., 2012; Magrangeas et al., 2011; Molenaar et al., 2012; Northcott et al., 2012; Rausch et al., 2012a; Stephens et al., 2011). In addition to requiring a fixed number of copy-number alterations (such as 50) as a threshold, the number of DNA breakpoints associated with oscillating copy-number alterations has been put in relation to the total number of breakpoints on a chromosome to define a threshold for inferring chromothripsis in microarray data (Kim et al., 2013). Marked differences in the spatial distribution, number, and types of somatically acquired DNA rearrangements observed between cancer entities (Yates and Campbell, 2012), however, limit the utility of a defined threshold in terms of identified copy-number alterations for ascertaining chromothripsis. Specifically, cancers displaying pronounced genomic instability, such as ovarian cancer (Cancer Genome Atlas Research
Cell 152, March 14, 2013 2013 Elsevier Inc. 1229

Figure 3. Amalgam of DNA Rearrangements in a Cancer Genome from an Ovarian Cancer Patient
The large number and diversity of DNA rearrangements detectable in this cancer genome highlight the necessity to use rigorous statistics for distinguishing chromothripsis events from progressive DNA alterations. Ovarian cancers show widespread DNA copy-number alterations throughout the genome, most of which involve progressive rearrangements (depicted by light blue arrows). Although chromosome 7 may potentially have undergone chromothripsis (purple arrow), the large genome-wide number of alterations limits the utility of operational definitions for inference, hence calling for rigorous statistical testing. This cancer genome copy-number alteration profile was determined using microarrays (Cancer Genome Atlas Research Network, 2011). Array data were reanalyzed with Nexus 6v10 (Biodiscovery) copy-number software, as described in Rausch et al., 2012a. Scales corresponding to array log2 ratios of +1 (gain) and −1 (loss) are indicated beneath the axis corresponding to the X chromosome.

Network, 2011), can harbor such a high number of progressively acquired somatic DNA alterations per chromosome (Figure 3) that, based on operational definitions, those cancers may mistakenly be suspected to have undergone chromothripsis. Additionally, accumulations of DNA alterations on the same chromosome can be achieved by multistep processes, rather than one-off events, e.g., through successive breakage-fusion-bridge cycles (Bignell et al., 2007; Rudolph et al., 2001) or through consecutive deletions that originate from fragile sites or are driven by positive selection (Bignell et al., 2010). Thus, although operational definitions can facilitate the screening for chromothripsis in microarray copy-number profiling data, from which copy-number state information but not the relative order or orientation of rearrangements can be reconstructed, their utility is noticeably limited, and because operational definitions are prone to subjectivity, they can interfere with reproducibility.

Criteria for Statistical Assessment of Chromothripsis
A more robust and accurate distinction between DNA rearrangements arising from chromothripsis and those occurring in a stepwise fashion can be achieved by applying criteria that enable rigorous statistical evaluation of cancer genome

sequencing data (Rausch et al., 2012a; Stephens et al., 2011). The aim of these criteria is to evaluate the model that a particular set of DNA rearrangements resulted from stepwise somatic DNA alterations as compared to the alternative model that the rearrangements arose through a single catastrophic event (i.e., chromothripsis). The following sections outline the rationale behind several different criteria, each of which can facilitate the statistical inference of chromothripsis, allowing for more reproducible and accurate ascertainment of chromothripsis than otherwise possible using solely operational definitions. Most of these criteria take into account the entire set of structural rearrangements that have occurred on a chromosome in question, including the relative order and orientation of rearranged segments, which are typically detected using whole-genome paired-end DNA sequencing data, and which can be represented in the form of a structural rearrangement graph (Figure 2A and Box 1).

Clustering of Breakpoints
DNA breakpoints occurring in conjunction with chromothripsis typically show pronounced clustering (depicted in Figure 4A). Often, 5–10 breaks can be observed within 50 kb, followed by

Box 1. Construction of DNA Structural Rearrangement Graphs
A crucial prerequisite for the inference of chromothripsis is the accurate mapping of somatically acquired DNA structural rearrangements in samples of interest to obtain a structural rearrangement graph, which represents the set of somatic rearrangements that occurred on a chromosome, comprising copy-number state information and data on the relative order and orientation of segments subsequent to rearrangement (see e.g., Figure 2A). Accurate structural rearrangement graphs can be obtained using sequence variant discovery approaches in massively parallel DNA sequencing data. These approaches include paired-end mapping, which is based on sequencing the ends of size-selected DNA fragments, and detecting DNA rearrangements by identifying paired ends that map abnormally onto the human reference assembly (Campbell et al., 2008; Korbel et al., 2007; Mills et al., 2011). Deletion-type rearrangements (tail-to-head) are inferred based on the abnormal distance of mapped ends, tandem duplication-type (head-to-tail) alterations based on their abnormal relative mapping order, and inversion-type alterations (head-to-head or tail-to-tail) based on their abnormal relative mapping orientation. The sensitivity of paired-end mapping for detecting DNA alterations is improved when DNA sequencing libraries with different library insert sizes are used (Mills et al., 2011; Rausch et al., 2012a). Read-depth analysis (Campbell et al., 2008; Chiang et al., 2009), an approach based on identifying copy-number alterations by analyzing the DNA read depth of coverage, can also be used to discover structural rearrangements and to infer the copy-number status of segments. Split-read (or clipped-read) analysis, which is based on evaluating gapped read alignments onto the human reference genome assembly, enables the fine-mapping of DNA rearrangement breakpoints (Rausch et al., 2012b; Wang et al., 2011; Ye et al., 2009).
In theory, DNA sequence assembly can further improve the detection of structural rearrangement events, although recent analyses suggest that assembly using short DNA read data displays low sensitivity compared to the aforementioned sequence variant discovery approaches (Mills et al., 2011). Data from several of these rearrangement discovery approaches are typically combined to describe the somatic DNA structural rearrangement graph. This graph serves as the starting point for the described criteria for inferring chromothripsis.
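The paired-end mapping rules in Box 1 can be illustrated with a minimal Python sketch. It assumes a standard forward-reverse paired-end library, both ends mapped to the same chromosome, and a hypothetical insert-size cutoff (`max_insert`); the names `ReadPair` and `classify_pair` are illustrative and do not come from any published tool.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ReadPair:
    pos1: int      # mapped coordinate of the first end
    strand1: str   # '+' or '-'
    pos2: int      # mapped coordinate of the second end
    strand2: str   # '+' or '-'


def classify_pair(pair, max_insert=500):
    """Classify one read pair by distance, order, and orientation.

    Assumes a forward-reverse library: a concordant pair has its
    leftmost end on '+' and its rightmost end on '-'.
    """
    # Order the two ends by reference coordinate.
    (left_pos, left_strand), (right_pos, right_strand) = sorted(
        [(pair.pos1, pair.strand1), (pair.pos2, pair.strand2)]
    )
    span = right_pos - left_pos
    if left_strand == right_strand:
        # Abnormal relative orientation: both ends on the same strand.
        return "inversion-type"
    if left_strand == "+" and right_strand == "-":
        # Normal orientation; only an abnormal distance is suspicious.
        return "deletion-type" if span > max_insert else "concordant"
    # Leftmost end on '-', rightmost on '+': abnormal relative order.
    return "tandem-duplication-type"
```

In practice, calls of this kind are made from clusters of supporting read pairs rather than from single pairs, and the insert-size cutoff would be derived from the observed library distribution.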

consideration for rigorous statistical evaluation of breakpoint clustering).

Regularity of Oscillating Copy-Number States
The aforementioned oscillating behavior of copy-number states resulting from chromothripsis (e.g., as evident from the chromothripsis example shown in Figure 2A) can be evaluated rigorously, as illustrated in Figure 4B, by simulating a gradual process in which each of the structural rearrangements detected on a chromosome, according to the rearrangement graph, is introduced onto an in silico (modeled) chromosome one after another (Rausch et al., 2012a; Stephens et al., 2011). By introducing these rearrangements in a stepwise fashion using Monte Carlo simulations, we can assess the ability of the progressive rearrangement null model to reproduce the regular (oscillating) nature of copy-number state switches characteristic of chromothripsis. Support for chromothripsis is obtained in cases where the null model is ruled out based on these simulations.

Prevalence of Regions with Interspersed Loss and Retention of Heterozygosity
Chromothripsis frequently leads to massive loss of segments on the affected chromosome, with segmental losses being interspersed with regions displaying normal (disomic) copy-number (e.g., copy-number states oscillating between copy-number = 1 and copy-number = 2). Although monosomic regions have evidently lost heterozygosity, the key feature of chromothripsis is that the segments in the higher (disomic) copy-number state have retained heterozygosity (Stephens et al., 2011). The result is a highly regular (oscillating) pattern of segments with retained heterozygosity interspersed with loss-of-heterozygosity (LOH) (see Figure 2A, and illustration in Figure 4C). Once lost to the cell through deletion, heterozygosity cannot be regained.
Hence, in the presence of an abundance of copy-number states oscillating between the states 1 and 2, perfect concordance between disomic regions and heterozygous regions will be unlikely in the event of gradually acquired rearrangements (Figure 4C). A simulation in which rearrangements are randomly and sequentially drawn from the available structural rearrangement graph (Box 1) can be employed to assess this concordance and hence to evaluate the hypothesis that DNA rearrangements were gradually acquired. It is worth noting that if chromothripsis occurs in the context of polyploidy, the lower copy-number state may not display LOH, but instead may reflect the resulting allelic contribution in lost genomic segments (e.g., alternating between allelic ratios of 1:1 and 2:1, if genome duplication precedes chromothripsis). Nonetheless, in the case of chromothripsis, the resulting allelic ratios will oscillate between segments that are lost and retained, and evaluation of concordance of this oscillating behavior with the segmental copy-number state changes can hence facilitate the discrimination of one-off from progressive rearrangements in the context of polyploidy.

Prevalence of Rearrangements Affecting a Single Haplotype
When chromothripsis occurs, fragments resulting from chromosomal DNA shattering typically originate from a single parental

long tracts of intact chromosomal sequence. Breakpoints can be confined to individual chromosome arms, with the clustering presumably resulting from whatever process drives the chromosome fragmentation (Stephens et al., 2011). Thus, an analysis of breakpoint clustering can be used as a means to obtain evidence for chromothripsis (Rausch et al., 2012a; Stephens et al., 2011), as outlined in Box 2. Under a progressive rearrangements model, tendencies of breakpoints to cluster substantially imply a memory of previous rearrangements from one cell division to the next. Although less pronounced than in chromothripsis, local accumulation of breakpoints can be observed in progressive rearrangement scenarios, where it may be driven by either chromosomal fragility or selection for particular genes within a chromosomal region (Campbell et al., 2010). As a consequence, under progressive rearrangement scenarios, breakpoint clustering tends to be recurrent across patients because both the locations of cancer genes and fragile sites represent intrinsic features of the human genome (a priori information that can be taken into

Figure 4. Criteria for the Inference of Chromothripsis


(A) Breakpoint clustering can yield evidence for chromothripsis (left), as stepwise alterations (right) do not typically lead to a similar level of clustering of DNA breaks. Curved colored lines depict individual rearrangements. (B) Oscillating copy-number profiles. The left panel depicts a particular set of rearrangements resulting in oscillating copy number, indicative of chromothripsis. The null hypothesis of stepwise alterations can be rejected if simulations making use of all rearrangements depicted in the rearrangement graph fail to result in oscillations involving so few (in this case two) copy-number states. This is illustrated in the right panel, where the copy-number profile displays four different states. (C) Interspersed regions with loss and retention of heterozygosity often result from chromothripsis (left) and can be used for statistical testing, as in the presence of stepwise alterations (right) such regularity of patterns is unlikely to occur. (D) Chromothripsis-associated rearrangements are typically detectable on a single parental copy (haplotype) of affected chromosomes (referred to as H1 in the left), whereas stepwise alterations do not typically show such preference. (E) Because fragments are randomly joined following DNA shattering (left), it follows that the relative order of rearranged fragments and the type of fragment joins should be uniformly distributed. By comparison, clustered stepwise alterations often show biases toward certain rearrangement forms (right), and are thus not expected to result in such uniform joining and ordering of segments. (F) In a region of chromothripsis, each fragment is either retained in or lost from the derivative chromosome, enabling an unambiguous walk through the rearrangements created. As a result, when viewed on the reference genome, adjacent reads demarcating breakpoints inferred by paired-end mapping show perfect alternations between head (h) and tail (t) paired-end reads (left).
In contrast, most progressive DNA alteration scenarios that result in nested rearrangements (right) do not have this property.

chromosome (or haplotype). Considering that DNA rearrangements can be associated with a specific haplotype using phasing (Box 3), the extent to which rearrangements are biased toward a single haplotype, rather than occurring on both haplotypes (assuming disomy), can be used to obtain further evidence for chromothripsis (Figure 4D). Under the assumption that progressive rearrangements affect each haplotype randomly, a statistical test can provide evidence for chromothripsis by defining the extent to which rearrangements are concentrated on a single haplotype, for example, by using the Poisson assumption that in the presence of progressive rearrangements structural rearrangements occur on both haplotypes (null hypothesis), rather than only on a single one.
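As a rough illustration of such a haplotype-bias test, the Python sketch below substitutes an exact two-sided binomial test for the Poisson formulation mentioned in the text. It assumes disomy, that every rearrangement has been phased to one of two haplotypes, and that under the progressive null model each rearrangement falls on either haplotype with probability 1/2. The function name `haplotype_bias_pvalue` is hypothetical.

```python
from math import comb


def haplotype_bias_pvalue(n_h1, n_h2):
    """Exact two-sided binomial p-value for haplotype bias.

    n_h1, n_h2: counts of phased rearrangements on haplotypes H1 and H2.
    Null model (an assumption of this sketch): each rearrangement lands
    on H1 or H2 independently with probability 1/2.
    """
    n = n_h1 + n_h2
    if n == 0:
        return 1.0  # nothing phased, no evidence either way
    m = max(n_h1, n_h2)
    # P(X >= m) for X ~ Binomial(n, 1/2), doubled by symmetry.
    upper_tail = sum(comb(n, k) for k in range(m, n + 1)) / 2 ** n
    return min(1.0, 2 * upper_tail)
```

A small p-value indicates that rearrangements are more concentrated on one haplotype than expected under the progressive null, which is the pattern predicted for chromothripsis. Selection acting on one haplotype could produce the same signal.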

Selection for particular genes within a chromosomal region can bias progressive rearrangements to occur preferentially on only one rather than on both haplotypes. However, the rearranged genomic regions would be recurrent across patients if driven by selection, providing a possible rationale to account for such a potentially confounding factor.

Randomness of DNA Fragment Joins
The assumption underlying the chromothripsis theory is that the chromosome fragments are randomly stitched together (joined), involving a DNA double-strand repair process. The implication is that at each join, the orientation of the two DNA fragment ends should be random (illustrated in Figure 4E), in

Box 2. Outline of Statistical Algorithms for Inferring Chromothripsis
Guidelines for evaluating the following four criteria are outlined:
1. Clustering of breakpoints: Let $\{x_1, x_2, \ldots, x_n\}$ be the set of breakpoint locations on a given chromosome, ordered from the lowest to the highest (as positioned on the reference genome). The null model of random breakpoint locations implies that the distances between adjacent breakpoints, $\{x_2 - x_1, x_3 - x_2, \ldots, x_n - x_{n-1}\}$, should be distributed according to an exponential distribution with mean $\sum_{i=1}^{n-1} (x_{i+1} - x_i)/(n-1)$, which can be readily evaluated using a goodness-of-fit test. In our experience, chromothripsis is typically associated with a strong departure from this null distribution, although some situations of progressive rearrangements (e.g., rearrangements arising through successive breakage-fusion-bridge cycles; Figure 1C) are too.
2. Randomness of DNA fragment joins: Let $\{r_{Del}, r_{TD}, r_{H2H}, r_{T2T}\}$ be the counts of observed rearrangements that have a deletion-type, tandem duplication-type, head-to-head-inverted, and tail-to-tail-inverted orientation, respectively. If more than one chromosome is involved, then interchromosomal rearrangements can be interpreted in the same four categories using orientation of the strands at the breakpoint. Then, in a region of chromothripsis, we would expect these counts to be distributed as a multinomial distribution with parameters $n = \sum_i r_i$ and probability $p_i = 1/4$. A departure from this distribution can be employed as evidence against the rearrangements arising from a chromothripsis process.
3. Randomness of DNA fragment order: In a chromothripsis event, the presumption is that the original position of a fragment on the reference genome carries no information about the origins of the fragments it is joined to at either end. To test this, let $\{x_1, x_2, \ldots, x_n\}$ be the set of breakpoint locations, ordered from the lowest to the highest (as positioned on the reference genome). Each observed rearrangement consists of two DNA breaks joined together and can be denoted as $\{I_1, I_2\}$, where $I$ refers to the index of the ordered breakpoints $\{x_I\}$. Under the chromothripsis model, the paired indices should be random draws without replacement from $\{1, 2, \ldots, n\}$. There are suites of tools available for statistically assessing randomness that could be adapted here. One possibility, for example, would be to calculate the mean of $\{|I_2 - I_1|\}$ and compare this to 1,000 Monte Carlo simulations. When we have tested this in practice, the fragment order from a chromothripsis process is not entirely random, implying some spatial structure to the DNA repair process, but considerably more random than most progressive rearrangement scenarios.
4. Ability to walk the derivative chromosome: As can be seen in Figure 4F, the ability to walk the derivative chromosome implies that adjacent DNA reads demarcating breakpoints inferred by paired-end mapping (Box 1) must alternate between the head of the paired-end fragment and the tail of that fragment. Let $\{x_1, x_2, \ldots, x_n\}$ be the set of breakpoint locations, ordered from the lowest to the highest (as positioned on the reference genome), and let $\{s_1, s_2, \ldots, s_n\}$ be the paired-end DNA read (head or tail) associated with each of these breakpoints. In a region of chromothripsis, if all rearrangements were observed, $\{s_1, s_2, \ldots, s_n\}$ would be a perfect alternating sequence of heads and tails when ordered along the reference genome assembly (Figure 4F). Because some rearrangements are likely to be missed in the sequencing, the problem is one of whether there are longer runs of alternating heads and tails than expected by chance. This circumstance could be assessed by adapting the Wald-Wolfowitz test for runs. Note that some progressive rearrangement processes could give similar runs, such as a series of deletions on a given chromosome, but that many processes, especially those associated with amplification, will not.

Box 3. Application of Haplotype Phasing to Improve Structural Rearrangement Analysis
A normal human genome is diploid (2n), and cancer genomes can display different karyotype configurations (e.g., tetraploidy, 4n). According to the theory of chromothripsis, structural rearrangements arising should normally display a bias toward occurring on a single chromosome homolog (i.e., haplotype), rather than on both haplotypes for disomic karyotypes (or all four in the case of 4n). Hence, the ability to relate rearrangements to a specific haplotype would allow inferring chromothripsis events with increased power. Using short read DNA sequencing, haplotype phases of 300–400 kb could be used to monitor whether adjacent DNA breakpoints arose on a single DNA molecule, using the 1000 Genomes Project integrated haplotype reference panel (1000 Genomes Project Consortium et al., 2012) in conjunction with computational approaches based on imputation (Browning and Browning, 2011). Chromosome-wide phasing data can be obtained when germline whole-genomic sequencing data from both parents or somatic genome sequencing data from aneuploid secondary tumors (which are common in the context of hereditary disorders such as Li-Fraumeni syndrome; Li and Fraumeni, 1969) are available for a patient sample in question.

analogy to a pearl necklace that after being disrupted is put together, with the pearls added to the chain in random order and orientation. To evaluate rearrangement patterns for chromothripsis, the uniformity of orientation of joined DNA fragments can be inferred by interpreting the structural rearrangement graph (Box 1); that is, the numbers of tail-to-head (deletion-type), head-to-tail (tandem-duplication-type), head-to-head, and tail-to-tail (inversion-type) rearrangements observed should be broadly equal (Box 2). This criterion applies whether the rearrangements are intrachromosomal or interchromosomal. In contrast, for many other types of clustered rearrangements, this property does not apply. For example, in regions affected by recurrent breakage-fusion-bridge cycles, there will be a predominance of head-to-head and tail-to-tail inverted rearrangements, whereas for chromosomal fragile sites, deletions tend to dominate among the spectrum of rearrangements (Campbell et al., 2010).

Randomness of DNA Fragment Order
Because chromosome fragments are randomly joined, their relative order, namely their position on the derivative chromosome, also should be approximately random (Figure 4E), provided that there is no preference for joining particular ends together, such as maintaining a centromere or telomere. Hence, as an extension to the criterion to evaluate the randomness of DNA fragment orientation, an assessment of the randomness of DNA fragment order can be used to obtain further evidence for chromothripsis (Box 2). This criterion applies to both intrachromosomal and interchromosomal rearrangements.

Walking the Derivative Chromosome
If all DNA rearrangements in a region with chromothripsis are detectable, it should be possible to reconstruct the relative order in which segments are joined based on the structural rearrangement graph. Computational approaches for piecing together

such digital karyotypes are being developed (Greenman et al., 2012). For our purposes here, the chromothripsis model means that each DNA segment included in the derivative chromosome resulting from chromothripsis has consistent orientation (namely it has a head at one end and a tail at the other). The derivative chromosome then forms a single, coherent chain of segments with the constraint that either end of each segment must have consistent configuration. Each DNA segment retained in the derivative chromosome must be demarcated at either end by genomic rearrangements: when viewed from the perspective of the reference genome, each separate segment will start with a rearrangement from the head of the segment and finish with a rearrangement from the tail of the segment. This constraint will lead to an alternating head/tail sequence of DNA rearrangements, detected by paired-end mapping (Box 1), when paired ends demarcating breakpoints are represented along the reference genome (see illustration in Figure 4F). Importantly, this organization of alternating heads and tails need not be the case under the alternative model because when the rearrangements occur sequentially, some segments can be reused in the derivative chromosome. This would generally break the perfectly alternating head/tail series. Consider, for example, two tandem duplications, one nested entirely within the other. There are four breakpoints from two DNA rearrangements. The two breakpoints at the lowest genomic reference coordinates are both demarcated by tails (see Figure 4F, right). This would break the alternating head/tail sequence and would be inconsistent with chromothripsis. As described in Box 2, it is relatively straightforward to test for consistency with this criterion.
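A minimal Python sketch of the head/tail alternation check follows. It applies the standard Wald-Wolfowitz runs test with a normal approximation, one-sided for an excess of runs (that is, more alternation than chance would give), rather than the adapted version that Box 2 alludes to for incompletely observed rearrangements. The function name `runs_test_alternation` and the `"h"`/`"t"` encoding are assumptions of this sketch.

```python
import math


def runs_test_alternation(labels):
    """Wald-Wolfowitz runs test on breakpoint labels ordered by coordinate.

    labels: string or list containing only 'h' (head) and 't' (tail).
    Returns (runs, z, p), where p is the one-sided p-value for observing
    at least this many runs under random ordering of the labels.
    """
    n1 = labels.count("h")
    n2 = labels.count("t")
    n = n1 + n2
    if n1 == 0 or n2 == 0 or n < 3:
        raise ValueError("need both label types and at least three labels")
    # A new run starts wherever adjacent labels differ.
    runs = 1 + sum(a != b for a, b in zip(labels, labels[1:]))
    mean = 2 * n1 * n2 / n + 1
    var = 2 * n1 * n2 * (2 * n1 * n2 - n) / (n ** 2 * (n - 1))
    z = (runs - mean) / math.sqrt(var)
    p = 0.5 * math.erfc(z / math.sqrt(2))  # upper tail of the normal
    return runs, z, p
```

A perfectly alternating sequence such as `"htht..."` yields the maximum number of runs and a small p-value, whereas the nested tandem duplication example from the text (`"tthh"`) yields only two runs. The normal approximation is crude for short sequences, so an exact or permutation-based version would be safer when few breakpoints are available.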
Summary and Outlook
In this primer, we describe the characterization of chromothripsis within a genome as a statistical question geared toward discriminating rearrangements resulting from chromothripsis from those that result from subsequent stepwise DNA alterations. Approaches for inferring the presence of chromothripsis in genomes harboring appreciable levels of gradually acquired alterations can be viewed as conceptually similar to detecting driver alterations among the tumult of passenger mutations and structural abnormalities typically observed in a cancer genome. In this analogy, the discrimination of driver from passenger alterations in studies focusing on generating cancer gene catalogs benefits from statistical approaches for rejecting the hypothesis that an event corresponds to a stochastically occurring, inconsequential, passenger alteration (Dees et al., 2012). Not all of the aforementioned criteria for inferring chromothripsis can be applied to each cancer sample. In cancers harboring extreme levels of genomic instability, the characteristic stamp of chromothripsis may be hidden behind the mass of stepwise alterations in such a way that it may not be confidently detectable with the approaches described here. Tumor heterogeneity and ploidy may affect the inference of chromothripsis. Heterogeneity, as a confounding factor, can be partially dealt with by focusing analyses on those DNA alterations that affected the same subset of cells based on haplotype-specific analyses of subclonal alterations (e.g., using approaches described in Nik-Zainal et al., 2012). Exome sequencing or microarray-based copy-number profiles cannot be used to infer order and orientation of rearranged segments, limiting criteria that can be used for inferring chromothripsis to the evaluation of breakpoint clustering, or to operational definitions (such as the enumeration of copy-number state changes). Even the most widely used massively parallel DNA sequencing techniques have remaining limitations, with short DNA reads (≤150 nt) and the most commonly used paired-end library (Box 1) insert sizes (<400 bp) remaining ineffective for ascertaining sequence variation in highly repetitive DNA (Onishi-Seebacher and Korbel, 2011). This technological constraint inevitably limits analyses to mappable genomic regions, which have been estimated to comprise 90% of the human reference assembly (1000 Genomes Project Consortium et al., 2012). Hence, the available data may in some cases not be sufficient to infer chromothripsis reliably, in which case the criteria we describe may be biased toward presuming that progressive DNA rearrangements occurred. Despite these challenges, the criteria described here will enable researchers to ascertain chromothripsis in cancer genomes in a more rigorous and reliable fashion than is feasible on the basis of operational definitions. We recommend assessment of each of the criteria we described on cancer samples harboring rearrangements that can be clearly attributed to chromothripsis as well as on samples harboring DNA alterations that undoubtedly underlie a stepwise process, because this will facilitate identifying optimal parameters for discriminating one-off from progressive alterations, which may depend on sequencing depth and protocol used.
With massively parallel DNA sequencing technology increasingly prevailing over microarray-based approaches for cancer genome analysis, we propose that future studies should verify the occurrence of chromothripsis by using sequencing data, and by demonstrating the applicability of different (e.g., at least two) criteria as minimal evidence for discriminating stepwise from one-off events. We foresee that using robust, reproducible criteria for classification, future research will reveal electrifying insights into the functional consequences and mechanistic basis of chromothripsis.
ACKNOWLEDGMENTS
J.O.K. acknowledges funding from the European Commission (Health-F2-2010-260791). P.J.C. is a Wellcome Trust Senior Clinical Fellow. We thank Tobias Rausch, Stephanie Sungalee, Balca Mardin, Joachim Weischenfeldt, Christopher Buccitelli, and Wolfgang Huber for their thoughtful comments.

REFERENCES
1000 Genomes Project Consortium, Abecasis, G.R., Auton, A., Brooks, L.D., DePristo, M.A., Durbin, R.M., Handsaker, R.E., Kang, H.M., Marth, G.T., and McVean, G.A. (2012). An integrated map of genetic variation from 1,092 human genomes. Nature 491, 56–65.
Bignell, G.R., Santarius, T., Pole, J.C., Butler, A.P., Perry, J., Pleasance, E., Greenman, C., Menzies, A., Taylor, S., Edkins, S., et al. (2007). Architectures of somatic genomic rearrangement in human cancer amplicons at sequence-level resolution. Genome Res. 17, 1296–1303.
Bignell, G.R., Greenman, C.D., Davies, H., Butler, A.P., Edkins, S., Andrews, J.M., Buck, G., Chen, L., Beare, D., Latimer, C., et al. (2010). Signatures of mutation and selection in the cancer genome. Nature 463, 893–898.
Browning, S.R., and Browning, B.L. (2011). Haplotype phasing: existing methods and new developments. Nat. Rev. Genet. 12, 703–714.
Campbell, P.J., Stephens, P.J., Pleasance, E.D., O'Meara, S., Li, H., Santarius, T., Stebbings, L.A., Leroy, C., Edkins, S., Hardy, C., et al. (2008). Identification of somatically acquired rearrangements in cancer using genome-wide massively parallel paired-end sequencing. Nat. Genet. 40, 722–729.
Campbell, P.J., Yachida, S., Mudie, L.J., Stephens, P.J., Pleasance, E.D., Stebbings, L.A., Morsberger, L.A., Latimer, C., McLaren, S., Lin, M.L., et al. (2010). The patterns and dynamics of genomic instability in metastatic pancreatic cancer. Nature 467, 1109–1113.
Cancer Genome Atlas Research Network. (2011). Integrated genomic analyses of ovarian carcinoma. Nature 474, 609–615.
Chiang, D.Y., Getz, G., Jaffe, D.B., O'Kelly, M.J., Zhao, X., Carter, S.L., Russ, C., Nusbaum, C., Meyerson, M., and Lander, E.S. (2009). High-resolution mapping of copy-number alterations with massively parallel sequencing. Nat. Methods 6, 99–103.
Chiang, C., Jacobsen, J.C., Ernst, C., Hanscom, C., Heilbut, A., Blumenthal, I., Mills, R.E., Kirby, A., Lindgren, A.M., Rudiger, S.R., et al. (2012). Complex reorganization and predominant non-homologous repair following chromosomal breakage in karyotypically balanced germline rearrangements and transgenic integration. Nat. Genet. 44, 390–397, S1.
Crasta, K., Ganem, N.J., Dagher, R., Lantermann, A.B., Ivanova, E.V., Pan, Y., Nezi, L., Protopopov, A., Chowdhury, D., and Pellman, D. (2012). DNA breaks and chromosome pulverization from errors in mitosis. Nature 482, 53–58.
Dees, N.D., Zhang, Q., Kandoth, C., Wendl, M.C., Schierding, W., Koboldt, D.C., Mooney, T.B., Callaway, M.B., Dooling, D., Mardis, E.R., et al. (2012). MuSiC: identifying mutational significance in cancer genomes. Genome Res. 22, 1589–1598.
Forment, J.V., Kaidi, A., and Jackson, S.P. (2012). Chromothripsis and cancer: causes and consequences of chromosome shattering. Nat. Rev. Cancer 12, 663–670.
Greenman, C.D., Pleasance, E.D., Newman, S., Yang, F., Fu, B., Nik-Zainal, S., Jones, D., Lau, K.W., Carter, N., Edwards, P.A., et al. (2012). Estimation of rearrangement phylogeny for cancer genomes. Genome Res. 22, 346–361.
Hirsch, D., Kemmerling, R., Davis, S., Camps, J., Meltzer, P.S., Ried, T., and Gaiser, T. (2012). Chromothripsis and focal copy number alterations determine poor outcome in malignant melanoma. Cancer Res., in press. Published online December 27, 2012. http://dx.doi.org/10.1158/0008-5472.
Johnson, R.T., and Rao, P.N. (1970). Mammalian cell fusion: induction of premature chromosome condensation in interphase nuclei. Nature 226, 717–722.
Jones, M.J., and Jallepalli, P.V. (2012). Chromothripsis: chromosomes in crisis. Dev. Cell 23, 908–917.
Jones, D.T., Jäger, N., Kool, M., Zichner, T., Hutter, B., Sultan, M., Cho, Y.J., Pugh, T.J., Hovestadt, V., Stütz, A.M., et al. (2012). Dissecting the genomic complexity underlying medulloblastoma. Nature 488, 100–105.
Kim, T.M., Xi, R., Luquette, L.J., Park, R.W., Johnson, M.D., and Park, P.J. (2013). Functional genomic analysis of chromosomal aberrations in a compendium of 8000 cancer genomes. Genome Res. 23, 217–227.
Kloosterman, W.P., Guryev, V., van Roosmalen, M., Duran, K.J., de Bruijn, E., Bakker, S.C., Letteboer, T., van Nesselrooij, B., Hochstenbach, R., Poot, M., and Cuppen, E. (2011a). Chromothripsis as a mechanism driving complex de novo structural rearrangements in the germline. Hum. Mol. Genet. 20, 1916–1924.
Kloosterman, W.P., Hoogstraat, M., Paling, O., Tavakoli-Yaraki, M., Renkens, I., Vermaat, J.S., van Roosmalen, M.J., van Lieshout, S., Nijman, I.J., Roessingh, W., et al. (2011b). Chromothripsis is a common mechanism driving genomic rearrangements in primary and metastatic colorectal cancer. Genome Biol. 12, R103.
Kloosterman, W.P., Tavakoli-Yaraki, M., van Roosmalen, M.J., van Binsbergen, E., Renkens, I., Duran, K., Ballarati, L., Vergult, S., Giardino, D., Hansson, K., et al. (2012). Constitutional chromothripsis rearrangements involve clustered double-stranded DNA breaks and nonhomologous repair mechanisms. Cell Rep. 1, 648–655.
Knudson, A.G., Jr. (1971). Mutation and cancer: statistical study of retinoblastoma. Proc. Natl. Acad. Sci. USA 68, 820–823.
Korbel, J.O., Urban, A.E., Affourtit, J.P., Godwin, B., Grubert, F., Simons, J.F., Kim, P.M., Palejev, D., Carriero, N.J., Du, L., et al. (2007). Paired-end mapping reveals extensive structural variation in the human genome. Science 318, 420–426.
Li, F.P., and Fraumeni, J.F., Jr. (1969). Soft-tissue sarcomas, breast cancer, and other neoplasms. A familial syndrome? Ann. Intern. Med. 71, 747–752.
Lichter, P., Cremer, T., Borden, J., Manuelidis, L., and Ward, D.C. (1988). Delineation of individual human chromosomes in metaphase and interphase cells by in situ suppression hybridization using recombinant DNA libraries. Hum. Genet. 80, 224–234.
Liu, P., Erez, A., Nagamani, S.C., Dhar, S.U., Kołodziejska, K.E., Dharmadhikari, A.V., Cooper, M.L., Wiszniewska, J., Zhang, F., Withers, M.A., et al. (2011). Chromosome catastrophes involve replication mechanisms generating complex genomic rearrangements. Cell 146, 889–903.
Magrangeas, F., Avet-Loiseau, H., Munshi, N.C., and Minvielle, S. (2011). Chromothripsis identifies a rare and aggressive entity among newly diagnosed multiple myeloma patients. Blood 118, 675–678.
Maher, C.A., and Wilson, R.K. (2012). Chromothripsis and human disease: piecing together the shattering process. Cell 148, 29–32.
McClintock, B. (1941). The Stability of Broken Ends of Chromosomes in Zea Mays. Genetics 26, 234–282.
Meyerson, M., and Pellman, D. (2011). Cancer genomes evolve by pulverizing single chromosomes. Cell 144, 9–10.
Mills, R.E., Walter, K., Stewart, C., Handsaker, R.E., Chen, K., Alkan, C., Abyzov, A., Yoon, S.C., Ye, K., Cheetham, R.K., et al.; 1000 Genomes Project. (2011). Mapping copy number variation by population-scale genome sequencing. Nature 470, 59–65.
Molenaar, J.J., Koster, J., Zwijnenburg, D.A., van Sluis, P., Valentijn, L.J., van der Ploeg, I., Hamdi, M., van Nes, J., Westerman, B.A., van Arkel, J., et al. (2012). Sequencing of neuroblastoma identifies chromothripsis and defects in neuritogenesis genes. Nature 483, 589–593.
Nik-Zainal, S., Van Loo, P., Wedge, D.C., Alexandrov, L.B., Greenman, C.D., Lau, K.W., Raine, K., Jones, D., Marshall, J., Ramakrishna, M., et al.; Breast Cancer Working Group of the International Cancer Genome Consortium. (2012). The life history of 21 breast cancers. Cell 149, 994–1007.
Northcott, P.A., Shih, D.J., Peacock, J., Garzia, L., Morrissy, A.S., Zichner, T., Stütz, A.M., Korshunov, A., Reimand, J., Schumacher, S.E., et al. (2012). Subgroup-specific structural variation across 1,000 medulloblastoma genomes. Nature 488, 49–56.
Onishi-Seebacher, M., and Korbel, J.O. (2011). Challenges in studying genomic structural variant formation mechanisms: the short-read dilemma and beyond. Bioessays 33, 840–850.
Rausch, T., Jones, D.T., Zapatka, M., Stütz, A.M., Zichner, T., Weischenfeldt, J., Jäger, N., Remke, M., Shih, D., Northcott, P.A., et al. (2012a). Genome sequencing of pediatric medulloblastoma links catastrophic DNA rearrangements with TP53 mutations. Cell 148, 59–71.
Rausch, T., Zichner, T., Schlattl, A., Stütz, A.M., Benes, V., and Korbel, J.O. (2012b). DELLY: structural variant discovery by integrated paired-end and split-read analysis. Bioinformatics 28, i333–i339.
Rudolph, K.L., Millard, M., Bosenberg, M.W., and DePinho, R.A. (2001). Telomere dysfunction and evolution of intestinal carcinoma in mice and humans. Nat. Genet. 28, 155–159.
Stephens, P.J., Greenman, C.D., Fu, B., Yang, F., Bignell, G.R., Mudie, L.J., Pleasance, E.D., Lau, K.W., Beare, D., Stebbings, L.A., et al. (2011). Massive genomic rearrangement acquired in a single catastrophic event during cancer development. Cell 144, 27–40.
Stratton, M.R., Campbell, P.J., and Futreal, P.A. (2009). The cancer genome. Nature 458, 719–724.
Tubio, J.M., and Estivill, X. (2011). Cancer: When catastrophe strikes a cell. Nature 470, 476–477.
Wang, J., Mullighan, C.G., Easton, J., Roberts, S., Heatley, S.L., Ma, J., Rusch, M.C., Chen, K., Harris, C.C., Ding, L., et al. (2011). CREST maps somatic structural variation in cancer genomes with base-pair resolution. Nat. Methods 8, 652–654.
Yates, L.R., and Campbell, P.J. (2012). Evolution of the cancer genome. Nat. Rev. Genet. 13, 795–806.
Ye, K., Schulz, M.H., Long, Q., Apweiler, R., and Ning, Z. (2009). Pindel: a pattern growth approach to detect break points of large deletions and medium sized insertions from paired-end short reads. Bioinformatics 25, 2865–2871.

1236 Cell 152, March 14, 2013 2013 Elsevier Inc.

Leading Edge

Review
Transcriptional Regulation and Its Misregulation in Disease
Tong Ihn Lee1 and Richard A. Young1,2,*
1Whitehead Institute for Biomedical Research, Cambridge, MA 02142, USA
2Department of Biology, Massachusetts Institute of Technology, Cambridge, MA 02139, USA
*Correspondence: young@wi.mit.edu
http://dx.doi.org/10.1016/j.cell.2013.02.014

The gene expression programs that establish and maintain specific cell states in humans are controlled by thousands of transcription factors, cofactors, and chromatin regulators. Misregulation of these gene expression programs can cause a broad range of diseases. Here, we review recent advances in our understanding of transcriptional regulation and discuss how these have provided new insights into transcriptional misregulation in disease.
Introduction
The key concepts of transcriptional control were established half a century ago in bacterial systems (Jacob and Monod, 1961). That pioneering work and many subsequent studies established that DNA-binding transcription factors (also known as trans-factors) occupy specific DNA sequences at control elements (cis-elements) and recruit and regulate the transcription apparatus. In eukaryotic systems, there has been extensive study of specific transcription factors and their cofactors, the general transcription apparatus, and various chromatin regulators, leading to a present-day consensus model for selective gene control (Adelman and Lis, 2012; Bannister and Kouzarides, 2011; Bonasio et al., 2010; Conaway and Conaway, 2011; Fuda et al., 2009; Ho and Crabtree, 2010; Roeder, 2005; Spitz and Furlong, 2012; Taatjes, 2010; Zhou et al., 2012b).
Our knowledge of mammalian regulatory elements and the transcriptional and chromatin regulators that operate at these sites has increased considerably in the last decade. There have also been substantial advances in our understanding of the control of large portions of the gene expression program in embryonic stem cells (ESCs) and in a number of more differentiated cell types. In these relatively well-studied cells, for example, it is now understood that a small fraction of the hundreds of transcription factors that are present dominate the control of much of the active gene expression program (Graf, 2011; Ng and Surani, 2011; Orkin and Hochedlinger, 2011; Young, 2011).
The recent insights into control of cellular gene expression programs have had an important impact on our understanding of misregulation of gene expression in disease. Many different diseases and syndromes, including cancer, autoimmunity, neurological disorders, diabetes, cardiovascular disease, and obesity, can be caused by mutations in regulatory sequences and in the transcription factors, cofactors, chromatin regulators, and noncoding RNAs that interact with these regions. New insights into the global effects of some of these mutations have recently emerged. These insights alter our view of the underlying cause of some diseases and are the primary focus of this Review.
We begin with a brief review of the basic features of human genes and the fundamentals of gene regulation. This leads to a discussion of cellular gene expression programs and the mechanisms involved in global regulation of transcription. We then describe how recent advances in our understanding of the control of gene expression have led to new insights into the mechanisms involved in misregulation of gene expression in various human diseases and disorders.
Genes and Enhancer Elements
A remarkable variety and number of genes are transcribed into protein-coding and noncoding RNA (ncRNA) species in mammalian cells (Table 1). The human genome is thought to contain approximately 20,000 protein-coding genes and at least as many ncRNA genes (Djebali et al., 2012). Functions have been determined or inferred for many of the protein-coding genes, but less is understood about the functions of the ncRNA genes. Many of the ncRNAs contribute to control of gene expression through modulation of transcriptional or posttranscriptional processes (Bartel, 2009; Ebert and Sharp, 2012; Lee, 2012; Orom and Shiekhattar, 2011; Rinn and Chang, 2012; Wright and Ciosk, 2013). For example, the microRNAs (miRNAs), which are the best studied of the various classes of ncRNAs, fine-tune the levels of target messenger RNAs (mRNAs). Some of the long ncRNAs (lncRNAs) recruit chromatin regulators to specific regions of the genome and thereby modify gene expression, and some apparently do not have a function but are simply a product of a transcriptional event that is itself regulatory (Latos et al., 2012).
Transcription factors typically regulate gene expression by binding enhancer elements and recruiting cofactors and RNA polymerase II to target genes (Lelli et al., 2012; Ong and Corces, 2011; Spitz and Furlong, 2012). Multiple transcription factors typically bind in a cooperative fashion to individual enhancers (Panne, 2008) and regulate transcription from the core promoters of nearby or distant genes through physical contacts that involve looping of the DNA between enhancers and the core promoters (Krivega and Dean, 2012). The core promoter elements, which include sites where transcription

Table 1. Human Genes

| Class | Abbreviation | Function | Estimated Number |
|---|---|---|---|
| Messenger RNA | mRNA | protein coding | 20,078 (a) |
| Ribosomal RNA | rRNA | structural and functional component of ribosome | 531 (a) |
| Transfer RNA | tRNA | translational adaptor molecule | 512 (b) |
| Small nuclear RNA | snRNA | processing of pre-mRNA | 1,923 (a) |
| Small nucleolar RNA | snoRNA | processing of rRNA, tRNA, and snRNA | 1,529 (a) |
| Antisense RNA | aRNA | gene regulation | 4,424 (a) |
| Long noncoding RNA | lncRNA | gene regulation | 12,933 (a) |
| MicroRNA | miRNA | translational inhibition and mRNA degradation | >500 (c) |
| Small interfering RNA | siRNA | posttranscriptional gene silencing | n/a (d) |
| Piwi-interacting RNA | piRNA | protect genome integrity | n/a (e) |
| Enhancer RNA | eRNA | unknown | variable (f) |

(a) GENCODE (Harrow et al., 2012).
(b) HGNC database (Seal et al., 2011; http://www.genenames.org).
(c) 2,000–3,000 putative miRNA have been annotated (Harrow et al., 2012), but the majority are not validated.
(d) Human endogenous siRNA are rare and have not been systematically identified.
(e) Piwi-interacting RNA (piRNA) have not been systematically identified, although estimates indicate that hundreds of piRNAs are derived from each of more than 100 loci (Aravin et al., 2006; Brennecke et al., 2007; Girard et al., 2006).
(f) Enhancer RNAs (eRNAs) are generated from active enhancers, thus the number of eRNAs depends on the set of active enhancers in a cell (Kim et al., 2010; Wang et al., 2011). Current estimates indicate that 25%–80% of active enhancers generate eRNAs.

initiation occurs, can also be bound by certain transcription factors (Dikstein, 2011; Goodrich and Tjian, 2010).
Enhancers can be identified by profiling the locations of key transcriptional regulators genome wide and by testing whether these DNA elements are active in enhancer-reporter vectors, and a large population of ESC enhancers has been identified in this manner (Chen et al., 2008). Enhancers are occupied by nucleosomes with specific modifications and are sensitive to DNase treatment, and these features can be used to identify putative enhancers when the key transcriptional regulators are not known (Buecker and Wysocka, 2012; Thurman et al., 2012). Approximately one million putative enhancers have recently been identified in the human genome by using, in multiple cell types, a variety of high-throughput techniques that detect these features of enhancers (Dunham et al., 2012; Thurman et al., 2012). These putative enhancers provide a resource for identifying regions of the genome where sequence variation may impact factor binding and gene regulation and thus contribute to disease. Recent studies suggest that a considerable portion of the genetic variation that is associated with disease occurs in these regulatory regions (Maurano et al., 2012).
Transcriptional Control of Genes
Transcriptional regulation occurs at two interconnected levels: the first involves transcription factors and the transcription apparatus, and the second involves chromatin and its regulators (Figure 1). We briefly discuss the fundamentals of transcriptional control in this order, noting recent advances and reviews where the reader can obtain more detailed information.
Transcription factors can be separated into two classes based on their regulatory responsibilities: control of initiation versus control of elongation (Adelman and Lis, 2012; Fuda et al., 2009; Rahl et al., 2010; Yankulov et al., 1994; Zhou et al., 2012b). This distinction is not absolute, as some transcription

factors may contribute to control of both initiation and elongation. Transcription factors typically bind cofactors, which are protein complexes that contribute to activation (coactivators) and repression (corepressors) but do not have DNA-binding properties of their own.
Most transcription factors are thought to contribute to transcription initiation and do so by recruiting coactivators. These coactivators include the Mediator complex, P300, and general transcription factors, among others (Juven-Gershon and Kadonaga, 2010; Malik and Roeder, 2010; Sikorski and Buratowski, 2009; Taatjes, 2010). Recent studies have highlighted the importance of Mediator in integrating information from transcriptional activators, repressors, signaling pathways, and other regulators during transcription initiation and during the switch to elongation (Berk, 2012; Borggrefe and Yue, 2011; Conaway and Conaway, 2011; Kagey et al., 2010; Kornberg, 2005; Larivière et al., 2012; Malik and Roeder, 2010; Spaeth et al., 2011; Taatjes, 2010).
Once the recruited RNA polymerase II molecules initiate transcription, they generally transcribe a short distance, typically 20–50 bp, and then pause (Figure 1) (Adelman and Lis, 2012). This process is controlled by the pause control factors DSIF and NELF, which are physically associated with the paused RNA polymerase II molecules. The paused polymerases may transition to active elongation through pause release, or they may ultimately terminate transcription with release of the small RNA species. Pause release and subsequent elongation occur through recruitment and activation of positive transcription elongation factor b (P-TEFb), which phosphorylates the paused polymerase and its associated pause control factors. P-TEFb can be brought to these sites in the form of a large complex called the super elongation complex (SEC) (Luo et al., 2012a; Smith et al., 2011b). Additional complexes, such as PAFc, also contribute to the regulation of elongation (Jaehning, 2010). Transcription factors such as c-Myc stimulate P-TEFb-mediated

Figure 1. Transcriptional Regulation


(A) Formation of a preinitiation complex. Transcription factors bind to specific DNA elements (enhancers) and to coactivators, which bind to RNA polymerase II, which in turn binds to general transcription factors at the transcription start site (arrow). The DNA loop formed between the enhancer and the start site is stabilized by cofactors such as the Mediator complex and cohesin.
(B) Initiation and pausing by RNA polymerase II. RNA polymerase II begins transcription from the initiation site, but pause control factors cause it to stall some tens of base pairs downstream.
(C) Pause release and elongation. Various transcription factors and cofactors recruit elongation factors such as P-TEFb, which phosphorylates the pause release factors and polymerase, allowing elongation to proceed.
(D) Chromatin structure is regulated by ATP-dependent remodeling complexes that can mobilize the nucleosome, allowing regulators and the transcription apparatus increased access to DNA sequences.
(E) Transcriptional activity is influenced by proteins that modify and bind the histone components of nucleosomes. Some proteins add modifications (writers), some remove modifications (erasers), and others bind via these modifications (readers). The modifications include acetylation (Ac), methylation (Me), phosphorylation (P), sumoylation (Su), and ubiquitination (Ub).
(F) Histone modifications occur in characteristic patterns associated with different transcriptional activities. As an example, the characteristic patterns observed at actively transcribed genes are shown for histone H3 lysine 27 acetylation (H3K27Ac), histone H3 lysine 4 trimethylation (H3K4me3), histone H3 lysine 79 dimethylation (H3K79me2), and histone H3 lysine 36 trimethylation (H3K36me3).

release of RNA polymerase II from these pause sites and thus contribute to the control of transcription elongation (Rahl et al., 2010).
Recent studies have provided new insights into cofactors that play important roles in DNA loop formation and maintenance, which are key to proper gene control. During transcription initiation, the DNA loop formed between enhancers and core promoter elements is stabilized by cohesin, which is recruited by the NIPBL cohesin-loading protein that is associated with Mediator (Kagey et al., 2010). The cohesin complex has circular dimensions capable of encircling two nucleosome-bound molecules of DNA. Reducing the levels of cohesin or NIPBL has the same adverse effect on transcription as reducing the levels of Mediator, so these cofactors apparently play a similarly important role in gene activity (Kagey et al., 2010). Although cohesin is recruited to active promoters, it also becomes associated with the DNA-binding factor CTCF, which has been implicated in formation of insulator elements. Thus, cohesin is thought to have roles in transcription activation at some genes and in silencing at others (Dorsett, 2011; Hadjur et al., 2009; Parelho

Figure 2. Master Transcriptional Regulators and Reprogramming Factors


Transcription factors that have dominant roles in the control of specific cell states and that are capable of reprogramming cell states when ectopically expressed in various cell types (Buganim et al., 2012; Davis et al., 1987; Huang et al., 2011; Ieda et al., 2010; Kajimura et al., 2009; Marro et al., 2011; Pang et al., 2011; Sekiya and Suzuki, 2011; Takahashi and Yamanaka, 2006; Vierbuchen et al., 2010; Xie et al., 2004; Zhou et al., 2008).

et al., 2008; Phillips and Corces, 2009; Schmidt et al., 2010; Seitan and Merkenschlager, 2012; Wendt et al., 2008).
The fundamental unit of chromatin, the nucleosome, is regulated by protein complexes that can mobilize the nucleosome or modify its histone components (Figure 1). Gene activation is accompanied by recruitment of ATP-dependent chromatin remodeling complexes of the SWI/SNF family, which mobilize nucleosomes to facilitate access of the transcription apparatus and its regulators to DNA (Clapier and Cairns, 2009; Hargreaves and Crabtree, 2011). In addition, transcription factors and the transcription apparatus recruit an array of histone-modifying enzymes that acetylate, methylate, ubiquitinylate, and otherwise chemically modify nucleosomes in a stereotypical fashion across the span of each active gene (Bannister and Kouzarides, 2011; Campos and Reinberg, 2009; Gardner et al., 2011; Rando, 2012; Zhu et al., 2013). These modifications provide interaction surfaces for protein complexes that contribute to transcriptional control. Enzymes that remove these modifications are also typically present at the active genes, producing a highly dynamic process of chromatin modification as RNA polymerase is recruited and goes through the various steps of initiation and elongation of the RNA species.
Repressed genes are embedded in chromatin with modifications that are characteristic of specific repression mechanisms (Beisel and Paro, 2011; Cedar and Bergman, 2012; Jones, 2012; Moazed, 2009; Reyes-Turcu and Grewal, 2012). One type of repressed chromatin, which contains nucleosome modifications generated by the Polycomb complex (e.g., histone H3K27me3), is found at genes that are silent but poised for activation at some later stage of development and differentiation (Orkin and Hochedlinger, 2011; Young, 2011). Another type of repressed chromatin is found in regions of the genome that are fully silenced, such as those containing retrotransposons and

other repetitive elements (Feng et al., 2010; Lejeune and Allshire, 2011). The mechanisms that silence this latter set of genes can involve both nucleosome modification (e.g., histone H3K9me3) and DNA methylation.
Control of Gene Expression Programs
The set of genes that are transcribed largely defines the cell. The gene expression program of a specific cell type includes RNA species from genes that are active in most cells (housekeeping genes) and genes that are active predominantly in one or a limited number of cell types (cell-type-specific genes). In ESCs, for example, at least 60% of the protein-coding genes are transcribed into full-length mRNA species, but only a minority are cell-type specific and thus defining for ESCs (Assou et al., 2007). Mammals contain hundreds and possibly thousands of cell types, and most of these have yet to be studied with respect to the set of transcripts they contain. Thus, the terms housekeeping and cell-type specific are relative rather than absolute and have yet to be precisely defined. Furthermore, the transcriptome of specific cells, derived from high-throughput sequencing, does not show a distinct boundary between active and silent genes, but rather shows a broad distribution of RNA levels that ranges from less than one RNA molecule/gene/cell to millions of RNA molecules/gene/cell, and it is not clear what level is functionally sufficient for each RNA species.
The particular set of transcription factors that are expressed in any one cell type controls the selective transcription of a subset of genes by RNA polymerase II, thereby producing the gene expression program of the cell. Studies of the transcription factors that are key to establishing and maintaining specific cell states suggest that only a small number of the transcription factors that are expressed in cells are necessary to establish cell-type-specific gene expression programs (Figure 2). For example,

Figure 3. Features of Master Transcription Factors of ES Cells that Likely Extend to Other Cell Types
(A) Master transcription factors are expressed at high levels (30,000–300,000 molecules/cell) relative to other transcription factors.
(B) Master transcription factors dominate control of the gene expression program by forming enhancers that are associated with most active ESC genes.
(C) Master transcription factors positively regulate transcription of cell-type-specifying genes and, together with Polycomb group proteins, negatively regulate the expression of genes that specify other cell types.
(D) Master transcription factors (circles) positively regulate their own genes (boxes), forming interconnected autoregulatory loops.

although more than half of the 1,200 genes encoding transcription factors show some evidence of transcription in ESCs, only a few of these transcription factors are needed to reprogram a broad range of cell types into induced pluripotent stem cells (iPSCs) with features essentially indistinguishable from ESCs (Graf, 2011; Ng and Surani, 2011; Orkin and Hochedlinger, 2011; Yamanaka, 2012; Yeo and Ng, 2013; Young, 2011). These ESC transcription factors, which include OCT4, SOX2, and NANOG, are expressed at high levels, bind regulatory elements associated with most active ESC genes, are involved in Polycomb-mediated repression of genes that specify other cell types, and positively regulate their own gene expression through interconnected autoregulatory loops (Figure 3) (Young, 2011). Activation of these endogenous interconnected autoregulatory loops may be key to cellular reprogramming by introduction of exogenous transcription factors. Other cell types express cell-type-specific, or lineage-specific, master transcription factors that are likely to share these key properties of the ESC master transcription factors.
Most of the transcription factors that are key to control of cell state and that can act as reprogramming factors are thought to control transcription initiation at the genes they regulate. For example, the ESC transcription factors OCT4 and NANOG bind to the P300 and Mediator coactivators (Chen et al., 2008; Kagey et al., 2010), which can then drive the formation of open chromatin and recruitment of the transcription apparatus. Similarly, many of the transcription factors that can reprogram or transdifferentiate cells, including MYOD, C/EBPβ, HNF1α, HNF4α, BRN2, and GATA4, bind to at least one of these coactivators (Borggrefe and Yue, 2011).
Recent studies have revealed that certain transcription factors can exert a broad effect on the gene expression programs of cells through elongation control (Figure 4). The c-Myc transcription factor can stimulate increased elongation from essentially

stromal cells, allowing expression of the broad spectrum of self-antigens necessary to induce immune tolerance (Abramson et al., 2010; Giraud et al., 2012; Oven et al., 2007; Žumer et al., 2011). In hematopoiesis, the TIF1γ transcription factor controls erythroid cell fate by interacting with P-TEFb and regulating transcription elongation at a specific set of target genes (Bai et al., 2010). Development generally appears to be dependent on proper elongation control; the transcription elongation factor Tcea3 (TFIIS) contributes to the ability of ESCs to respond appropriately to differentiation cues (Park et al., 2013), and mutations in the P-TEFb repressor HEXIM cause gross developmental defects (Nguyen et al., 2012).
Summary of Recent Advances in Gene Regulation
The key themes that have emerged from recent studies in transcriptional control and that are highlighted here are the following. Sequence variation in enhancers plays an important role in misregulation of gene expression and disease. A small number of key transcription factors dominate control of gene expression programs. Some transcription factors regulate transcription initiation, whereas other factors control elongation, and factors that control this latter step can have profound effects on cell state. The Mediator coactivator complex integrates signals from diverse regulators and recruits cohesin complexes to active genes, which, in turn, contributes to both chromatin looping and gene activity. Diverse chromatin regulators mobilize nucleosomes and dynamically modify nucleosomes during active gene transcription and in gene silencing, and some chromatin regulators are regulated by lncRNAs. These advances in our understanding of sequences involved in gene control, transcriptional circuitry, the transcription apparatus, and chromatin regulation have led to new insights into the mechanisms involved in misregulation of gene expression in various human diseases and disorders. We discuss some of these below.
Misregulated Gene Expression in Disease Many diseases and syndromes are associated with mutations in regulatory regions and in transcription factors, cofactors, chromatin regulators and noncoding RNAs (Table S1 available online). These mutations can contribute to cancer, autoimmunity, neurological disorders, developmental syndromes, diabetes, cardiovascular disease, and obesity, among others. We highlight here several insights into disease mechanisms that have emerged from advances in our understanding of gene regulation. Cancer Recent studies have highlighted the link between diseaseassociated variants in regulatory DNA and breast cancer (Jiang et al., 2011), prostate cancer (Demichelis et al., 2012), colorectal cancer (Lubbe et al., 2012), renal cancer (Schodel et al., 2012), lung cancer (Liu et al., 2011), nasopharyngeal cancer (Yew et al., 2012), and melanoma (Huang et al., 2013; Horn et al., 2013). The genome instability that is a hallmark of cancer almost certainly contributes to further alter sequences in regulatory regions that can promote tumor progression. Mutations in transcription factors have long been known to contribute to tumorigenesis, and recent studies indicate that overexpressed oncogenic transcription factors can alter the core autoregulatory circuitry of the cell. The oncogenic

Figure 4. Global Alterations in Gene Expression Programs through Transcription Elongation


(A) Transcriptional amplication of the gene expression program. The transcription factor c-Myc stimulates increased elongation of most actively transcribed genes, producing increased levels of transcripts for most genes in the gene expression program of the cell. (B) Expanded pause release extends the gene expression program. In some cells, RNA polymerase will initiate transcription at some genes but fails to transition to elongation. AIRE stimulates pause release for many of these initiated genes, thus producing transcripts for many genes that are normally expressed only in peripheral tissues. (C) Specic pause release. Some elongation factors stimulate pause release at specic sets of genes that are important for a particular cells function.

the entire active gene expression program in diverse cell types (Lin et al., 2012; Nie et al., 2012; Rahl et al., 2010). The transcription factor AIRE functions to expand the set of genes that undergo RNA polymerase II pause release in specialized thymic
1242 Cell 152, March 14, 2013 2013 Elsevier Inc.

transcription factor TAL1, which is overexpressed in approximately half of the cases of T cell acute lymphoblastic leukemia (T-ALL), forms an interconnected autoregulatory loop with several key transcription factor partners, and this circuitry contributes to the sustained activation of TAL1-regulated oncogenic program (Sanda et al., 2012). Thus, high levels of TAL1 produce a modied autoregulatory circuitry that drives the oncogenic program in T-ALL. Most tumor cells depend on the transcription factor c-Myc for their growth and proliferation (Littlewood et al., 2012). MYC is the most frequently amplied oncogene, and the elevated expression of its gene product is associated with tumor aggression and poor clinical outcome. Elevated levels of c-Myc can promote tumorigenesis in a wide range of tissues. In tumor cells expressing high levels of c-Myc, the transcription factor accumulates in the promoter regions of most active genes, recruits the transcription elongation factor P-TEFb, and causes transcriptional amplication, producing increased levels of transcripts within the cells gene expression program (Lin et al., 2012; Nie et al., 2012). Thus, rather than binding and regulating a new set of genes when overexpressed, c-Myc amplies the output of the existing gene expression program (Figure 4). These results suggest that transcriptional amplication reduces rate-limiting constraints for tumor cell growth and proliferation. Mutations in the Mediator coactivator complex have recently been implicated in the development of various tumors. Uterine leiomyomas, or broids, are benign tumors that affect millions of women. The MED12 gene is altered in the majority of uterine leiomyomas, and its expression is absent in many uterine leiomyosarcomas, the malignant counterparts of leiomyomas kinen et al., 2011a, 2011b). MED12 mutations also occur (Ma frequently in prostate cancer (Barbieri et al., 2012). 
MED12 is part of the CDK module of the Mediator complex, and the CDK8 subunit of this module has been reported to act as an oncogene in both colon cancer and melanoma (Firestein et al., 2008; Kapoor et al., 2010; Morris et al., 2008). Mediator has roles in gene activation and repression and can function both in transcription initiation and elongation, so further study is needed to establish how Mediator mutations contribute to these tumors.

Mediator-associated NIPBL recruits cohesin. Alterations in cohesin expression and function have been noted in some cancer cells, and there is speculation that cohesin misregulation may also contribute to development of various cancers, but direct evidence for a role of cohesin in cancer remains to be established (Mannini and Musio, 2011; Xu et al., 2011).

Mutations in a variety of chromatin regulators have been implicated in development of cancer cells, and the normal functions of these regulators provide some clues to the mechanisms involved in altered gene expression. Loss-of-function mutations in several nucleosome remodeling proteins, including ARID1A, SMARCA4 (BRG1), and SMARCB1 (INI1), are associated with multiple types of cancer (Dawson and Kouzarides, 2012; Hargreaves and Crabtree, 2011; Tsai and Baylin, 2011; Wilson and Roberts, 2011), suggesting that defects in mobilizing nucleosomes near the promoters of active genes are involved. Similarly, various mutations in the Polycomb components EZH2 and SUZ12 and in the DNA methylation apparatus occur in multiple cancers, suggesting that, in these instances, it is the loss of proper gene silencing that contributes to tumorigenesis (Cedar and Bergman, 2012; Jones, 2012; Margueron and Reinberg, 2011; Mills, 2010). The majority of malignant melanomas overexpress SETDB1, a histone H3K9 methyltransferase that can contribute to gene activation or silencing, and this causes deregulation of HOX genes and accelerates melanoma (Ceol et al., 2011).

Gene fusions with the chromatin regulator MLL in leukemias are now known to alter transcription elongation (Luo et al., 2012b; Marschalek, 2010; Slany, 2009; Smith et al., 2011a). Several translocation partners of MLL are components of a SEC that includes P-TEFb and ELL proteins, which have also been shown to control transcription elongation (Lin et al., 2010, 2011; Luo et al., 2012a; Smith et al., 2011b). It is thought that translocation of any of the SEC subunits to the amino-terminal domain of MLL abnormally stabilizes the localization of the SEC at MLL target genes, including HOXA9 and HOXA10. This leads to excessive stimulation of RNA polymerase II into productive elongation at these genomic loci, which in turn contributes to aggressive acute leukemia.

Specific lncRNAs have recently been implicated in cancer progression. The ANRIL lncRNA mediates transcriptional repression of members of the INK4A/ARF/INK4B locus, which encode tumor suppressors whose repression is associated with various cancers (Aguilo et al., 2011; Popov and Gil, 2010). ANRIL functions by recruiting polycomb repressive complexes 1 and 2 (PRC1 and PRC2), and misregulation of ANRIL may lead to abnormal silencing of tumor suppressors and thus contribute to cancer progression (Kotake et al., 2011; Yap et al., 2010). Interestingly, genome-wide association studies have identified numerous polymorphisms that affect the expression and processing of ANRIL and are associated with increased susceptibility to a growing variety of disease states, including multiple types of cancer, coronary artery disease, and type 2 diabetes (Pasmant et al., 2011; Harismendy et al., 2011).

Autoimmunity and Inflammation
Mutations in the autoimmune regulator (AIRE) protein cause type I autoimmune polyendocrinopathy syndrome. AIRE is a transcription factor whose role in promoting transcriptional elongation at genes with paused RNA polymerase II in the thymus explains why loss of AIRE function leads to autoimmune disease. Self-reactive T cells are normally eliminated during maturation in the thymus due to the specialized ability of thymic stromal cells and, in particular, medullary epithelial cells (MECs) to transcribe a large repertoire of genes encoding peripheral tissue antigens (Kyewski and Klein, 2006). This ectopic gene expression is controlled in large part by AIRE, which is expressed almost exclusively in MECs. Mice and humans with an AIRE gene defect express only a fraction of the peripheral tissue antigens and develop immune infiltrates and autoantibodies directed at multiple peripheral tissues (Akirav et al., 2011; Gardner et al., 2009; Mathis and Benoist, 2009; Metzger and Anderson, 2011). Recent studies have shown that AIRE interacts with P-TEFb and influences transcription elongation in primary MECs (Abramson et al., 2010; Giraud et al., 2012; Oven et al., 2007; Zumer et al., 2011). AIRE is physically associated with all the active genes in MECs but has its greatest effect on genes that do not experience pause release in its absence (Figure 4).

These results are consistent with the idea that AIRE causes the release of RNA polymerase II molecules that are nonproductively paused at the promoters of a broad spectrum of genes that are otherwise expressed only in peripheral tissues. Thus, in MECs, AIRE's function is to expand the set of genes that undergoes RNA polymerase II pause release.

Misregulation of the immune response transcriptional regulator NF-κB has been linked to inflammatory and autoimmune diseases, improper immune development, and cancer. NF-κB is found in most cell types and is involved in cellular responses to stimuli such as infection and stress (Hayden and Ghosh, 2012). The transcription factor controls genes involved in inflammation and is chronically active in inflammatory diseases such as inflammatory bowel disease, arthritis, sepsis, gastritis, asthma, and atherosclerosis. Although most research into the mechanism of transcriptional activation by this and other regulators has focused on coactivator recruitment, evidence that NF-κB interacts with BRD4 and P-TEFb suggests that this ubiquitous regulator plays a role in elongation control at inflammatory genes during immune and stress responses (Barboric et al., 2001; Huang et al., 2009; Nowak et al., 2008). This view is supported by evidence that inhibitors of BRD4, which contributes to recruiting active P-TEFb, suppress expression of key inflammatory genes in activated macrophages and confer protection against lipopolysaccharide-induced endotoxic shock and bacteria-induced sepsis (Nicodeme et al., 2010).

Developmental Disorders: Neurological
Mutations in various components of the Mediator coactivator have been linked to a variety of neurological disorders and other developmental deficiencies (Ding et al., 2008; Goh and Grants, 2012; Hashimoto et al., 2011; Kaufmann et al., 2010; Leal et al., 2009; Philibert et al., 2007; Risheg et al., 2007; Rump et al., 2011; Schwartz et al., 2007; Zhou et al., 2012a). Mutations in MED23 alter the interaction between enhancer-bound transcription factors and Mediator, leading to transcriptional dysregulation of mitogen-responsive immediate-early genes that affect brain development and plasticity. A similar defect in immediate-early gene expression is observed in cells from patients with another intellectual disability, Opitz-Kaveggia syndrome, which is caused by MED12 mutations. It would not be surprising to find that additional Mediator mutations contribute to neurological disorders, given the role of this coactivator in integrating information from transcriptional activators, repressors, and signaling pathways.

Heterozygous germline mutations in components of the SWI/SNF chromatin remodeling complex were recently identified in patients with various neurological syndromes whose common features are severe intellectual disability and speech delay (Hoyer et al., 2012; Santen et al., 2012a, 2012b; Tsurusaki et al., 2012; Van Houdt et al., 2012). These mutations were found in SMARCB1, SMARCA4, SMARCA2, SMARCE1, ARID1A, and ARID1B. It has been suggested that up to 3% of unexplained intellectual disability may be caused by mutations in genes encoding SWI/SNF components (Santen et al., 2012b). It is interesting to note that the ARID1B component of human SWI/SNF interacts with elongin C (Li et al., 2010), a component of the SIII transcription elongation factor, which enhances transcription elongation by suppressing transient pausing of RNA polymerase II (Aso et al., 1995). Thus, alterations in SWI/SNF complexes have the potential to affect both chromatin remodeling and transcription elongation.

Developmental Disorders: Cohesinopathies
Cohesinopathies are characterized by a wide variety of developmental defects, including growth and mental retardation, limb deformities, and craniofacial anomalies (Bose and Gerton, 2010; Liu and Krantz, 2008). This broad spectrum of phenotypes is now thought to be due to reduced cohesin loading and cohesin function in gene expression during development. A variety of cohesinopathies have been described, including Cornelia de Lange syndrome and Roberts syndrome, in which patients have mutations in the cohesin loading protein NIPBL or the proteins that constitute the cohesin complex. With recent evidence for roles of cohesin complexes in regulation of gene expression and DNA looping (Kagey et al., 2010; Kawauchi et al., 2009; Liu et al., 2009; Schaaf et al., 2009; Seitan et al., 2011), it has become apparent that these deficiencies lead to defects in transcriptional regulation and probably in the overall structure of chromatin in the nucleus of disease cells.

Diabetes
Diabetes mellitus is a group of metabolic diseases in which a person has elevated blood sugar either because the pancreas fails to produce adequate amounts of insulin or because cells do not respond properly to the insulin that is produced. Mutations in pancreatic master transcription factors and the sequences they bind have been implicated in diabetes. The gene expression programs of pancreatic cells appear to be controlled by a small set of key transcription factors, including HNF1α, HNF1β, HNF4α, PDX1, and NEUROD1, some of which contribute to the interconnected autoregulatory circuitry of these cells (Odom et al., 2004). Mutations in any of these factors can result in various forms of maturity-onset diabetes of the young (MODY) (Maestro et al., 2007; Malecki, 2005).
These mutations almost certainly have a deleterious effect on the interconnected autoregulatory circuitry formed by these factors and their target genes. The frequency of SNPs that are linked to defects in glucose homeostasis and diabetes is greatly enriched in the binding sites for these transcription factors (Maurano et al., 2012). This observation indicates that perturbations that affect the regulatory circuitry of pancreatic cells may contribute to diabetes. It also suggests that previously undiscovered regulatory networks and network architectures may be uncovered by incorporating information about disease-associated genetic variants and knowledge of the binding sites of diverse transcription factors.

Cardiovascular Disease
Misregulated development of the cardiovascular system is among the most common classes of congenital birth defects, and diseases of the cardiovascular system are among the most prevalent clinical issues for adult populations (Bruneau, 2008; Kathiresan and Srivastava, 2012; Roger et al., 2012). It is well established that loss-of-function mutations in certain transcription factors cause various cardiovascular deficiencies (Table S1), but new studies have highlighted the role that mutations in ncRNA species can play in cardiovascular diseases. Specific miRNAs have been implicated in both the promotion and inhibition of differentiation into cardiac lineages, cardiac hypertrophy, vascular differentiation, and erythropoiesis (Han et al., 2011; Papageorgiou et al., 2012; Small and Olson, 2011). MicroRNAs have also been linked to causative and protective roles for multiple types of cardiovascular disease, including arrhythmia, fibrosis, hypertrophy due to high pressure, and misregulation of cardiac energy metabolism (Callis et al., 2009; Carè et al., 2007; Grueter et al., 2012; Thum et al., 2008; van Rooij et al., 2007, 2008; Yang et al., 2007). MicroRNAs are thought to fine-tune gene expression, and thus, the alterations in these cases are thought to lead to deficiencies in fine-tuning the cardiovascular gene expression program.

Summary and Outlook
Several concepts have emerged from recent studies of gene expression programs in healthy and in diseased cells. Genetic variation may contribute to disease largely through misregulation of gene expression. Mutations in the transcription factors that control cell state may impact the autoregulatory loops that are at the core of cellular regulatory circuitry, leading to the loss of a normal healthy cell state. Some transcription factors control RNA polymerase II pause release and elongation and, when their expression or function is altered, can produce aggressive tumor cells (c-Myc) or some forms of autoimmunity (AIRE). Mutations in the coactivator complexes that integrate information from many transcription factors and contribute to DNA looping can cause a broad spectrum of developmental diseases. Alterations in specific chromatin regulators can contribute to development of cancer and many other diseases. Misregulation of noncoding RNAs can also contribute to disease. Additional insights into the role of transcriptional misregulation in human disease will require improved genome annotation, knowledge of the DNA sequences whose alterations contribute to disease, identification of the key transcriptional regulators of all cells of medical relevance, and further understanding of the roles of cofactors, chromatin regulators, and ncRNAs.
It is essential to improve human genome annotation in order to more fully understand gene expression programs and their regulation and, thus, gene misregulation in disease. As a first step, it would be ideal to identify the complete set of protein-coding and noncoding genes and to ascertain which of these are actively transcribed in specific cell types. There are considerable challenges associated with defining the complete set of expressed genes in any one type of mammalian cell. Such characterization has traditionally required large numbers of cells, and for most primary cell types, it is challenging to obtain a homogeneous population of cells. Whereas protein-coding genes can be recognized, at least in part, by the presence of a coding sequence, it is challenging to produce a complete and accurate annotation of ncRNA genes due to limitations in the read length of widely used sequencing technologies and the short lifetime of many ncRNAs. Nonetheless, recent studies have identified a vast number and variety of ncRNAs in human cells, so there is promise that improved human genome annotation is at hand (Djebali et al., 2012). Furthermore, new technologies allow investigators to monitor RNA polymerase II molecules that are actively engaged in transcription (Core et al., 2008). Such approaches have recently provided evidence that most lncRNA species are the product of divergent transcription from the promoters of active protein-coding genes (Sigova et al., 2013). This suggests that most active protein-coding genes in humans are actually divergently transcribed mRNA/lncRNA gene pairs.

Knowledge of the sequence variation that contributes to disease is being gained at a rapid pace, and this will improve our understanding of disease mechanisms and lead to new approaches to disease diagnosis and therapy. Several lines of evidence suggest that much genetic variation contributes to disease through misregulation of gene expression. A substantial portion of the genomic sequences that are under positive selection are thought to be regulatory (Grossman et al., 2010). Disease-associated SNPs are enriched in regulatory regions (Ernst et al., 2011; Hindorff et al., 2009; Maurano et al., 2012). Many recent studies have identified links between disease-associated variants in regulatory DNA and a broad spectrum of human diseases, including cancer (Demichelis et al., 2012; Huang et al., 2013; Horn et al., 2013; Jiang et al., 2011; Liu et al., 2011; Lubbe et al., 2012; Schodel et al., 2012; Yew et al., 2012), congenital heart disease (Zhao et al., 2012), inflammatory lung disease (Han et al., 2012), multiple sclerosis (Alcina et al., 2013), Alzheimer's disease (Gaj et al., 2012), abdominal aortic aneurysm (Bown et al., 2011), amyotrophic lateral sclerosis (Iida et al., 2011), and coronary artery disease (Harismendy et al., 2011). Disease-associated genetic variants in regulatory regions are most often found in regions that are utilized in a cell-type-specific manner and are associated with diseases of the corresponding cell type (Ernst et al., 2011; Maurano et al., 2012). Thus, cell-type-specific enhancer use can explain how genetic variants produce tissue-specific diseases.
Disease-associated genetic variants can exist in regulatory regions that are very distant from the genes they control, but knowledge of the nature of loops between such distal enhancers and their target genes can explain how these distant variants affect a specific gene and its biological functions (Li et al., 2012; Maurano et al., 2012).

Genetic investigations and reprogramming studies suggest that only a small number of the hundreds of transcription factors that are expressed in cells are essential for establishing and maintaining the regulatory networks that produce specific cell states. If this holds true for most cell types, then it would be ideal to identify the key transcription factors for all cell types of medical relevance. It should be possible to identify these transcription factors if they have features identified for their counterparts in well-studied cells: relatively high expression, occupancy of enhancers associated with a large fraction of active genes, and formation of interconnected autoregulatory loops. Discovering how gene expression programs are controlled in many different cell types should lead to further understanding of regulatory circuitry, facilitate cellular reprogramming, and accelerate the new field of regenerative medicine.

Cofactors and chromatin regulators are generally expressed in most cell types, but mutations in these genes often produce diseases or syndromes that exhibit tissue-specific disease phenotypes (Table S1). For example, defects in Mediator subunits contribute to nonsyndromic intellectual disability, Charcot-Marie-Tooth disease, Opitz-Kaveggia and Lujan syndromes, and infantile cerebral and cerebellar atrophy and have been implicated in prostate cancer, deficiencies in systemic energy homeostasis (Grueter et al., 2012), and altered hair-cycling (Nakajima et al., 2013; Oda et al., 2012). Improved understanding of the interactions between cofactors and transcription factors and the mechanisms involved in information integration by these complex apparatuses will be valuable for understanding the mechanisms that produce tissue-specific phenotypes. Similarly, it will be important to further understand the collaboration between the transcription apparatus and chromatin regulators in global control of gene expression programs. Recent efforts to target chromatin regulators for cancer therapy (Dawson et al., 2012) would benefit from a fuller understanding of the regulatory mechanisms and pathways that are impacted by these potential therapeutics.

Our future understanding of disease and the advance of personalized medicine will benefit from models of human transcriptional regulatory circuitry that integrate information about regulatory sequences and the key transcription factors, cofactors, chromatin regulators, and ncRNAs that operate at regulatory sites. The development of these models should thus be among the priorities of biomedical research.
SUPPLEMENTAL INFORMATION
Supplemental Information includes one table and can be found with this article online at http://dx.doi.org/10.1016/j.cell.2013.02.014.

ACKNOWLEDGMENTS
Our description of the themes highlighted in this review benefited from discussions with Karen Adelman, Jay Bradner, Gerald Crabtree, Rudolf Jaenisch, Ian Krantz, Lee Lawton, David Levens, John Lis, Alex Marson, Matthias Merkenschlager, Alan Mullen, Duncan Odom, David Price, Peter Rahl, Robert Roeder, Ali Shilatifard, Phil Sharp, Alla Sigova, Alexander Stark, Dylan Taatjes, Robert Weinberg, and Leonard Zon. We thank David Orlando for help with data collation and analysis. This work was supported by National Institutes of Health grants HG002668 (R.A.Y.) and CA146445 (R.A.Y. and T.I.L.).

REFERENCES
Abramson, J., Giraud, M., Benoist, C., and Mathis, D. (2010). Aire's partners in the molecular control of immunological tolerance. Cell 140, 123–135.
Adelman, K., and Lis, J.T. (2012). Promoter-proximal pausing of RNA polymerase II: emerging roles in metazoans. Nat. Rev. Genet. 13, 720–731.
Aguilo, F., Zhou, M.M., and Walsh, M.J. (2011). Long noncoding RNA, polycomb, and the ghosts haunting INK4b-ARF-INK4a expression. Cancer Res. 71, 5365–5369.
Akirav, E.M., Ruddle, N.H., and Herold, K.C. (2011). The role of AIRE in human autoimmune disease. Nat. Rev. Endocrinol. 7, 25–33.
Alcina, A., Fedetz, M., Fernández, O., Saiz, A., Izquierdo, G., Lucas, M., Leyva, L., García-León, J.A., Abad-Grau, Mdel.M., Alloza, I., et al. (2013). Identification of a functional variant in the KIF5A-CYP27B1-METTL1-FAM119B locus associated with multiple sclerosis. J. Med. Genet. 50, 25–33.
Aravin, A., Gaidatzis, D., Pfeffer, S., Lagos-Quintana, M., Landgraf, P., Iovino, N., Morris, P., Brownstein, M.J., Kuramochi-Miyagawa, S., Nakano, T., et al. (2006). A novel class of small RNAs bind to MILI protein in mouse testes. Nature 442, 203–207.
Aso, T., Lane, W.S., Conaway, J.W., and Conaway, R.C. (1995). Elongin (SIII): a multisubunit regulator of elongation by RNA polymerase II. Science 269, 1439–1443.
Assou, S., Le Carrour, T., Tondeur, S., Ström, S., Gabelle, A., Marty, S., Nadal, L., Pantesco, V., Rème, T., Hugnot, J.P., et al. (2007). A meta-analysis of human embryonic stem cells transcriptome integrated into a web-based expression atlas. Stem Cells 25, 961–973.
Bai, X., Kim, J., Yang, Z., Jurynec, M.J., Akie, T.E., Lee, J., LeBlanc, J., Sessa, A., Jiang, H., DiBiase, A., et al. (2010). TIF1gamma controls erythroid cell fate by regulating transcription elongation. Cell 142, 133–143.
Bannister, A.J., and Kouzarides, T. (2011). Regulation of chromatin by histone modifications. Cell Res. 21, 381–395.
Barbieri, C.E., Baca, S.C., Lawrence, M.S., Demichelis, F., Blattner, M., Theurillat, J.P., White, T.A., Stojanov, P., Van Allen, E., Stransky, N., et al. (2012). Exome sequencing identifies recurrent SPOP, FOXA1 and MED12 mutations in prostate cancer. Nat. Genet. 44, 685–689.
Barboric, M., Nissen, R.M., Kanazawa, S., Jabrane-Ferrat, N., and Peterlin, B.M. (2001). NF-kappaB binds P-TEFb to stimulate transcriptional elongation by RNA polymerase II. Mol. Cell 8, 327–337.
Bartel, D.P. (2009). MicroRNAs: target recognition and regulatory functions. Cell 136, 215–233.
Beisel, C., and Paro, R. (2011). Silencing chromatin: comparing modes and mechanisms. Nat. Rev. Genet. 12, 123–135.
Berk, A.J. (2012). Yin and yang of mediator function revealed by human mutants. Proc. Natl. Acad. Sci. USA 109, 19519–19520.
Bonasio, R., Tu, S., and Reinberg, D. (2010). Molecular signals of epigenetic states. Science 330, 612–616.
Borggrefe, T., and Yue, X. (2011). Interactions between subunits of the Mediator complex with gene-specific transcription factors. Semin. Cell Dev. Biol. 22, 759–768.
Bose, T., and Gerton, J.L. (2010). Cohesinopathies, gene expression, and chromatin organization. J. Cell Biol. 189, 201–210.
Bown, M.J., Jones, G.T., Harrison, S.C., Wright, B.J., Bumpstead, S., Baas, A.F., Gretarsdottir, S., Badger, S.A., Bradley, D.T., Burnand, K., et al.; CARDIoGRAM Consortium; Global BPgen Consortium; DIAGRAM Consortium; VRCNZ Consortium. (2011). Abdominal aortic aneurysm is associated with a variant in low-density lipoprotein receptor-related protein 1. Am. J. Hum. Genet. 89, 619–627.
Brennecke, J., Aravin, A.A., Stark, A., Dus, M., Kellis, M., Sachidanandam, R., and Hannon, G.J. (2007). Discrete small RNA-generating loci as master regulators of transposon activity in Drosophila. Cell 128, 1089–1103.
Bruneau, B.G. (2008). The developmental genetics of congenital heart disease. Nature 451, 943–948.
Buecker, C., and Wysocka, J. (2012). Enhancers as information integration hubs in development: lessons from genomics. Trends Genet. 28, 276–284.
Buganim, Y., Itskovich, E., Hu, Y.C., Cheng, A.W., Ganz, K., Sarkar, S., Fu, D., Welstead, G.G., Page, D.C., and Jaenisch, R. (2012). Direct reprogramming of fibroblasts into embryonic Sertoli-like cells by defined factors. Cell Stem Cell 11, 373–386.
Callis, T.E., Pandya, K., Seok, H.Y., Tang, R.H., Tatsuguchi, M., Huang, Z.P., Chen, J.F., Deng, Z., Gunn, B., Shumate, J., et al. (2009). MicroRNA-208a is a regulator of cardiac hypertrophy and conduction in mice. J. Clin. Invest. 119, 2772–2786.
Campos, E.I., and Reinberg, D. (2009). Histones: annotating chromatin. Annu. Rev. Genet. 43, 559–599.
Carè, A., Catalucci, D., Felicetti, F., Bonci, D., Addario, A., Gallo, P., Bang, M.L., Segnalini, P., Gu, Y., Dalton, N.D., et al. (2007). MicroRNA-133 controls cardiac hypertrophy. Nat. Med. 13, 613–618.
Cedar, H., and Bergman, Y. (2012). Programming of DNA methylation patterns. Annu. Rev. Biochem. 81, 97–117.
Ceol, C.J., Houvras, Y., Jane-Valbuena, J., Bilodeau, S., Orlando, D.A., Battisti, V., Fritsch, L., Lin, W.M., Hollmann, T.J., Ferré, F., et al. (2011). The histone methyltransferase SETDB1 is recurrently amplified in melanoma and accelerates its onset. Nature 471, 513–517.
Chen, X., Xu, H., Yuan, P., Fang, F., Huss, M., Vega, V.B., Wong, E., Orlov, Y.L., Zhang, W., Jiang, J., et al. (2008). Integration of external signaling pathways with the core transcriptional network in embryonic stem cells. Cell 133, 1106–1117.
Clapier, C.R., and Cairns, B.R. (2009). The biology of chromatin remodeling complexes. Annu. Rev. Biochem. 78, 273–304.
Conaway, R.C., and Conaway, J.W. (2011). Function and regulation of the Mediator complex. Curr. Opin. Genet. Dev. 21, 225–230.
Core, L.J., Waterfall, J.J., and Lis, J.T. (2008). Nascent RNA sequencing reveals widespread pausing and divergent initiation at human promoters. Science 322, 1845–1848.
Davis, R.L., Weintraub, H., and Lassar, A.B. (1987). Expression of a single transfected cDNA converts fibroblasts to myoblasts. Cell 51, 987–1000.
Dawson, M.A., and Kouzarides, T. (2012). Cancer epigenetics: from mechanism to therapy. Cell 150, 12–27.
Dawson, M.A., Kouzarides, T., and Huntly, B.J. (2012). Targeting epigenetic readers in cancer. N. Engl. J. Med. 367, 647–657.
Demichelis, F., Setlur, S.R., Banerjee, S., Chakravarty, D., Chen, J.Y., Chen, C.X., Huang, J., Beltran, H., Oldridge, D.A., Kitabayashi, N., et al. (2012). Identification of functionally active, low frequency copy number variants at 15q21.3 and 12q21.31 associated with prostate cancer risk. Proc. Natl. Acad. Sci. USA 109, 6686–6691.
Dikstein, R. (2011). The unexpected traits associated with core promoter elements. Transcription 2, 201–206.
Ding, N., Zhou, H., Esteve, P.O., Chin, H.G., Kim, S., Xu, X., Joseph, S.M., Friez, M.J., Schwartz, C.E., Pradhan, S., and Boyer, T.G. (2008). Mediator links epigenetic silencing of neuronal gene expression with x-linked mental retardation. Mol. Cell 31, 347–359.
Djebali, S., Davis, C.A., Merkel, A., Dobin, A., Lassmann, T., Mortazavi, A., Tanzer, A., Lagarde, J., Lin, W., Schlesinger, F., et al. (2012). Landscape of transcription in human cells. Nature 489, 101–108.
Dorsett, D. (2011). Cohesin: genomic insights into controlling gene transcription and development. Curr. Opin. Genet. Dev. 21, 199–206.
Dunham, I., Kundaje, A., Aldred, S.F., Collins, P.J., Davis, C.A., Doyle, F., Epstein, C.B., Frietze, S., Harrow, J., Kaul, R., et al.; ENCODE Project Consortium. (2012). An integrated encyclopedia of DNA elements in the human genome. Nature 489, 57–74.
Ebert, M.S., and Sharp, P.A. (2012). Roles for microRNAs in conferring robustness to biological processes. Cell 149, 515–524.
Ernst, J., Kheradpour, P., Mikkelsen, T.S., Shoresh, N., Ward, L.D., Epstein, C.B., Zhang, X., Wang, L., Issner, R., Coyne, M., et al. (2011). Mapping and analysis of chromatin state dynamics in nine human cell types. Nature 473, 43–49.
Feng, S., Jacobsen, S.E., and Reik, W. (2010). Epigenetic reprogramming in plant and animal development. Science 330, 622–627.
Firestein, R., Bass, A.J., Kim, S.Y., Dunn, I.F., Silver, S.J., Guney, I., Freed, E., Ligon, A.H., Vena, N., Ogino, S., et al. (2008). CDK8 is a colorectal cancer oncogene that regulates beta-catenin activity. Nature 455, 547–551.
Fuda, N.J., Ardehali, M.B., and Lis, J.T. (2009). Defining mechanisms that regulate RNA polymerase II transcription in vivo. Nature 461, 186–192.
Gaj, P., Paziewska, A., Bik, W., Dąbrowska, M., Baranowska-Bik, A., Styczynska, M., Chodakowska-Żebrowska, M., Pfeffer-Baczuk, A., Barcikowska, M., Baranowska, B., and Ostrowski, J. (2012). Identification of a late onset Alzheimer's disease candidate risk variant at 9q21.33 in Polish patients. J. Alzheimers Dis. 32, 157–168.
Gardner, J.M., Fletcher, A.L., Anderson, M.S., and Turley, S.J. (2009). AIRE in the thymus and beyond. Curr. Opin. Immunol. 21, 582–589.
Gardner, K.E., Allis, C.D., and Strahl, B.D. (2011). Operating on chromatin, a colorful language where context matters. J. Mol. Biol. 409, 36–46.
Girard, A., Sachidanandam, R., Hannon, G.J., and Carmell, M.A. (2006). A germline-specific class of small RNAs binds mammalian Piwi proteins. Nature 442, 199–202.
Giraud, M., Yoshida, H., Abramson, J., Rahl, P.B., Young, R.A., Mathis, D., and Benoist, C. (2012). Aire unleashes stalled RNA polymerase to induce ectopic

gene expression in thymic epithelial cells. Proc. Natl. Acad. Sci. USA 109, 535540. Goh, Y.S., and Grants, J.M. (2012). Mutations in the Mediator subunit MED23 link intellectual disability to immediate early gene regulation. Clin. Genet. 81, 430. Goodrich, J.A., and Tjian, R. (2010). Unexpected roles for core promoter recognition factors in cell-type-specic transcription and gene regulation. Nat. Rev. Genet. 11, 549558. Graf, T. (2011). Historical origins of transdifferentiation and reprogramming. Cell Stem Cell 9, 504516. Grossman, S.R., Shlyakhter, I., Karlsson, E.K., Byrne, E.H., Morales, S., Frieden, G., Hostetter, E., Angelino, E., Garber, M., Zuk, O., et al. (2010). A composite of multiple signals distinguishes causal variants in regions of positive selection. Science 327, 883886. Grueter, C.E., van Rooij, E., Johnson, B.A., DeLeon, S.M., Sutherland, L.B., Qi, X., Gautron, L., Elmquist, J.K., Bassel-Duby, R., and Olson, E.N. (2012). A cardiac microRNA governs systemic energy homeostasis by regulation of MED13. Cell 149, 671683. Hadjur, S., Williams, L.M., Ryan, N.K., Cobb, B.S., Sexton, T., Fraser, P., Fisher, A.G., and Merkenschlager, M. (2009). Cohesins form chromosomal cis-interactions at the developmentally regulated IFNG locus. Nature 460, 410413. Han, M., Toli, J., and Abdellatif, M. (2011). MicroRNAs in the cardiovascular system. Curr. Opin. Cardiol. 26, 181189. Han, Y.J., Ma, S.F., Wade, M.S., Flores, C., and Garcia, J.G. (2012). An intronic MYLK variant associated with inammatory lung disease regulates promoter activity of the smooth muscle myosin light chain kinase isoform. J. Mol. Med. 90, 299308. Hargreaves, D.C., and Crabtree, G.R. (2011). ATP-dependent chromatin remodeling: genetics, genomics and mechanisms. Cell Res. 21, 396420. Harismendy, O., Notani, D., Song, X., Rahim, N.G., Tanasa, B., Heintzman, N., Ren, B., Fu, X.D., Topol, E.J., Rosenfeld, M.G., and Frazer, K.A. (2011). 
9p21 DNA variants associated with coronary artery disease impair interferon-g signalling response. Nature 470, 264268. Harrow, J., Frankish, A., Gonzalez, J.M., Tapanari, E., Diekhans, M., Kokocinski, F., Aken, B.L., Barrell, D., Zadissa, A., Searle, S., et al. (2012). GENCODE: the reference human genome annotation for The ENCODE Project. Genome Res. 22, 17601774. Hashimoto, S., Boissel, S., Zarhrate, M., Rio, M., Munnich, A., Egly, J.M., and Colleaux, L. (2011). MED23 mutation links intellectual disability to dysregulation of immediate early gene expression. Science 333, 11611163. Hayden, M.S., and Ghosh, S. (2012). NF-kB, the rst quarter-century: remarkable progress and outstanding questions. Genes Dev. 26, 203234. Hindorff, L.A., Sethupathy, P., Junkins, H.A., Ramos, E.M., Mehta, J.P., Collins, F.S., and Manolio, T.A. (2009). Potential etiologic and functional implications of genome-wide association loci for human diseases and traits. Proc. Natl. Acad. Sci. USA 106, 93629367. Ho, L., and Crabtree, G.R. (2010). Chromatin remodelling during development. Nature 463, 474484. Horn, S., Figl, A., Rachakonda, P.S., Fischer, C., Sucker, A., Gast, A., Kadel, S., Moll, I., Nagore, E., Hemminki, K., et al. (2013). TERT promoter mutations in familial and sporadic melanoma. Science 339, 959961. Hoyer, J., Ekici, A.B., Endele, S., Popp, B., Zweier, C., Wiesener, A., Wohlleber, E., Dufke, A., Rossier, E., Petsch, C., et al. (2012). Haploinsufciency of ARID1B, a member of the SWI/SNF-a chromatin-remodeling complex, is a frequent cause of intellectual disability. Am. J. Hum. Genet. 90, 565572. Huang, B., Yang, X.D., Zhou, M.M., Ozato, K., and Chen, L.F. (2009). Brd4 coactivates transcriptional activation of NF-kappaB via specic binding to acetylated RelA. Mol. Cell. Biol. 29, 13751387. Huang, P., He, Z., Ji, S., Sun, H., Xiang, D., Liu, C., Hu, Y., Wang, X., and Hui, L. (2011). Induction of functional hepatocyte-like cells from mouse broblasts by dened factors. 
Nature 475, 386389.

Cell 152, March 14, 2013 2013 Elsevier Inc. 1247

Huang, F.W., Hodis, E., Xu, M.J., Kryukov, G.V., Chin, L., and Garraway, L.A. (2013). Highly recurrent TERT promoter mutations in human melanoma. Science 339, 957959. Ieda, M., Fu, J.D., Delgado-Olguin, P., Vedantham, V., Hayashi, Y., Bruneau, B.G., and Srivastava, D. (2010). Direct reprogramming of broblasts into functional cardiomyocytes by dened factors. Cell 142, 375386. Iida, A., Takahashi, A., Kubo, M., Saito, S., Hosono, N., Ohnishi, Y., Kiyotani, K., Mushiroda, T., Nakajima, M., Ozaki, K., et al. (2011). A functional variant in ZNF512B is associated with susceptibility to amyotrophic lateral sclerosis in Japanese. Hum. Mol. Genet. 20, 36843692. Jacob, F., and Monod, J. (1961). Genetic regulatory mechanisms in the synthesis of proteins. J. Mol. Biol. 3, 318356. Jaehning, J.A. (2010). The Paf1 complex: platform or player in RNA polymerase II transcription? Biochim. Biophys. Acta 1799, 379388. Jiang, Y., Shen, H., Liu, X., Dai, J., Jin, G., Qin, Z., Chen, J., Wang, S., Wang, X., Hu, Z., and Shen, H. (2011). Genetic variants at 1p11.2 and breast cancer risk: a two-stage study in Chinese women. PLoS ONE 6, e21563. Jones, P.A. (2012). Functions of DNA methylation: islands, start sites, gene bodies and beyond. Nat. Rev. Genet. 13, 484492. Juven-Gershon, T., and Kadonaga, J.T. (2010). Regulation of gene expression via the core promoter and the basal transcriptional machinery. Dev. Biol. 339, 225229. Kagey, M.H., Newman, J.J., Bilodeau, S., Zhan, Y., Orlando, D.A., van Berkum, N.L., Ebmeier, C.C., Goossens, J., Rahl, P.B., Levine, S.S., et al. (2010). Mediator and cohesin connect gene expression and chromatin architecture. Nature 467, 430435. Kajimura, S., Seale, P., Kubota, K., Lunsford, E., Frangioni, J.V., Gygi, S.P., and Spiegelman, B.M. (2009). Initiation of myoblast to brown fat switch by a PRDM16-C/EBP-beta transcriptional complex. Nature 460, 11541158. 
Kapoor, A., Goldberg, M.S., Cumberland, L.K., Ratnakumar, K., Segura, M.F., Emanuel, P.O., Menendez, S., Vardabasso, C., Leroy, G., Vidal, C.I., et al. (2010). The histone variant macroH2A suppresses melanoma progression through regulation of CDK8. Nature 468, 11051109. Kathiresan, S., and Srivastava, D. (2012). Genetics of human cardiovascular disease. Cell 148, 12421257. Kaufmann, R., Straussberg, R., Mandel, H., Fattal-Valevski, A., Ben-Zeev, B., Naamati, A., Shaag, A., Zenvirt, S., Konen, O., Mimouni-Bloch, A., et al. (2010). Infantile cerebral and cerebellar atrophy is associated with a mutation in the MED17 subunit of the transcription preinitiation mediator complex. Am. J. Hum. Genet. 87, 667670. Kawauchi, S., Calof, A.L., Santos, R., Lopez-Burks, M.E., Young, C.M., Hoang, M.P., Chua, A., Lao, T., Lechner, M.S., Daniel, J.A., et al. (2009). Multiple organ system defects and transcriptional dysregulation in the Nipbl(+/-) mouse, a model of Cornelia de Lange Syndrome. PLoS Genet. 5, e1000650. Kim, T.K., Hemberg, M., Gray, J.M., Costa, A.M., Bear, D.M., Wu, J., Harmin, D.A., Laptewicz, M., Barbara-Haley, K., Kuersten, S., et al. (2010). Widespread transcription at neuronal activity-regulated enhancers. Nature 465, 182187. Kornberg, R.D. (2005). Mediator and the mechanism of transcriptional activation. Trends Biochem. Sci. 30, 235239. Kotake, Y., Nakagawa, T., Kitagawa, K., Suzuki, S., Liu, N., Kitagawa, M., and Xiong, Y. (2011). Long non-coding RNA ANRIL is required for the PRC2 recruitment to and silencing of p15(INK4B) tumor suppressor gene. Oncogene 30, 19561962. Krivega, I., and Dean, A. (2012). Enhancer and promoter interactions-long distance calls. Curr. Opin. Genet. Dev. 22, 7985. Kyewski, B., and Klein, L. (2006). A central role for central tolerance. Annu. Rev. Immunol. 24, 571606. ` re, L., Seizl, M., and Cramer, P. (2012). A structural perspective on Larivie Mediator function. Curr. Opin. Cell Biol. 24, 305313. 
Latos, P.A., Pauler, F.M., Koerner, M.V., S xenergin, H.B., Hudson, Q.J., Stocsits, R.R., Allhoff, W., Stricker, S.H., Klement, R.M., Warczok, K.E.,

et al. (2012). Airn transcriptional overlap, but not its lncRNA products, induces imprinted Igf2r silencing. Science 338, 14691472. Leal, A., Huehne, K., Bauer, F., Sticht, H., Berger, P., Suter, U., Morera, B., Del Valle, G., Lupski, J.R., Ekici, A., et al. (2009). Identication of the variant Ala335Val of MED25 as responsible for CMT2B2: molecular data, functional studies of the SH3 recognition motif and correlation between wild-type MED25 and PMP22 RNA levels in CMT1A animal models. Neurogenetics 10, 275287. Lee, J.T. (2012). Epigenetic regulation by long noncoding RNAs. Science 338, 14351439. Lejeune, E., and Allshire, R.C. (2011). Common ground: small RNA programming and chromatin modications. Curr. Opin. Cell Biol. 23, 258265. Lelli, K.M., Slattery, M., and Mann, R.S. (2012). Disentangling the many layers of eukaryotic transcriptional regulation. Annu. Rev. Genet. 46, 4368. Li, X.S., Trojer, P., Matsumura, T., Treisman, J.E., and Tanese, N. (2010). Mammalian SWI/SNFa subunit BAF250/ARID1 is an E3 ubiquitin ligase that targets histone H2B. Mol. Cell. Biol. 30, 16731688. Li, G., Ruan, X., Auerbach, R.K., Sandhu, K.S., Zheng, M., Wang, P., Poh, H.M., Goh, Y., Lim, J., Zhang, J., et al. (2012). Extensive promoter-centered chromatin interactions provide a topological basis for transcription regulation. Cell 148, 8498. Lin, C., Smith, E.R., Takahashi, H., Lai, K.C., Martin-Brown, S., Florens, L., Washburn, M.P., Conaway, J.W., Conaway, R.C., and Shilatifard, A. (2010). AFF4, a component of the ELL/P-TEFb elongation complex and a shared subunit of MLL chimeras, can link transcription elongation to leukemia. Mol. Cell 37, 429437. Lin, C., Garrett, A.S., De Kumar, B., Smith, E.R., Gogol, M., Seidel, C., Krumlauf, R., and Shilatifard, A. (2011). Dynamic transcriptional events in embryonic stem cells mediated by the super elongation complex (SEC). Genes Dev. 25, 14861498. 
n, J., Rahl, P.B., Paranal, R.M., Burge, C.B., Bradner, J.E., Lee, Lin, C.Y., Love T.I., and Young, R.A. (2012). Transcriptional amplication in tumor cells with elevated c-Myc. Cell 151, 5667. Littlewood, T.D., Kreuzaler, P., and Evan, G.I. (2012). All things to all people. Cell 151, 1113. Liu, J., and Krantz, I.D. (2008). Cohesin and human disease. Annu. Rev. Genomics Hum. Genet. 9, 303320. Liu, J., Zhang, Z., Bando, M., Itoh, T., Deardorff, M.A., Clark, D., Kaur, M., Tandy, S., Kondoh, T., Rappaport, E., et al. (2009). Transcriptional dysregulation in NIPBL and cohesin mutant human cells. PLoS Biol. 7, e1000119. Liu, G., Gramling, S., Munoz, D., Cheng, D., Azad, A.K., Mirshams, M., Chen, Z., Xu, W., Roberts, H., Shepherd, F.A., et al. (2011). Two novel BRM insertion promoter sequence variants are associated with loss of BRM expression and lung cancer risk. Oncogene 30, 32953304. Lubbe, S.J., Pittman, A.M., Olver, B., Lloyd, A., Vijayakrishnan, J., Naranjo, S., mez-Skarmeta, J.L., and Houlston, R.S. (2012). Dobbins, S., Broderick, P., Go The 14q22.2 colorectal cancer variant rs4444235 shows cis-acting regulation of BMP4. Oncogene 31, 37773784. Luo, Z., Lin, C., Guest, E., Garrett, A.S., Mohaghegh, N., Swanson, S., Marshall, S., Florens, L., Washburn, M.P., and Shilatifard, A. (2012a). The super elongation complex family of RNA polymerase II elongation factors: gene target specicity and transcriptional output. Mol. Cell. Biol. 32, 2608 2617. Luo, Z., Lin, C., and Shilatifard, A. (2012b). The super elongation complex (SEC) family in transcriptional control. Nat. Rev. Mol. Cell Biol. 13, 543547. Maestro, M.A., Cardalda, C., Boj, S.F., Luco, R.F., Servitja, J.M., and Ferrer, J. (2007). Distinct roles of HNF1beta, HNF1alpha, and HNF4alpha in regulating pancreas development, beta-cell function and growth. Endocr. Dev. 12, 3345. kinen, N., Heinonen, H.R., Moore, S., Tomlinson, I.P., van der Spuy, Z.M., Ma and Aaltonen, L.A. (2011a). 
MED12 exon 2 mutations are common in uterine leiomyomas from South African patients. Oncotarget 2, 966969. kinen, N., Mehine, M., Tolvanen, J., Kaasinen, E., Li, Y., Lehtonen, H.J., Ma Gentile, M., Yan, J., Enge, M., Taipale, M., et al. (2011b). MED12, the mediator

1248 Cell 152, March 14, 2013 2013 Elsevier Inc.

complex subunit 12 gene, is mutated at high frequency in uterine leiomyomas. Science 334, 252255. Malecki, M.T. (2005). Genetics of type 2 diabetes mellitus. Diabetes Res. Clin. Pract. 68(Suppl 1), S10S21. Malik, S., and Roeder, R.G. (2010). The metazoan Mediator co-activator complex as an integrative hub for transcriptional regulation. Nat. Rev. Genet. 11, 761772. Mannini, L., and Musio, A. (2011). The dark side of cohesin: the carcinogenic point of view. Mutat. Res. 728, 8187. Margueron, R., and Reinberg, D. (2011). The Polycomb complex PRC2 and its mark in life. Nature 469, 343349. dhof, T.C., Marro, S., Pang, Z.P., Yang, N., Tsai, M.C., Qu, K., Chang, H.Y., Su and Wernig, M. (2011). Direct lineage conversion of terminally differentiated hepatocytes to functional neurons. Cell Stem Cell 9, 374382. Marschalek, R. (2010). Mixed lineage leukemia: roles in human malignancies and potential therapy. FEBS J. 277, 18221831. Mathis, D., and Benoist, C. (2009). Aire. Annu. Rev. Immunol. 27, 287312. Maurano, M.T., Humbert, R., Rynes, E., Thurman, R.E., Haugen, E., Wang, H., Reynolds, A.P., Sandstrom, R., Qu, H., Brody, J., et al. (2012). Systematic localization of common disease-associated variation in regulatory DNA. Science 337, 11901195. Metzger, T.C., and Anderson, M.S. (2011). Control of central and peripheral tolerance by Aire. Immunol. Rev. 241, 89103. Mills, A.A. (2010). Throwing the cancer switch: reciprocal roles of polycomb and trithorax proteins. Nat. Rev. Cancer 10, 669682. Moazed, D. (2009). Small RNAs in transcriptional gene silencing and genome defence. Nature 457, 413420. Morris, E.J., Ji, J.Y., Yang, F., Di Stefano, L., Herr, A., Moon, N.S., Kwon, E.J., a r, A.M., and Dyson, N.J. (2008). E2F1 represses beta-catenin Haigis, K.M., Na transcription and is antagonized by both pRB and CDK8. Nature 455, 552556. Nakajima, T., Inui, S., Fushimi, T., Noguchi, F., Kitagawa, Y., Reddy, J.K., and Itami, S. (2013). 
Roles of MED1 in quiescence of hair follicle stem cells and maintenance of normal hair cycling. J. Invest. Dermatol. 133, 354360. Ng, H.H., and Surani, M.A. (2011). The transcriptional and signalling networks of pluripotency. Nat. Cell Biol. 13, 490496. Nguyen, D., Krueger, B.J., Sedore, S.C., Brogie, J.E., Rogers, J.T., Rajendra, T.K., Saunders, A., Matera, A.G., Lis, J.T., Uguen, P., and Price, D.H. (2012). The Drosophila 7SK snRNP and the essential role of dHEXIM in development. Nucleic Acids Res. 40, 52835297. Nicodeme, E., Jeffrey, K.L., Schaefer, U., Beinke, S., Dewell, S., Chung, C.W., Chandwani, R., Marazzi, I., Wilson, P., Coste, H., et al. (2010). Suppression of inammation by a synthetic histone mimic. Nature 468, 11191123. Nie, Z., Hu, G., Wei, G., Cui, K., Yamane, A., Resch, W., Wang, R., Green, D.R., Tessarollo, L., Casellas, R., et al. (2012). c-Myc is a universal amplier of expressed genes in lymphocytes and embryonic stem cells. Cell 151, 6879. Nowak, D.E., Tian, B., Jamaluddin, M., Boldogh, I., Vergara, L.A., Choudhary, S., and Brasier, A.R. (2008). RelA Ser276 phosphorylation is required for activation of a subset of NF-kappaB-dependent genes by recruiting cyclindependent kinase 9/cyclin T1 complexes. Mol. Cell. Biol. 28, 36233638. Oda, Y., Hu, L., Bul, V., Elalieh, H., Reddy, J.K., and Bikle, D.D. (2012). Coactivator MED1 ablation in keratinocytes results in hair-cycling defects and epidermal alterations. J. Invest. Dermatol. 132, 10751083. Odom, D.T., Zizlsperger, N., Gordon, D.B., Bell, G.W., Rinaldi, N.J., Murray, H.L., Volkert, T.L., Schreiber, J., Rolfe, P.A., Gifford, D.K., et al. (2004). Control of pancreas and liver gene expression by HNF transcription factors. Science 303, 13781381. Ong, C.T., and Corces, V.G. (2011). Enhancer function: new insights into the regulation of tissue-specic gene expression. Nat. Rev. Genet. 12, 283293. Orkin, S.H., and Hochedlinger, K. (2011). Chromatin connections to pluripotency and cellular reprogramming. 
Cell 145, 835850. Orom, U.A., and Shiekhattar, R. (2011). Noncoding RNAs and enhancers: complications of a long-distance relationship. Trends Genet. 27, 433439.

, N., Kohoutek, J., Vaupotic, T., Narat, M., and Peterlin, Oven, I., Brdickova B.M. (2007). AIRE recruits P-TEFb for transcriptional elongation of target genes in medullary thymic epithelial cells. Mol. Cell. Biol. 27, 88158823. Pang, Z.P., Yang, N., Vierbuchen, T., Ostermeier, A., Fuentes, D.R., Yang, dhof, T.C., and Wernig, M. T.Q., Citri, A., Sebastiano, V., Marro, S., Su (2011). Induction of human neuronal cells by dened transcription factors. Nature 476, 220223. Panne, D. (2008). The enhanceosome. Curr. Opin. Struct. Biol. 18, 236242. Papageorgiou, N., Tousoulis, D., Androulakis, E., Siasos, G., Briasoulis, A., Vogiatzi, G., Kampoli, A.M., Tsiamis, E., Tentolouris, C., and Stefanadis, C. (2012). The role of microRNAs in cardiovascular disease. Curr. Med. Chem. 19, 26052610. Parelho, V., Hadjur, S., Spivakov, M., Leleu, M., Sauer, S., Gregson, H.C., Jarmuz, A., Canzonetta, C., Webster, Z., Nesterova, T., et al. (2008). Cohesins functionally associate with CTCF on mammalian chromosome arms. Cell 132, 422433. Park, K.S., Cha, Y., Kim, C.H., Ahn, H.J., Kim, D., Ko, S., Kim, K.H., Chang, M.Y., Ko, J.H., Noh, Y.S., et al. (2013). Transcription elongation factor tcea3 regulates the pluripotent differentiation potential of mouse embryonic stem cells via the lefty1-nodal-smad2 pathway. Stem Cells 31, 282292. ` che, I. (2011). ANRIL, a long, Pasmant, E., Sabbagh, A., Vidaud, M., and Bie noncoding RNA, is an unexpected major hotspot in GWAS. FASEB J. 25, 444448. Philibert, R.A., Bohle, P., Secrest, D., Deaderick, J., Sandhu, H., Crowe, R., and Black, D.W. (2007). The association of the HOPA(12bp) polymorphism with schizophrenia in the NIMH Genetics Initiative for Schizophrenia sample. Am. J. Med. Genet. B. Neuropsychiatr. Genet. 144B, 743747. Phillips, J.E., and Corces, V.G. (2009). CTCF: master weaver of the genome. Cell 137, 11941211. Popov, N., and Gil, J. (2010). Epigenetic regulation of the INK4b-ARF-INK4a locus: in sickness and in health. Epigenetics 5, 685690. 
Rahl, P.B., Lin, C.Y., Seila, A.C., Flynn, R.A., McCuine, S., Burge, C.B., Sharp, P.A., and Young, R.A. (2010). c-Myc regulates transcriptional pause release. Cell 141, 432445. Rando, O.J. (2012). Combinatorial complexity in chromatin structure and function: revisiting the histone code. Curr. Opin. Genet. Dev. 22, 148155. Reyes-Turcu, F.E., and Grewal, S.I. (2012). Different means, same end-heterochromatin formation by RNAi and RNAi-independent RNA processing factors in ssion yeast. Curr. Opin. Genet. Dev. 22, 156163. Rinn, J.L., and Chang, H.Y. (2012). Genome regulation by long noncoding RNAs. Annu. Rev. Biochem. 81, 145166. Risheg, H., Graham, J.M., Jr., Clark, R.D., Rogers, R.C., Opitz, J.M., Moeschler, J.B., Peiffer, A.P., May, M., Joseph, S.M., Jones, J.R., et al. (2007). A recurrent mutation in MED12 leading to R961W causes OpitzKaveggia syndrome. Nat. Genet. 39, 451453. Roeder, R.G. (2005). Transcriptional regulation and the role of diverse coactivators in animal cells. FEBS Lett. 579, 909915. Roger, V.L., Go, A.S., Lloyd-Jones, D.M., Benjamin, E.J., Berry, J.D., Borden, W.B., Bravata, D.M., Dai, S., Ford, E.S., Fox, C.S., et al.; American Heart Association Statistics Committee and Stroke Statistics Subcommittee. (2012). Executive summary: heart disease and stroke statistics2012 update: a report from the American Heart Association. Circulation 125, 188197. Rump, P., Niessen, R.C., Verbruggen, K.T., Brouwer, O.F., de Raad, M., and Hordijk, R. (2011). A novel mutation in MED12 causes FG syndrome (OpitzKaveggia syndrome). Clin. Genet. 79, 183188. Sanda, T., Lawton, L.N., Barrasa, M.I., Fan, Z.P., Kohlhammer, H., Gutierrez, A., Ma, W., Tatarek, J., Ahn, Y., Kelliher, M.A., et al. (2012). Core transcriptional regulatory circuit controlled by the TAL1 complex in human T cell acute lymphoblastic leukemia. Cancer Cell 22, 209221. 
Santen, G.W., Aten, E., Sun, Y., Almomani, R., Gilissen, C., Nielsen, M., Kant, S.G., Snoeck, I.N., Peeters, E.A., Hilhorst-Hofstee, Y., et al. (2012a). Mutations in SWI/SNF chromatin remodeling complex gene ARID1B cause Cofn-Siris syndrome. Nat. Genet. 44, 379380.

Cell 152, March 14, 2013 2013 Elsevier Inc. 1249

Santen, G.W., Kriek, M., and van Attikum, H. (2012b). SWI/SNF complex in disorder: SWItching from malignancies to intellectual disability. Epigenetics 7, 12191224. Schaaf, C.A., Misulovin, Z., Sahota, G., Siddiqui, A.M., Schwartz, Y.B., Kahn, T.G., Pirrotta, V., Gause, M., and Dorsett, D. (2009). Regulation of the Drosophila Enhancer of split and invected-engrailed gene complexes by sister chromatid cohesion proteins. PLoS ONE 4, e6202. Schmidt, D., Schwalie, P.C., Ross-Innes, C.S., Hurtado, A., Brown, G.D., Carroll, J.S., Flicek, P., and Odom, D.T. (2010). A CTCF-independent role for cohesin in tissue-specic transcription. Genome Res. 20, 578588. Schodel, J., Bardella, C., Sciesielski, L.K., Brown, J.M., Pugh, C.W., Buckle, V., Tomlinson, I.P., Ratcliffe, P.J., and Mole, D.R. (2012). Common genetic variants at the 11q13.3 renal cancer susceptibility locus inuence binding of HIF to an enhancer of cyclin D1 expression. Nat. Genet. 44, 420425, S421422. Schwartz, C.E., Tarpey, P.S., Lubs, H.A., Verloes, A., May, M.M., Risheg, H., Friez, M.J., Futreal, P.A., Edkins, S., Teague, J., et al. (2007). The original Lujan syndrome family has a novel missense mutation (p.N1007S) in the MED12 gene. J. Med. Genet. 44, 472477. Seal, R.L., Gordon, S.M., Lush, M.J., Wright, M.W., and Bruford, E.A. (2011). genenames.org: the HGNC resources in 2011. Nucleic Acids Res. 39(Database issue), D514D519. Seitan, V.C., and Merkenschlager, M. (2012). Cohesin and chromatin organisation. Curr. Opin. Genet. Dev. 22, 93100. Seitan, V.C., Hao, B., Tachibana-Konwalski, K., Lavagnolli, T., Mira-Bontenbal, H., Brown, K.E., Teng, G., Carroll, T., Terry, A., Horan, K., et al. (2011). A role for cohesin in T-cell-receptor rearrangement and thymocyte differentiation. Nature 476, 467471. Sekiya, S., and Suzuki, A. (2011). Direct conversion of mouse broblasts to hepatocyte-like cells by dened factors. Nature 475, 390393. 
Sigova, A.A., Mullen, A.C., Moline, B., Gupta, S., Orlando, D.A., Guenther, M.G., Almada, A.A., Lin, C., Burge, C.B., Sharp, P.A., et al. (2013). Divergent transcription of long noncoding RNA/mRNA gene pairs in embryonic stem cells. Proc. Natl. Acad. Sci. USA. 110, 28762881. Sikorski, T.W., and Buratowski, S. (2009). The basal initiation machinery: beyond the general transcription factors. Curr. Opin. Cell Biol. 21, 344351. Slany, R.K. (2009). The molecular biology of mixed lineage leukemia. Haematologica 94, 984993. Small, E.M., and Olson, E.N. (2011). Pervasive roles of microRNAs in cardiovascular biology. Nature 469, 336342. Smith, E., Lin, C., and Shilatifard, A. (2011a). The super elongation complex (SEC) and MLL in development and disease. Genes Dev. 25, 661672. Smith, E.R., Lin, C., Garrett, A.S., Thornton, J., Mohaghegh, N., Hu, D., Jackson, J., Saraf, A., Swanson, S.K., Seidel, C., et al. (2011b). The little elongation complex regulates small nuclear RNA transcription. Mol. Cell 44, 954965. Spaeth, J.M., Kim, N.H., and Boyer, T.G. (2011). Mediator and human disease. Semin. Cell Dev. Biol. 22, 776787. Spitz, F., and Furlong, E.E. (2012). Transcription factors: from enhancer binding to developmental control. Nat. Rev. Genet. 13, 613626. Taatjes, D.J. (2010). The human Mediator complex: a versatile, genome-wide regulator of transcription. Trends Biochem. Sci. 35, 315322. Takahashi, K., and Yamanaka, S. (2006). Induction of pluripotent stem cells from mouse embryonic and adult broblast cultures by dened factors. Cell 126, 663676. Thum, T., Gross, C., Fiedler, J., Fischer, T., Kissler, S., Bussen, M., Galuppo, P., Just, S., Rottbauer, W., Frantz, S., et al. (2008). MicroRNA-21 contributes to myocardial disease by stimulating MAP kinase signalling in broblasts. Nature 456, 980984. Thurman, R.E., Rynes, E., Humbert, R., Vierstra, J., Maurano, M.T., Haugen, E., Shefeld, N.C., Stergachis, A.B., Wang, H., Vernot, B., et al.

(2012). The accessible chromatin landscape of the human genome. Nature 489, 7582. Tsai, H.C., and Baylin, S.B. (2011). Cancer epigenetics: linking basic biology to clinical medicine. Cell Res. 21, 502517. Tsurusaki, Y., Okamoto, N., Ohashi, H., Kosho, T., Imai, Y., Hibi-Ko, Y., Kaname, T., Naritomi, K., Kawame, H., Wakui, K., et al. (2012). Mutations affecting components of the SWI/SNF complex cause Cofn-Siris syndrome. Nat. Genet. 44, 376378. Van Houdt, J.K., Nowakowska, B.A., Sousa, S.B., van Schaik, B.D., Seuntjens, E., Avonce, N., Sifrim, A., Abdul-Rahman, O.A., van den Boogaard, M.J., Bottani, A., et al. (2012). Heterozygous missense mutations in SMARCA2 cause Nicolaides-Baraitser syndrome. Nat. Genet. 44, 445449, S441. van Rooij, E., Sutherland, L.B., Qi, X., Richardson, J.A., Hill, J., and Olson, E.N. (2007). Control of stress-dependent cardiac growth and gene expression by a microRNA. Science 316, 575579. van Rooij, E., Sutherland, L.B., Thatcher, J.E., DiMaio, J.M., Naseem, R.H., Marshall, W.S., Hill, J.A., and Olson, E.N. (2008). Dysregulation of microRNAs after myocardial infarction reveals a role of miR-29 in cardiac brosis. Proc. Natl. Acad. Sci. USA 105, 1302713032. dhof, T.C., and Vierbuchen, T., Ostermeier, A., Pang, Z.P., Kokubu, Y., Su Wernig, M. (2010). Direct conversion of broblasts to functional neurons by dened factors. Nature 463, 10351041. Wang, D., Garcia-Bassets, I., Benner, C., Li, W., Su, X., Zhou, Y., Qiu, J., Liu, W., Kaikkonen, M.U., Ohgi, K.A., et al. (2011). Reprogramming transcription by distinct classes of enhancers functionally dened by eRNA. Nature 474, 390394. Wendt, K.S., Yoshida, K., Itoh, T., Bando, M., Koch, B., Schirghuber, E., Tsutsumi, S., Nagae, G., Ishihara, K., Mishiro, T., et al. (2008). Cohesin mediates transcriptional insulation by CCCTC-binding factor. Nature 451, 796801. Wilson, B.G., and Roberts, C.W. (2011). SWI/SNF nucleosome remodellers and cancer. Nat. Rev. Cancer 11, 481492. Wright, J.E., and Ciosk, R. 
(2013). RNA-based regulation of pluripotency. Trends Genet. 29, 99107. Xie, H., Ye, M., Feng, R., and Graf, T. (2004). Stepwise reprogramming of B cells into macrophages. Cell 117, 663676. Xu, H., Tomaszewski, J.M., and McKay, M.J. (2011). Can corruption of chromosome cohesion create a conduit to cancer? Nat. Rev. Cancer 11, 199210. Yamanaka, S. (2012). Induced pluripotent stem cells: past, present, and future. Cell Stem Cell 10, 678684. Yang, B., Lin, H., Xiao, J., Lu, Y., Luo, X., Li, B., Zhang, Y., Xu, C., Bai, Y., Wang, H., et al. (2007). The muscle-specic microRNA miR-1 regulates cardiac arrhythmogenic potential by targeting GJA1 and KCNJ2. Nat. Med. 13, 486491. Yankulov, K., Blau, J., Purton, T., Roberts, S., and Bentley, D.L. (1994). Transcriptional elongation by RNA polymerase II is stimulated by transactivators. Cell 77, 749759. oz-Cabello, A.M., Raguz, S., Zeng, L., Mujtaba, S., Gil, J., Yap, K.L., Li, S., Mun Walsh, M.J., and Zhou, M.M. (2010). Molecular interplay of the noncoding RNA ANRIL and methylated histone H3 lysine 27 by polycomb CBX7 in transcriptional silencing of INK4a. Mol. Cell 38, 662674. Yeo, J.C., and Ng, H.H. (2013). The transcriptional regulation of pluripotency. Cell Res. 23, 2032. Yew, P.Y., Mushiroda, T., Kiyotani, K., Govindasamy, G.K., Yap, L.F., Teo, S.H., Lim, P.V., Govindaraju, S., Ratnavelu, K., Sam, C.K., et al.; Malaysian NPC Study Group. (2012). Identication of a functional variant in SPLUNC1 associated with nasopharyngeal carcinoma susceptibility among Malaysian Chinese. Mol. Carcinog. 51(Suppl 1), E74E82. Young, R.A. (2011). Control of the embryonic stem cell state. Cell 144, 940954. Zhao, J.Y., Yang, X.Y., Gong, X.H., Gu, Z.Y., Duan, W.Y., Wang, J., Ye, Z.Z., Shen, H.B., Shi, K.H., Hou, J., et al. (2012). Functional variant in methionine

1250 Cell 152, March 14, 2013 2013 Elsevier Inc.

synthase reductase intron-1 signicantly increases the risk of congenital heart disease in the Han Chinese population. Circulation 125, 482490. Zhou, Q., Brown, J., Kanarek, A., Rajagopal, J., and Melton, D.A. (2008). In vivo reprogramming of adult pancreatic exocrine cells to beta-cells. Nature 455, 627632. Zhou, H., Spaeth, J.M., Kim, N.H., Xu, X., Friez, M.J., Schwartz, C.E., and Boyer, T.G. (2012a). MED12 mutations link intellectual disability syndromes with dysregulated GLI3-dependent Sonic Hedgehog signaling. Proc. Natl. Acad. Sci. USA 109, 1976319768.

Zhou, Q., Li, T., and Price, D.H. (2012b). RNA polymerase II elongation control. Annu. Rev. Biochem. 81, 119143. Zhu, J., Adli, M., Zou, J.Y., Verstappen, G., Coyne, M., Zhang, X., Durham, T., Miri, M., Deshpande, V., De Jager, P.L., et al. (2013). Genome-wide chromatin state transitions associated with developmental and environmental cues. Cell 152, 642654.  Zumer, K., Plemenita s, A., Saksela, K., and Peterlin, B.M. (2011). Patient mutation in AIRE disrupts P-TEFb binding and target gene transcription. Nucleic Acids Res. 39, 79087919.

Cell 152, March 14, 2013 2013 Elsevier Inc. 1251

Leading Edge
Review

Dynamic Integration of Splicing within Gene Regulatory Pathways

Ulrich Braunschweig,1,4 Serge Gueroussov,1,2,4 Alex M. Plocik,3,4 Brenton R. Graveley,3,* and Benjamin J. Blencowe1,2,*
1Banting and Best Department of Medical Research, Donnelly Centre
2Department of Molecular Genetics
University of Toronto, Toronto, ON M5S 1A8, Canada
3Department of Genetics and Developmental Biology, Institute for Systems Genomics, University of Connecticut Health Center, 400 Farmington Avenue, Farmington, CT 06030-6403, USA
4These authors contributed equally to this work
*Correspondence: graveley@neuron.uchc.edu (B.R.G.), b.blencowe@utoronto.ca (B.J.B.)
http://dx.doi.org/10.1016/j.cell.2013.02.034

Precursor mRNA splicing is one of the most highly regulated processes in metazoan species. In addition to generating vast repertoires of RNAs and proteins, splicing has a profound impact on other gene regulatory layers, including mRNA transcription, turnover, transport, and translation. Conversely, factors regulating chromatin and transcription complexes impact the splicing process. This extensive crosstalk between gene regulatory layers takes advantage of dynamic spatial, physical, and temporal organizational properties of the cell nucleus, and further emphasizes the importance of developing a multidimensional understanding of splicing control.
Introduction

The splicing of messenger RNA precursors (pre-mRNA) to mature mRNAs is a highly dynamic and flexible process that impacts almost every aspect of eukaryotic cell biology. The formation of active splicing complexes, or spliceosomes, occurs via step-wise assembly pathways on pre-mRNAs. Small nuclear ribonucleoprotein particles (snRNPs), namely U1, U2, U4/U6, and U5 in the case of the major spliceosome and U11, U12, U4atac/U6atac, and U5 in the case of the minor spliceosome, together with an additional 150 proteins, associate with pre-mRNAs, initially through direct recognition of short sequences at the exon/intron boundaries. Key features of spliceosome formation are shown in Figure 1 and have been reviewed in detail elsewhere (Hoskins and Moore, 2012; Wahl et al., 2009).

Spliceosome assembly can be regulated in extraordinarily diverse ways, particularly in metazoans. The major steps involve formation of the commitment complex, followed by the presplicing complex, and culminate with assembly of the active spliceosome. These steps appear to be reversible and are potential points of regulation (Hoskins et al., 2011), and accumulating evidence indicates that formation of the commitment and presplicing complexes may be the steps most often subject to control (Chen and Manley, 2009).

Analysis of human genome architecture emphasizes a major challenge for accurate recognition and regulation of splice sites by the splicing machinery, namely that exons represent only 3% of the human genome (ENCODE Project Consortium, 2012). Accumulating evidence indicates that the high-fidelity process of splice site selection is not simply governed by the interaction of snRNPs and non-snRNP protein factors with pre-mRNA but that factors associated with chromatin and the transcriptional machinery are also important (Luco et al., 2011). Moreover, splicing can reach back to impact chromatin composition

and transcriptional activity, as well as influence parallel or downstream steps in gene expression including 3′-end processing, mRNA turnover, and translation (de Almeida and Carmo-Fonseca, 2012; Moore and Proudfoot, 2009). Therefore, understanding fundamental biological processes such as cell differentiation and development, as well as disease mechanisms, will require knowledge of the crosstalk between splicing and other regulatory layers in cells. A major facet of developing such knowledge is to understand how splicing is physically, spatially, and temporally integrated with other gene expression processes in the cell nucleus. This review focuses on these topics, with an emphasis on knowledge that has been gained from the application of genome-wide strategies, together with focused molecular, biochemical, and cell biological approaches.

Regulation of Splicing at the Level of RNA

Regulatory RNA Sequences

Alternative splicing (AS) is the process by which different pairs of splice sites are selected in a pre-mRNA transcript to produce distinct mRNA and protein isoforms. The importance of understanding AS regulation is underscored by its widespread nature and its numerous defined roles in critical biological processes including cell growth, cell death, pluripotency, cell differentiation, development, circadian rhythms, responses to environmental challenge, pathogen exposure, and disease (Irimia and Blencowe, 2012; Kalsotra and Cooper, 2011). Analysis of data from high-throughput RNA sequencing (RNA-Seq) of organ transcriptomes has indicated that at least 95% of human multi-exon genes produce alternatively spliced transcripts (Pan et al., 2008; Wang et al., 2008) and that the frequency of AS scales with cell type and species complexity (Barbosa-Morais et al., 2012; Nilsen and Graveley, 2010). The main types of AS found in eukaryotes are cassette exon skipping, alternative 5′ and 3′

Figure 1. Cotranscriptional and Posttranscriptional Aspects of Pre-mRNA Splicing
Cotranscriptional spliceosome assembly initiates with the binding of U1 snRNP to the 5′ splice site, which is enhanced by exon-bound SR proteins and, for the first exon, the cap binding complex (CBC). A cross-intron commitment complex is formed upon association of U2 snRNP auxiliary factor (U2AF) with the 3′ splice site and adjacent intronic polypyrimidine tract, and of branch point binding protein (BBP/SF1) with the branch site. Bridging interactions between these factors across internal exons, referred to as exon definition, occur within the commitment complex. Transition from a commitment complex to a presplicing complex entails communication between 5′ and 3′ splice sites, and the addition of U2 snRNP to the branch site along with numerous additional proteins (not shown). Subsequent association of U4/U6/U5 tri-snRNP, together with still more protein factors, and dynamic remodeling of RNA-protein, protein-protein, and RNA-RNA interactions, ultimately leads to formation of the catalytically active spliceosome. The two trans-esterification steps of splicing yield the excised intron in the form of the characteristic branched lariat structure and the ligated exons that form mature mRNA. The assembly of most splicing factors and splicing of constitutive introns is thought to occur cotranscriptionally, whereas splicing of regulated alternative introns often occurs posttranscriptionally. In the example shown, exon 4 is a regulated alternative exon controlled by an hnRNP protein, which prevents the splicing factors bound to flanking splice sites from engaging in productive interactions and therefore promotes exon skipping. At terminal exons (exon 5), the splicing factors bound to the upstream 3′ splice site and the exon interact with components of the cleavage and polyadenylation machinery (CPSF and CstF are shown; see also Figure 4A). The association of the splicing factors with the pre-mRNA is enhanced throughout the transcription process by interactions with the C-terminal domain of RNA polymerase II. The EJC is recruited upstream of splice junctions upon splicing. The EJC and SR proteins mutually stabilize one another to generate the mature mRNP, which is then exported to the cytoplasm.

splice site selection, alternative retained introns, and mutually exclusive exons. The vast majority of AS events have not been functionally characterized on any level, and this represents a major challenge for biological research. However, large-scale studies of splice variants employing a mix of computational and experimental approaches have provided evidence for widespread roles of regulated alternative exons in the control of protein interaction networks, and in cell signaling (Buljan et al., 2012; Ellis et al., 2012; Weatheritt and Gibson, 2012).

The selection of correct pairs of 5′ and 3′ splice sites in pre-mRNA is governed in part by cis-acting RNA sequences that collectively comprise the splicing code (Wang and Burge, 2008). The code utilizes a surprisingly minimal set of highly conserved features; these are the intronic dinucleotides GU and AG (with variations used by the minor spliceosome) at the 5′ and 3′ splice sites, respectively, and the intronic adenosine residue that forms the branched lariat structure. Additional nucleotides surrounding these positions display sequence preferences that reflect requirements for base-pairing interactions with the snRNA components of snRNPs during spliceosome formation (Wahl et al., 2009). Although these minimal core elements delineate sites of splicing, they lack sufficient information to discriminate correct from incorrect splice sites and to regulate AS. Combinations of additional sequence elements referred to as exonic/intronic splicing enhancers (E/ISEs) and silencers (E/ISSs) serve to promote and repress splice site selection, respectively. They operate both in achieving fidelity of splice site selection and in the regulation of this process (Wang and Burge, 2008). The majority of the code elements comprise short and degenerate linear motifs, although interesting examples of structured RNA elements have been discovered that function in splice site selection (Graveley, 2005; McManus and Graveley, 2011). The major contribution of linear motifs to splicing regulation is reflected by the ability of increasingly sophisticated computer algorithms to predict splicing outcomes from genomic sequence alone (Barash et al., 2010; Zhang et al., 2010). The emerging picture, supported by site-directed mutagenesis of cis elements, is that

splice site selection involves the concerted action of multiple enhancer and silencer elements that are concentrated in regions proximal (typically within 300 nts) to splice sites (Barash et al., 2010). In particular, enhancers that support constitutive exon splicing are typically concentrated in exons, whereas enhancers and silencers that function in the regulation of AS can be located in alternative exons, although they are most often concentrated in the immediate flanking intronic regions (Barash et al., 2010). Additionally, silencer elements are enriched in sequences surrounding cryptic splice sites, that is, sequences that resemble splice sites but are not functional splice sites (Wang and Burge, 2008).

Regulatory Proteins

Two major classes of widely expressed trans-acting factors that control splice site recognition are the SR proteins and heterogeneous nuclear ribonucleoproteins (hnRNPs) (Long and Caceres, 2009; Martinez-Contreras et al., 2007). Depending on their binding location and the surrounding sequence context, members of each class can promote or repress splice site selection through associating with enhancers or silencers, respectively. For example, members of the SR family of proteins contain one or two RNA recognition motifs that bind ESEs and are thought to promote splicing by facilitating exon-spanning interactions that occur between splice sites (referred to as exon definition) and also by forging interactions with core spliceosomal proteins (Figure 1). In addition to widely expressed trans-acting factors, several tissue-specific RNA-binding splicing regulators have been characterized (Irimia and Blencowe, 2012; Licatalosi and Darnell, 2010). These include the neural-specific factors Nova, PTBP2/nPTB/brPTB, and nSR100/SRRM4, and factors such as RBFOX, MBNL, CELF, TIA, and STAR family proteins that are differentially expressed between a variety of cell and tissue types.
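As a rough illustration of why the core splice-site dinucleotides cannot by themselves specify where splicing occurs, the sketch below counts GT and AG dinucleotides (the DNA equivalents of the intronic GU and AG) in a randomly generated sequence. The sequence length, the random seed, and the function name are assumptions made purely for illustration, and a uniformly random sequence is of course not a real gene.

```python
import random


def count_dinucleotide(seq, dinuc):
    """Count occurrences of a dinucleotide in seq (overlapping positions allowed)."""
    return sum(1 for i in range(len(seq) - 1) if seq[i:i + 2] == dinuc)


# Hypothetical input: 10,000 nt of uniformly random DNA (not a real gene).
random.seed(0)
sequence = "".join(random.choices("ACGT", k=10_000))

n_gt = count_dinucleotide(sequence, "GT")  # candidate 5' splice-site dinucleotides
n_ag = count_dinucleotide(sequence, "AG")  # candidate 3' splice-site dinucleotides

# By chance alone, each dinucleotide is expected about once every 16 positions.
expected = (len(sequence) - 1) / 16
print(f"GT: {n_gt}, AG: {n_ag}, expected about {expected:.0f} each")
```

Under these assumptions each dinucleotide appears hundreds of times in 10 kb, which is in line with the point that enhancer and silencer elements are needed to distinguish authentic splice sites from cryptic ones.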
Through the use of splicing-sensitive microarrays and RNA-Seq to detect exons affected by the knockout or knockdown of these factors, in combination with splicing code predictions and in vivo crosslinking coupled to immunoprecipitation and sequencing (HITS-CLIP or CLIP-Seq), maps of several of these proteins have been generated that correlate their binding location (i.e., within alternative exons and/or the flanking introns) with functions in promoting exon inclusion or skipping (Licatalosi and Darnell, 2010; Witten and Ule, 2011). As mentioned earlier, where studied, these proteins appear to act primarily at the earliest stages of spliceosome formation to control splice site selection.

Integration of Splicing with Chromatin and Transcription

Despite major progress in the characterization of factors that control splicing at the level of RNA, the impact of linked steps in gene regulation and of nuclear organization on the splicing process is less well understood. The fact that synthetic pre-mRNAs can be efficiently spliced in nuclear extracts demonstrates that splicing can be uncoupled from other nuclear processes in vitro. However, mounting evidence indicates that splicing, transcription, and chromatin modification are highly integrated in the cell. Thus, key to understanding the role of chromatin and transcription in the control of splicing is knowing which aspects of the splicing process occur co- or posttranscriptionally.

Some of the first mechanistic insights into the cotranscriptional nature of splicing came from chromatin immunoprecipitation studies in yeast. These experiments revealed that splicing factors fail to associate with intronless genes but are recruited to intron-containing genes concomitant with the transcription of the splice sites they recognize (Görnemann et al., 2005; Lacadie and Rosbash, 2005). The main exceptions were genes containing short last exons, in which case U1 snRNP was recruited cotranscriptionally, but U2 snRNP was recruited posttranscriptionally (Tardiff et al., 2006). Similar approaches have been used in human cells with similar results (Listerman et al., 2006). These data paint a general picture in which the splicing machinery is typically recruited to pre-mRNA in a cotranscriptional manner.

Although splicing factors are cotranscriptionally recruited, it does not necessarily follow that the splicing reaction itself occurs cotranscriptionally. Recently, Vargas et al. used in situ hybridization methods with single-molecule resolution and found that constitutively spliced introns, which typically are efficiently spliced, were removed cotranscriptionally (Vargas et al., 2011). However, mutations that decreased the splicing efficiency, for instance by sequestering splicing signals in RNA secondary structures, caused introns to be posttranscriptionally spliced. More interestingly, two alternatively spliced introns examined were found to be posttranscriptionally spliced. This study suggested that introns could be either cotranscriptionally or posttranscriptionally spliced, in part depending on the strength and type of surrounding cis-regulatory elements.

The extent to which specific classes of splicing events occur co- or posttranscriptionally has since been examined on a genome-wide level. Several groups have analyzed RNA-Seq data generated from total cellular RNA, total nuclear RNA, nucleoplasmic RNA, or chromatin-associated RNA (Ameur et al., 2011; Bhatt et al., 2012; Khodor et al., 2012; Khodor et al., 2011; Tilgner et al., 2012).
Each group used a different method to assess the extent of cotranscriptional splicing. Though the precise frequency differed in each study, most introns appeared to be cotranscriptionally spliced. The likelihood of cotranscriptional splicing increases with increased distance of introns from the 3′ ends of genes (Khodor et al., 2012). Strikingly, the set of posttranscriptionally spliced introns is strongly enriched for alternatively spliced introns. Moreover, it was observed that most human transcripts are cleaved and polyadenylated before splicing of all introns is complete, yet these transcripts remain associated with the chromatin until splicing is finished (Bhatt et al., 2012).

Because most splicing events (constitutive and alternative) occur cotranscriptionally, an important goal is to determine the extent to which chromatin and transcription factors impact them. Understanding such links necessitates considering the possible contribution of each step in transcription, through initiation, elongation, and termination, and therefore also how transcription is impacted by different chromatin states.

Promoter-Directed Control of Splicing

Pioneering studies performed in the late 1990s employing transfected minigene reporter experiments demonstrated that the type of promoter used to drive transcription by RNA polymerase II (Pol II) can impact the level of AS of a downstream exon (Cramer et al., 1997). Two nonexclusive models were

proposed to explain this effect (Figure 2). In the recruitment model, a change in promoter architecture results in the recruitment of one or more splicing factors to the transcription machinery that in turn impact splicing of the nascent RNA. In the kinetic model, the change in promoter architecture affects the elongation rate of Pol II, such that there is more or less time for splice sites or other splicing signals flanking the alternative exon to be recognized by trans-acting factors (Kornblihtt, 2007). For example, if these splice sites are weak (i.e., they deviate from consensus splice site sequences associated with efficient recognition by the splicing machinery), rapid elongation will expose distal, stronger splice sites such that exon skipping occurs, as productive splicing complexes will associate with the stronger splice sites first. If elongation is slow, there is increased time for splicing factors to bind to the weak sites in the nascent RNA and promote exon inclusion. Alternatively, reduced Pol II elongation kinetics can also favor the recognition of splicing silencer elements surrounding an alternative exon, resulting in increased exon skipping.

Although the mechanistic basis of promoter-dependent effects on AS has been investigated using model splicing reporters (see below), it is unclear to what extent and under what conditions natural switching of promoters may function in the regulation of downstream AS events in vivo. The analysis of large collections of full-length transcript sequences has revealed weak correlations between the use of alternative transcript start sites and the splicing of downstream cassette exons (Chern et al., 2008), although it was not determined whether such correlations may reflect tissue-dependent effects that independently result in the increased complexity of transcription start site usage, and the increased complexity of AS.
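The first kinetic scenario described above can be caricatured as a race between a weak upstream splice site and a stronger downstream one. The sketch below is a toy calculation and not a model taken from the cited studies: it assumes that commitment at each site is a simple first-order process, that the weak site has a head start equal to the time Pol II needs to transcribe the intervening sequence, and that once both sites are available they compete in proportion to their rates. The function name and all parameter values are hypothetical.

```python
import math


def inclusion_probability(distance_nt, elongation_nt_per_s, k_weak, k_strong):
    """Toy estimate of alternative exon inclusion under a kinetic race.

    distance_nt: nucleotides between the weak and the strong site (hypothetical)
    elongation_nt_per_s: Pol II elongation rate
    k_weak, k_strong: assumed first-order commitment rates (per second)
    """
    # Head start: time during which only the weak site exists in the nascent RNA.
    window_s = distance_nt / elongation_nt_per_s
    # Chance that the weak site is committed before the strong site is transcribed.
    p_head_start = 1.0 - math.exp(-k_weak * window_s)
    # If not, the two sites compete in proportion to their commitment rates.
    p_compete = k_weak / (k_weak + k_strong)
    return p_head_start + (1.0 - p_head_start) * p_compete


# Hypothetical parameters: 2,000 nt between sites, weak site 10-fold slower.
for rate in (10, 30, 60):  # elongation rates in nt/s
    p = inclusion_probability(2000, rate, k_weak=0.005, k_strong=0.05)
    print(f"elongation {rate} nt/s -> inclusion probability {p:.2f}")
```

In this setup, slower elongation lengthens the head start and raises the inclusion probability, whereas very fast elongation leaves only the direct competition term. The sketch does not attempt to capture the alternative scenario in which slow elongation favors silencer recognition and exon skipping.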
With the accumulation of data sets from the modENCODE/ENCODE projects and other studies that have yielded parallel genome-wide surveys of multiple aspects of gene regulation, including transcription factor occupancy, epigenetic modifications, long-range chromatin interactions, and transcriptome profiles, it should in principle be possible to obtain higher resolution predictions of causative promoter-dependent effects on splicing and other RNA processing steps.

Despite our incomplete understanding of promoter-dependent effects on RNA processing in vivo, evidence from numerous model systems indicates that the strength and composition of a promoter can impact splicing outcomes. For example, the recruitment of the multifunctional proteins PSF/p54nrb by promoter-bound activators stimulates splicing of first introns (Rosonina et al., 2005). Activation of hormone receptors by cognate ligands has been linked to specific splicing outcomes (Auboeuf et al., 2002), and the association of PGC-1, a transcriptional coactivator that plays a major role in the regulation of adaptive thermogenesis, alters splicing activity when it is bound to a gene (Monsalve et al., 2000). Interestingly, PGC-1 contains an RS domain that may function to recruit splicing factors to PGC-1-activated promoters. In the above and additional examples, the type of promoter-bound activator may influence splicing outcomes, in part by altering the composition and/or the processivity of Pol II (David and Manley, 2011). Understanding such effects therefore entails knowledge of factors that bridge activators and Pol II, and of components of Pol II

that in turn transmit information to the nascent RNA to impact splicing.

A recent study suggests that the Mediator complex may be involved in integrating and relaying information to direct splicing decisions (Huang et al., 2012). Mediator is a large multisubunit complex that functions as a general factor at the interface between promoter-bound transcriptional activators and Pol II (Malik and Roeder, 2010). In addition to its general role, locus-specific functions have been ascribed to Mediator, where changes in its composition can lead to differential outcomes in transcription, and possibly RNA processing. Huang and colleagues showed that the MED23 subunit of Mediator physically interacts with several splicing and polyadenylation factors, most notably hnRNP L (Huang et al., 2012). Indeed, MED23 was required for regulating the AS of a subset of hnRNP L targets. It will be of interest to determine how and to what extent Mediator relays information to impact the splicing machinery on hnRNP L-regulated targets, and whether it acts similarly to regulate RNA processing through other RNA-binding proteins.

The RNA Polymerase II CTD in Splicing Control

The C-terminal domain (CTD) of Pol II's largest subunit impacts different stages of mRNA biogenesis, including addition of a protective cap structure on the 5′-end, splicing, and formation of the mature 3′-end. The CTD consists of a repeating heptad amino acid sequence with the consensus Y1S2P3T4S5P6S7, and is predicted to be unstructured in isolation of other factors (Hsin and Manley, 2012). The CTD can be posttranslationally modified by phosphorylation on each of the residues Y1, S2, T4, S5, and S7, and these changes play important and distinct roles in transcription and RNA processing (Hsin and Manley, 2012). Initial evidence for a role of the CTD in RNA processing came from experiments employing expression of an alpha-amanitin-resistant mutant of Pol II that harbors a truncated CTD.
Truncation to five repeats led to defects in capping, splicing, and 3′-end processing of model pre-mRNA reporters (McCracken et al., 1997b; McCracken et al., 1997a), and the CTD was later found to affect AS outcomes (de la Mata and Kornblihtt, 2006; Rosonina and Blencowe, 2004). The CTD promotes capping and 3′-end formation through direct interactions with sets of factors dedicated to these processes, and increasing evidence indicates that it also serves as a platform to recruit splicing factors that may participate in commitment complex formation and the regulation of AS (David and Manley, 2011; Hsin and Manley, 2012).

Affinity chromatography identified splicing and dual splicing/transcription-associated factors as CTD-binding proteins. These include yeast Prp40, human TCERG1/CA150, p54nrb/PSF proteins, SR proteins, and U2AF (Hsin and Manley, 2012). Recent work supports an RNA-dependent interaction of U2AF with the phosphorylated CTD to stimulate splicing in vitro through an association with the core spliceosomal factor PRP19C (David et al., 2011). Taken together with previous work showing that a phosphorylated CTD polypeptide can stimulate splicing in vitro (Hirose et al., 1999) and that the CTD is more active in promoting splicing of a substrate that has the capacity to form exon-definition interactions compared to a substrate that cannot (Zeng and Berget, 2000), it is interesting

Figure 2. Models for Chromatin and Transcription Elongation-Mediated Modulation of Alternative Splicing
(Top Left) Promoter recruitment model. Different promoters differentially recruit splicing factors to the transcription complex. At promoters that fail to recruit a key splicing factor (shown as an SR protein), the regulated alternative exon (exon 2) will be skipped, whereas genes containing promoters that recruit the splicing factor will include exon 2. (Top Right) Promoter-directed kinetic model. Different promoters assemble transcription complexes capable of different transcription elongation rates. At promoters that assemble fast transcription elongation complexes, the regulated alternative exon (exon 2) will be skipped, whereas genes containing promoters that assemble slow elongation complexes will include exon 2. This model requires that the alternative exon contains weak 3′ and/or 5′ splice sites in order to be skipped when the gene is rapidly transcribed. (Bottom Left) Chromatin-mediated recruitment model. The splicing of an alternative exon can be regulated by the chromatin-mediated recruitment of a splicing repressor. In cells that skip the exon, an adaptor protein associates with the nucleosome assembled at the alternative exon, which in turn recruits a splicing repressor. In cells that include the alternative exon, the adaptor protein and/or repressor are not expressed, or the nucleosome at the regulated alternative exon is (legend continued below)


to consider that the CTD might function as a platform to facilitate exon definition and commitment complex formation (Figures 1 and 2). In this manner, the CTD may also serve to tether exons separated by great intronic distances to promote cotranscriptional splicing (Dye et al., 2006). It will be important to determine in future work whether the CTD plays such roles in vivo.

RNA Polymerase II Elongation and the Control of Alternative Splicing

Numerous studies employing model experimental systems designed to alter the rate of Pol II elongation have provided evidence supporting the aforementioned kinetic model (Kornblihtt, 2007; Luco et al., 2011). More recent work has applied genome-wide approaches to understand the extent and functional relevance of this mode of regulation. In one study, UV-induced DNA damage was found to result in a hyperphosphorylated form of the CTD and reduced Pol II elongation kinetics, and these changes were proposed to cause changes in AS of genes that function in cell cycle control and apoptosis (Muñoz et al., 2009). Another study globally monitored AS changes following treatment of cells with camptothecin and 5,6-dichloro-1-β-D-ribofuranosyl-benzimidazole (DRB), which act through different mechanisms to inhibit Pol II elongation (Ip et al., 2011). Concentrations of these drugs that partially inhibit Pol II elongation preferentially affected AS and transcript levels of genes encoding RNA splicing factors and other RNA-binding proteins (RBPs). Many of the induced AS changes introduced premature termination codons (PTCs) that elicited nonsense-mediated mRNA decay (NMD; see below), which further contributed to reductions in transcript levels. These results suggest that conditions globally impacting elongation rates can lead to the AS-mediated downregulation of RNA processing factors, such that the levels of these factors are calibrated with the overall RNA processing needs of the cell.
This type of Pol II-coupled AS network appears to be highly conserved, because amino acid starvation, which causes reduced elongation and/or increased Pol II pausing, was also found to affect the AS of transcripts from splicing factor genes, including several that can elicit NMD, in C. elegans (Ip et al., 2011).

Chromatin Structure Distinguishes Exons from Introns

Although recognition of splice sites fundamentally has to occur through direct interactions with pre-mRNA, chromatin features can shape decisions about splice site usage and exon selection. The basic unit of chromatin structure is the nucleosome, which comprises 147 base pairs of DNA wrapped around a histone octamer consisting of two copies each of histones H2A, H2B, H3, and H4 (Luger et al., 1997). Chromatin function can be regulated by substituting canonical histones with nonallelic variants and through posttranslational modification of histone tail residues, most notably by methylation and acetylation (Kouzarides, 2007; Talbert and Henikoff, 2010). These histone marks and direct modifications of DNA, including the addition of

5-methylcytosine, 5-hydroxymethylcytosine, and other derivatives (Wu and Zhang, 2011), affect the functional state of chromatin by altering its compaction and by modulating the binding of effector proteins. It is well established that these features have a nonuniform distribution along genes, with unique signatures marking promoters and gene bodies in a transcription-dependent manner (Smolle and Workman, 2013). More recently, it has become apparent that these chromatin features are also differentially distributed with respect to exon-intron boundaries, and that this differential marking participates in exon recognition.

Analysis of data sets from chromatin immunoprecipitation high-throughput sequencing (ChIP-seq), and from micrococcal nuclease digestion followed by sequencing, revealed that nucleosomes in a range of organisms display increased occupancy over exons relative to neighboring intronic sequence (Andersson et al., 2009; Chodavarapu et al., 2010; Schwartz et al., 2009; Spies et al., 2009; Tilgner et al., 2009; Wilhelm et al., 2011). Suggesting a possible role in facilitating splicing, exons that have weak splice sites and that are surrounded by relatively long introns have greater levels of nucleosome occupancy than do exons with strong splice sites or that are flanked by short introns (Spies et al., 2009; Tilgner et al., 2009). To assess whether exon-enriched nucleosomes might be compositionally, and therefore functionally, distinct, a number of studies examined global distributions of specific histone modifications with respect to exon-intron boundaries (Andersson et al., 2009; Dhami et al., 2010; Hon et al., 2009; Huff et al., 2010; Kolasinska-Zwierz et al., 2009; Schwartz et al., 2009; Spies et al., 2009). Some of these studies reached different conclusions as to which modifications show enrichment over exons and to what extent such enrichment is a consequence of increased nucleosome occupancy.
Nevertheless, trimethylation of lysine 36 on histone H3 (H3K36me3) was shown in multiple studies to be enriched over exons above background nucleosome levels (Andersson et al., 2009; Huff et al., 2010; Spies et al., 2009). Exon-enriched nucleosomes may also differ in their histone variant composition. The H2A variant, H2A.Bbd, which is associated with active, intron-containing genes, is enriched in positioned nucleosomes flanking both 5′ and 3′ splice sites (Tolstorukov et al., 2012). Such specific histone marks or variants could therefore play a widespread role in splicing (see below).

Base pair composition affects physical properties of the DNA and is not uniform across the genome. Exons are in general associated with higher GC content, which is an important feature governing nucleosome occupancy (Tillo and Hughes, 2009). A recent study found differences in relative GC content between exons and introns that may have evolved to contribute to splicing (Amit et al., 2012). In a reconstructed ancestral state, genes contained exons with a low GC content that were flanked by short introns of an even lower GC content. These subsequently diverged to yield two different types of gene architectures in animal species. In one architectural state, genes retained low

(Figure 2 legend, continued) not modified and therefore cannot recruit the repressor. Similar to this model, a nucleosome-associated adaptor protein may also function to recruit a splicing activator, as proposed for Psip1/Ledgf (Pradeepa et al., 2012). (Bottom Right) Chromatin-mediated kinetic model. The splicing of an alternative exon can be regulated by a chromatin-mediated change in the rate of transcription elongation. Unmodified nucleosomes can be transcribed rapidly, resulting in skipping of the regulated alternative exon. In cells where the nucleosome assembled on exon 2 has an H3K9me3 mark, CBX3 interacts with the modified nucleosome, slows down the transcription elongation complex, and enhances splicing of the regulated alternative exon.


exonic GC content with lower GC content in introns but experienced an increase in intron length. In the other state, genes retained short intron length but saw an overall increase in GC content that eliminated differential exon-intron composition (Amit et al., 2012). Bioinformatic and experimental evidence supports a role for differential GC content in promoting exon recognition in the context of the first type of architecture (Amit et al., 2012). However, to what extent differential GC content between exons and introns influences exon recognition through possible mechanisms associated with (modified) nucleosome deposition is unclear.

Studies employing genome-wide bisulphite sequencing have suggested a role for modified cytosines at exonic CpG dinucleotides in exon recognition and the regulation of AS. Modified CpG dinucleotides are enriched within exons relative to introns in both plants and animals (Chodavarapu et al., 2010; Feng et al., 2010; Laurent et al., 2010), with characteristic patterns at the 5′ and 3′ splice sites (Laurent et al., 2010). Moreover, widespread differences in CpG methylation have been detected between worker and queen bee genomes, and intriguingly, some of these differential methylation patterns appear to correlate with differential AS (Lyko et al., 2010). Highlighting a possible role of DNA epigenetic marks in mediating tissue-specific differences, in mammalian neuronal tissues hydroxymethylation rather than methylation was found to have significant exonic enrichment (Khare et al., 2012). The possible mechanisms by which such modifications affect splicing await future work.

Chromatin-Dependent Recruitment of the Splicing Machinery

Analogous to roles of promoter architecture and the Pol II CTD, accumulating evidence suggests that chromatin structure throughout a gene facilitates splicing factor recruitment to nascent transcripts.
It has been proposed that splicing factors interact with chromatin directly, or indirectly through intermediate adaptor proteins (Figure 2). H3K4me3, which marks the promoters of actively transcribed genes, binds specifically to CHD1, a protein that associates with U2 snRNP. Indeed, this interaction was shown to increase splicing efficiency (Sims et al., 2007). Similarly, H3K36me3, which is enriched over exons, was recently reported to interact with a short splice isoform of Psip1/Ledgf, which in turn associates with several splicing factors including the SR protein SRSF1 (Pradeepa et al., 2012). Supporting a possible role as a recruitment adaptor, knockdown of Psip1 led to a change in SRSF1 localization and affected AS. The aforementioned H2A.Bbd histone variant appears to function in splicing through the recruitment of splicing components (Tolstorukov et al., 2012). Mass spectrometry data revealed that H2A.Bbd interacts with numerous components of the spliceosome, and depletion of this histone variant led to the widespread disruption of constitutive and alternative splicing. Another recent study suggests that recruitment of splicing components by chromatin may be effected through global changes in histone hyperacetylation or changes in the levels of the heterochromatin-associated protein HP1α (Schor et al., 2012). These alterations result in the global redistribution of numerous splicing factors from chromatin to nuclear speckle domains, which are thought to predominantly represent sites

of splicing factor storage (Schor et al., 2012) (see below). Collectively, these studies point to characteristic patterns of chromatin structure associated with active gene expression that may have a widespread impact on the nuclear localization of the splicing machinery, which in turn can impact splicing of nascent transcripts. Chromatin structure can be altered in highly specific ways within genes, for example, in response to environmental and developmental cues. Such local changes are thought to also impact AS of proximal exons on nascent RNA through the action of adaptor proteins that bridge chromatin marks and splicing factors. The first example of this type of proposed mechanism involves the mutually exclusive exons IIIb and IIIc in the FGFR2 gene. Switching from exon IIIb to exon IIIc alters the ligand affinity of this receptor and represents an important step in the epithelial to mesenchymal transition. In mesenchymal cells, the region encompassing these exons is characterized by elevated levels of H3K36me3 and low levels of H3K4me3 and H3K27me3 (Luco et al., 2010). H3K36me3 modifications favor the binding of MRG15, which promotes the recruitment of the splicing regulator PTBP1 to nascent RNA, and as a consequence represses the use of exon IIIb in these cells (Luco et al., 2010). Consistent with a more widespread role for an MRG15-adaptor mechanism to control AS, significantly overlapping subsets of cassette exons were affected by individual knockdown of MRG15 and PTBP1 (Luco et al., 2010). However, the affected exons generally displayed modest changes in inclusion level and were found to be surrounded by relatively weak PTBP1-binding sites, suggesting that this adaptor mechanism may be more important for augmenting or stabilizing patterns of AS achieved by direct action of RNA-based regulators, rather than acting to promote pronounced cell-type-dependent, switch-like regulation of AS.
Chromatin Structure Affects Splicing by Influencing Pol II Elongation

Specific features of chromatin structure, as well as chromatin-associated regulators, can influence splice site choice by impacting transcription elongation (Figure 2). SWI/SNF chromatin remodelling factors interact directly with Pol II (Neish et al., 1998; Wilson et al., 1996), and with splicing factors (Batsché et al., 2006), suggesting that these factors might impact splicing in an elongation-dependent manner. Supporting this view, the association of the ATP-dependent SWI/SNF-type chromatin remodelling factor BRM with the human CD44 gene coincides with a change in inclusion levels of alternative exons in CD44 transcripts (Batsché et al., 2006). Increased occupancy of Pol II with elevated S5 phosphorylation of the CTD (which is associated with a paused form of Pol II) was detected specifically over CD44 alternative exons, indicating that a reduced elongation rate or increased pausing of Pol II might be responsible for the change in AS. The BRM ATPase activity required for chromatin remodeling was, however, not required for the change in AS (Batsché et al., 2006). Recent studies analyzing BRM in Drosophila suggest that it acts together with other members of the SWI/SNF complex to regulate AS and polyadenylation in a locus-specific manner (Waldholm et al., 2011; Zraly and Dingwall, 2012). Developmentally regulated intron retention of the Eig71Eh pre-mRNA

Figure 3. Reverse-Coupling Mechanisms


(A) Splicing enhances transcription-associated histone modification. Splicing of the first intron enhances transcription initiation and stabilizes promoter-associated marks, including H3K4me3 and H3K9ac, near the 5′ splice site of exon 1. Splicing may also facilitate a transition between the elongation-associated marks H3K79me2 and H3K36me3 at the 3′ splice site of the first intron. Internal exons are particularly enriched for H3K36me3-modified nucleosomes, due in part to splicing-increased nucleosome occupancy and action of the histone methyltransferase SETD2 associated with elongating Pol II. These marks may also serve to reinforce splicing patterns of nascent pre-mRNA. (B) The SR protein SRSF2/SC35, which regulates splicing of alternative exons, also enhances transcription elongation by recruiting P-TEFb. P-TEFb phosphorylates the Pol II CTD at Serine 2, which enhances the rate of transcription elongation. (C) The Hu family of splicing regulators bind to AU-rich sequences within introns and repress the splicing of regulated alternative exons. Shown here, HuR interacts with and represses the activity of the histone deacetylase, HDAC2, which stabilizes nearby acetylated nucleosomes. Acetylated nucleosomes may enhance the rate of transcription elongation, and consequently, promote the skipping of exons with weak splice sites.

required the SNR1/SNF5 subunit, which suppresses BRM ATPase, and reduced elongation was correlated with more efficient intron splicing (Zraly and Dingwall, 2012). Covalent modifications of histones impinge on Pol II elongation in ways that impact AS (Figure 2). The heterochromatin

protein HP1γ/CBX3, which binds di- and trimethylated histone H3K9 (Bannister et al., 2001; Lachner et al., 2001), mediates inclusion of alternative exons in CD44 transcripts in human cells upon stimulation of the PKC pathway, concomitantly with an increase in Pol II occupancy over the alternatively spliced region (Saint-André et al., 2011). However, CBX3 may also play a more direct role in splicing factor recruitment. Depletion of CBX3 in human cells resulted in the accumulation of unspliced transcripts and loss of recruitment of the U1 snRNP-70 kDa (SNRNP70) protein and other splicing factors to active chromatin (Smallwood et al., 2012). Intriguingly, components of the RNAi machinery in association with CBX3 were recently shown to also regulate AS of CD44 transcripts. Specifically, the Argonaute proteins AGO1 and AGO2 were found by ChIP-seq analysis to bind the alternative exon-containing region of CD44 and were loaded onto this region by short RNAs derived from CD44 antisense transcripts (Ameyar-Zazoua et al., 2012). Recruitment of AGO1 and AGO2 to CD44 required Dicer and CBX3 and resulted in increased histone H3K9 methylation over the variant exons. Recruitment of AGO proteins to the CD44 gene thus appears to locally induce a chromatin state that affects Pol II elongation and AS. RNA-binding proteins bound to nascent RNA may also alter chromatin composition in ways that impact elongation and splicing (Figure 3). Hu-family proteins, which have well-defined roles in the control of mRNA stability, were recently shown to regulate AS by binding to nascent RNA proximal to alternative exons in a manner that induced local histone hyperacetylation and increased Pol II elongation (Mukherjee et al., 2011; Zhou et al., 2011). This activity was linked to the direct inhibition of histone deacetylase 2 (HDAC2) by Hu proteins (Zhou et al., 2011). RNA Pol II elongation rates are also impacted by nucleotide sequence composition.
A/T-rich sequences, in particular, are more difficult for Pol II to transcribe. A novel complex found to be associated with human mRNPs, termed DBIRD, facilitates Pol II elongation across A/T-rich sequences (Close et al., 2012). Depletion of this complex resulted in reduced Pol II elongation and changes in the splicing of exons proximal to A/T-rich sequences. It was therefore proposed that DBIRD acts at the interface of RNA Pol II and mRNP complexes to control AS (Close et al., 2012). Finally, the zinc finger DNA-binding transcription factor and chromatin organizer CTCF has been linked to the regulation of AS of exon 5 of the receptor-linked protein tyrosine phosphatase CD45, and of other transcripts, by locally affecting Pol II elongation (Shukla et al., 2011). Variable inclusion of CD45 exon 5 is controlled by RNA-binding proteins during peripheral lymphocyte maturation (Motta-Mena et al., 2010). Intriguingly, CTCF appears to maintain the inclusion of exon 5 at the terminal stages of lymphocyte development by causing Pol II pausing proximal to this exon (Shukla et al., 2011). CTCF binding is inhibited by CpG methylation. Accordingly, increased methylation proximal to CD45 exon 5 led to reduced CTCF occupancy and reduced exon inclusion (Shukla et al., 2011). Analysis of AS changes genome-wide using RNA-Seq following depletion of CTCF further revealed that this factor is likely to have a more widespread role in regulating AS through altering Pol II

elongation kinetics. However, CTCF is known to mediate intrachromosomal interactions (Ohlsson et al., 2010), and it therefore remains to be determined whether the changes in AS caused by CTCF reflect a direct inhibition of Pol II elongation, or whether these effects are a consequence of more complex topological changes to chromatin architecture. In the examples described above and others (Luco et al., 2011), changes in AS can be achieved through a variety of mechanisms that perturb Pol II elongation in a widespread or locus-specific manner. In other cases, AS is affected through mechanisms involving the differential recruitment of splicing factors to transcription or chromatin components. It is currently unclear to what extent these mechanisms are distinct or overlap, as the recruitment of splicing factors to a transcript in some cases appears to affect elongation kinetics, and in other cases altered elongation kinetics may affect the recruitment of splicing components to chromatin or transcription factors associated with nascent transcripts. For example, as summarized earlier, regulation of variable exon inclusion in CD44 transcripts appears to involve the concerted action of chromatin remodeling, inhibition of Pol II elongation, and the recruitment of splicing factors and the RNAi machinery. Individual genes may therefore possess a unique set of mechanistic principles that are governed by the specific combinatorial interplay between cis elements of the splicing code and genomic features, which together determine the formation and activity of chromatin features and transcription complexes. The increased use of comparative analyses of parallel data sets interrogating transcriptomic, genomic, and chromatin features should nevertheless facilitate a more detailed mechanistic understanding of common principles by which chromatin, transcription, and splicing are coupled to coordinate the regulation of subsets of genes.
Regulation of Chromatin and Transcription by the Splicing Machinery

In addition to the extensive set of interactions and mechanisms by which chromatin and transcription components can impact splicing, increasing evidence indicates that splicing can have a major impact on chromatin organization and transcriptional output. Early indications of this reverse-coupling were that the efficient expression of transgene constructs required the presence of an intron (Brinster et al., 1988). Such effects were later shown to arise in part as a consequence of enhanced transcription (Furger et al., 2002). Subsequent studies have demonstrated several mechanisms by which the splicing of nascent transcripts can impact chromatin organization and transcription. For example, H3K4me3 and H3K9ac, both of which are associated with active genes and widely assumed to peak in proximity to promoters together with increased Pol II occupancy, are in fact concentrated over first exon-intron boundaries (Bieberstein et al., 2012) (Figure 3A). In genes with long first exons, these marks are reduced at promoters, whereas in genes with short first exons, the marks are increased at promoters as are transcription levels. Confirming a role for first intron splicing in establishing promoter-proximal architecture, intron deletion reduced H3K4me3 levels and transcriptional output (Bieberstein et al., 2012). Taken together with previous observations of associations between U1 and Pol II (Damgaard et al., 2008), and between U2 snRNP and H3K4me3 (Sims et al., 2007), a picture

emerges in which first intron splicing serves to establish or perhaps reinforce promoter-proximal marks, which in turn recruit general transcription factors and Pol II to enhance initiation. The enrichment of H3K36me3 at exons, which is established by the methyltransferase SETD2 as it travels with elongating Pol II, also arises in part as a consequence of splicing (Figure 3A). Global inhibition of splicing (via depletion of specific spliceosome components and/or exposure to the inhibitor spliceostatin) decreased H3K36me3 levels at particular exons, but also broadly altered its distribution within gene bodies (de Almeida et al., 2011; Kim et al., 2011). To what degree these effects are direct remains unclear, as global inhibition of splicing would also be expected to perturb transcription, for example, by affecting the expression and/or deposition of transcription and chromatin factors (Bieberstein et al., 2012). Nonetheless, a direct role also seems likely. For example, the observation that reciprocal H3K79me2 and H3K36me3 histone marks transition at first intronic 3′ splice site-first internal exon boundaries, but not at the corresponding boundaries of pseudoexons (Huff et al., 2010; ENCODE Project Consortium, 2012), suggests more direct roles of splicing-dependent transitions in chromatin modifications (Figure 3A). Moreover, mass spectrometry data further suggest that SETD2 may associate with exon definition complexes (Schneider et al., 2010). Splicing also impacts Pol II pausing and elongation. An association between snRNPs and the Pol II elongation factor TAT-SF1 can stimulate elongation in vitro, and this activity was further enhanced by the presence of splicing signals in RNA (Fong and Zhou, 2001). Because TAT-SF1 interacts with the positive elongation factor P-TEFb, which phosphorylates the S2 residues of the CTD to increase Pol II processivity, it was proposed that the assembly of splicing complexes on nascent RNA may facilitate Pol II elongation across a gene (Fong and Zhou, 2001).
Additional studies have reported roles for splicing factors in elongation. Because this topic has been reviewed elsewhere (Pandit et al., 2008), only a few examples will be highlighted here. Of particular interest are SR and SR-like proteins, which have long-established roles in splicing. The S. cerevisiae SR-like protein Npl3, for example, regulates the splicing of a subset of introns (Chen et al., 2010; Kress et al., 2008), but it also facilitates elongation by acting as an antitermination factor (Dermody et al., 2008). Specific mutations in Npl3 lead to defects in the transcription elongation and termination of 30% of genes (Dermody et al., 2008). Npl3 binds the S2 phosphorylated CTD (Lei et al., 2001), bringing it into close proximity to nascent RNA. Phosphorylation of Npl3 was found to negatively regulate its binding to the CTD and RNA, suggesting that unphosphorylated Npl3 specifically promotes elongation in association with Pol II (Dermody et al., 2008). Depletion of the SR family protein SRSF2/SC35 increases Pol II pausing, most likely as a consequence of defective recruitment of P-TEFb and reduced S2 CTD phosphorylation (Lin et al., 2008) (Figure 3B). It is interesting to consider that Npl3, SRSF2, and possibly other RNA-binding proteins, may also facilitate elongation in part by preventing the formation of DNA-RNA hybrids (or R-loops) formed by nascent RNA during transcription (Pandit et al., 2008). Finally, it is also conceivable that SR proteins bound to nascent RNA indirectly promote CTD phosphorylation and/or histone modifications that facilitate

(A) Coupling connections between splicing and 3′-end formation, RNA stability, and mRNA export. Splicing and 3′-end formation are coupled by interactions between exon-bound SR proteins and the cleavage and polyadenylation factor CFIm, and between U2AF and both CFIm and PAP. Cryptic upstream polyadenylation sites (PAS) are suppressed by U1 snRNP (left). Splicing impacts RNA stability by interactions between SR proteins and the EJC, which in turn interacts with the UPF proteins involved in NMD (middle). Splicing influences mRNA export through the splicing-dependent recruitment of the TREX complex, which in turn interacts with the RNA export factor TAP. (B) Multitasking roles of RBPs in splicing and alternative polyadenylation, RNA export and RNA transport. Top: the Nova RNA-binding proteins have been shown to regulate not only alternative splicing but also alternative polyadenylation (pA). Both of these processes are modulated in a position-dependent manner with some binding locations promoting splicing and polyadenylation and other locations repressing these processes. The result of this regulation is the generation of mRNAs with different exons and 3′ UTR sequences. Bottom: Similarly, Mbnl RNA-binding proteins impact alternative splicing in a position-dependent manner and bind to 3′ UTRs, where they function to control subcellular mRNA localization.

Figure 4. Splicing Impacts the Regulation of Multiple Downstream Steps in Gene Regulation

transcription. In this regard, it was recently shown that Npl3 associates in an RNA-independent manner with Bre1, a ubiquitin ligase with specificity for H2B (Moehle et al., 2012) that facilitates transcription elongation in vitro (Pavri et al., 2006). The studies summarized above emphasize important roles for nascent RNA splicing and the factors that control splicing in establishing chromatin architecture and in controlling transcription. It is interesting to consider, therefore, that a major determinant of gene-specific chromatin architecture emanates from information provided by cis-acting elements comprising the splicing code. The previously described case of the Hu family of hnRNP proteins is illustrative of a mechanism through which proteins bound to nascent RNA can reach back to alter proximal chromatin and affect Pol II elongation (Zhou et al., 2011) (Figure 3C). Notably, this mode of regulation also mediates highly local changes in chromatin structure that in turn regulate the AS of nearby exons. A more systematic investigation of the roles of splicing components in establishing region-specific chromatin modifications and functions will be

important for understanding the crosstalk between chromatin and splicing.

Integration of Splicing with 3′-End Processing, Turnover, and Transport

Coupling and Coordination of Splicing with 3′-End Formation

Numerous studies have demonstrated communication between factors involved in the splicing of 3′-terminal introns and factors involved in 3′-end cleavage and polyadenylation (CPA), and this topic has been reviewed in detail elsewhere (Di Giammartino et al., 2011; Proudfoot, 2011). Similar to the formation of exon-definition complexes, it has been proposed that U2AF binding to the 3′ splice site of a terminal exon forms interactions with Cleavage Factor I and the CTD of poly(A) polymerase to mutually stimulate terminal intron splicing and CPA (Millevoi et al., 2002; Millevoi et al., 2006) (Figure 4A). SR proteins have also been implicated in terminal exon crosstalk (Dettwiler et al., 2004; McCracken et al., 2002). In certain cases, competition between

binding of CPA factors and splicing factors can result in physiologically important changes in AS and transcript levels (Evsyukova et al., 2013) (see below). In addition to their roles in the control of large networks of alternative exons, splicing regulators such as Nova and hnRNP H1 function in the regulation of alternative polyadenylation (APA) through direct binding to recognition sites clustered around the CPA signals (Katz et al., 2010; Licatalosi et al., 2008) (Figure 4B). Although these moonlighting roles in APA regulation appear to be largely independent of the splicing of proximal exons/introns, regulation of AS and APA by the same RBPs presumably is important for globally coordinating these processes in a cell type- or condition-dependent manner. For example, transcript profiling studies have shown that APA is widespread, affecting at least 50% of transcripts from human genes (Tian et al., 2005) and that it plays an important role in controlling the presence of miRNA and RNA-binding protein target sites in UTR sequences, and therefore mRNA expression levels (Mayr and Bartel, 2009; Sandberg et al., 2008). Control of APA and AS by an overlapping set of RBP regulators may therefore constitute an effective mechanism for functionally coordinating these steps in RNA processing. In an analogous manner, U1 snRNP also has dual roles in splicing and CPA. U1 snRNP is more abundant than other spliceosomal snRNPs, and this observation hinted that it may have additional functions in the nucleus. Indeed, recent studies have shown that, through binding to cryptic 5′ splice sites within pre-mRNAs, U1 snRNP can inhibit premature 3′-end formation at potential CPA sites that are distributed along pre-mRNAs (Berg et al., 2012) (Figure 4A).
In situations where U1 snRNP becomes limiting, for example during bursts of pre-mRNA transcription upon activation of neurons or immune cells, where the ratio of cryptic and bona fide 5′ splice sites may be in excess of available U1 snRNP, premature CPA sites are activated, leading to transcript shortening (Berg et al., 2012). Furthermore, reduced U1 snRNP to pre-mRNA ratios resulted in changes in terminal exon usage, consistent with the mutual stimulation between the splicing and CPA machineries in terminal exon definition. The discovery of a role for U1 snRNP in suppressing CPA has provided further insight into the mechanism by which certain mutations in 3′ UTRs cause disease. For example, a mutation in the 3′ UTR of the p14/ROBLD3 receptor gene that is causally linked to immunodeficiency creates a 5′ splice site that does not activate splicing but suppresses CPA, leading to reduced p14/ROBLD3 expression (Langemeier et al., 2012).

Splicing Modulates RNA Stability and Transport

The NMD pathway acts to prevent spurious expression of incompletely processed or mutant transcripts (Rebbapragada and Lykke-Andersen, 2009). Although the NMD pathway appears to be present in some form in all eukaryotes, there are nonetheless species-specific differences, particularly in the way PTCs are recognized and in the nature of the degradation pathways involved. In mammalian cells, PTC recognition relies to a large extent on deposition of the exon junction complex (EJC) 20–24 nt upstream of exon-exon junctions. The EJC encompasses a stable tetrameric core consisting of eIF4AIII, MAGOH, MLN51, and Y14 proteins, which is deposited on mRNA during splicing (Tange et al., 2005). This core associates

with a host of SR and SR-related proteins to form megadalton-size complexes that presumably function in mRNP compaction as well as in facilitating coupling of splicing with downstream steps in gene expression (Singh et al., 2012) (Figures 1 and 4A). During the pioneer round of translation, EJCs are displaced by the ribosome (Isken et al., 2008). However, when the ribosome encounters a PTC more than 50–55 nt upstream of a terminal exon-exon junction, EJC components associate with upstream frame shift (UPF) proteins (Figure 4A) that trigger release of the ribosome through interaction with release factors (eRFs). These and other interactions ultimately lead to mRNA decay through pathways that involve 5′-end decapping, deadenylation, and exoribonucleolytic enzymes (Schoenberg and Maquat, 2012). Alternative splicing coupled to NMD controls the levels of specific subsets of genes. It has been estimated that approximately 10%–20% of AS events that have the potential to introduce PTCs lead to substantial changes in overall total steady-state transcript levels (Pan et al., 2006). In many cases, these AS-coupled NMD events serve to auto- and cross-regulate expression levels of regulatory and core factors involved in splicing and other aspects of RNA metabolism (Cuccurese et al., 2005; Lareau et al., 2007b; Mitrovich and Anderson, 2000; Ni et al., 2007; Plocik and Guthrie, 2012; Saltzman et al., 2008), but important roles in the regulation of other classes of proteins have also been reported (Barash et al., 2010; Lareau et al., 2007a). It is important for a cell to prevent incompletely or aberrantly processed transcripts from being translated, as such transcripts may express truncated proteins with aberrant or dominant negative functions that have harmful consequences. One safeguarding mechanism is to prevent release of such transcripts from the nucleus.
The TREX (transcription/export) complex is a conserved multiprotein complex that links transcription elongation with nuclear mRNA export (Katahira et al., 2009). Although S. cerevisiae TREX is recruited to intronless transcripts (Strässer et al., 2002), its mammalian counterpart is incorporated into maturing mRNPs by the splicing machinery (Masuda et al., 2005) and further requires binding of the 5′ cap by the TREX component Aly (Cheng et al., 2006). TREX then mediates association with the TAP nuclear export receptor to facilitate mRNA export through the nuclear pore complex (Stutz et al., 2000; Zhou et al., 2000) (Figure 4A). Natural intronless genes can circumvent the necessity for splicing to recruit TREX through sequence elements that directly mediate TREX- and TAP-dependent export (Lei et al., 2011). However, transcripts from some intron-containing yeast genes, for example the gene encoding the nuclear export factor SUS1, require introns for efficient nuclear mRNA export (Cuenca-Bono et al., 2011) (see below). Regulated intron retention has been harnessed to play important regulatory roles in the control of transcript levels. For example, coordinated regulation of a set of alternative retained introns controls the expression of the neuron-specific genes Stx1b, Vamp2, Sv2a, and Kif5a. The splicing regulator Ptbp1, which is expressed widely in nonneural cells, represses splicing of these introns, such that the unspliced transcripts are retained in the nucleus where they are degraded by the exosome (Yap et al., 2012). Inhibition of Ptbp1 expression by miR-124 in

Figure 5. Organization of the Splicing Components in the Cell Nucleus


Major nuclear domains enriched in splicing and other factors in the mammalian cell nucleus are depicted with known and putative roles indicated. Gray areas indicate nucleoli.

neural cells results in splicing of these introns, allowing export and translation of the resulting mature mRNAs. With the wealth of available transcriptome profiling data, it can be expected that many additional examples of regulated intron removal linked to functions such as mRNA turnover and transport will soon emerge. Although the EJC appears to be seldom required for NMD in Drosophila, it is important for the localization of developmentally important transcripts. Localization of oskar mRNA to the posterior pole of the oocyte requires the deposition of the EJC core components together with an exon-exon junction-spanning localization element formed by splicing of the first intron (Ghosh et al., 2012). Changes in alternative splicing, particularly in UTR regions, have been observed to differentially regulate mRNA localization in mammalian cells (La Via et al., 2013; Terenzi and Ladd, 2010) and likely represent a more widely used mode of regulation than currently appreciated. Similar to previously mentioned examples in which specific RBPs have roles in both AS and APA, specific RBPs that function in AS regulation can also function in mRNA localization. Transcriptome profiling of cells and tissues deficient in MBNL1 and MBNL2, coupled with analysis of the in vivo target sites of these proteins, has revealed that they regulate large networks of alternative exons involved in differentiation and development (Charizanis et al., 2012; Wang et al., 2012) (Figure 4B). A transcriptomic and proteomic analysis of subcellular compartments further uncovered a widespread role for MBNL proteins in the regulation of transcript localization, translation, and protein secretion (Wang et al., 2012). These studies underscore the importance of integrative analyses that capture information from multiple aspects of mRNA processing and expression when analyzing the functions of individual RBPs.
In particular, it is becoming increasingly evident that most if not all RBPs in the cell multitask, and the extent to which the multiple regulatory functions of RBPs arise through physical (i.e., direct) coupling between processes, as opposed to independently operating functions, will be important to determine.

Dynamic Nuclear Organization in Splicing Control

The majority of the mechanisms described thus far in this review invoke the formation and disruption of protein-protein and protein-RNA interactions in splicing control. However, of critical importance to any one of these mechanisms in vivo is the local availability of active splicing components relative to the requirements for these factors presented by cognate cis-acting elements in nascent RNA. Regulation of the availability of splicing components provides a potentially powerful means by which constitutive and AS events may be controlled. The highly compartmentalized nature of the cell nucleus, which contains several different types of nonmembranous substructures, or bodies, that concentrate RNA processing factors, provides such a regulatory architecture. Among the domains that concentrate splicing and other RNA processing factors are interchromatin granule clusters or speckles, paraspeckles, Cajal Bodies (CBs) and nuclear stress bodies (Figure 5) (Biamonti and Vourc'h, 2010; Machyna et al., 2013; Nakagawa and Hirose, 2012; Spector and Lamond, 2011). Mammalian cell nuclei typically contain 20–50 speckle structures that concentrate snRNP and non-snRNP splicing factors, including numerous SR family and SR-like proteins (Spector and Lamond, 2011). Experiments employing transcriptional inhibitors and inducible gene loci revealed that splicing factors can shuttle between speckles and nearby sites of nascent RNA transcription, and additional studies have shown that this shuttling behavior can be controlled by specific kinases and phosphatases that alter the posttranslational modification status of SR proteins and other splicing factors. These and other observations led to the proposal that speckles primarily represent storage sites for splicing factors (Spector and Lamond, 2011).
However, more recent studies using antibodies that specifically recognize the phosphorylated U2 snRNP protein SF3b155 (P-SF3b155), which is found only in catalytically activated or active spliceosomes, paint a more complex picture (Girard et al., 2012). Immunolocalization using an anti-P-SF3b155 antibody showed spliceosomes localized to regions of decompacted chromatin at the periphery of, or within, nuclear speckles (Girard et al., 2012). Inhibition of transcription and splicing after SF3b155 phosphorylation further revealed that posttranscriptional splicing occurs in nuclear speckles. These results are consistent with results from earlier studies employing simultaneous fluorescence in situ hybridization detection of unspliced and spliced transcripts, which suggested that the introns of specific transcripts are spliced within speckles (Lawrence et al., 1993).

Paraspeckles are structures that form at the periphery of speckle domains and have been observed widely across mammalian cells and tissues (Fox and Lamond, 2010; Nakagawa and Hirose, 2012). They have been implicated in the regulation of gene expression by mediating the nuclear retention of adenosine-to-inosine (A-to-I) edited transcripts (Fox and Lamond, 2010). However, the recent discovery that these structures concentrate on the order of 40 multifunctional RNA-binding proteins suggests as yet undiscovered roles in other aspects of RNA processing (Naganuma et al., 2012).

Mammalian nuclei typically contain several Cajal bodies, and these domains are thought to represent primary sites of spliceosomal and nonspliceosomal snRNP biogenesis, maturation, and recycling (Machyna et al., 2013). The formation and size of CBs relate to the transcriptional and metabolic activity of cells, and these structures are prominent in rapidly proliferating cells. Because the in vivo concentration of basal spliceosomal components, including snRNPs, can impact specific subsets of AS events (Park et al., 2004), in particular those that are predicted to regulate levels of RNA processing factors (Saltzman et al., 2011), it is interesting to consider that processes that control the formation and activity of CBs could indirectly control AS of multiple genes to globally coordinate levels of RNA processing factors according to the metabolic requirements of the cell.

Cell 152, March 14, 2013 ©2013 Elsevier Inc. 1263
Analogous to this proposed role for CBs, nuclear stress bodies are structures that form specifically in response to a variety of stress conditions including heat shock, oxidative stress, or exposure to toxic materials (Biamonti and Vourc'h, 2010). These structures are thought to mediate global changes in gene expression, in part by sequestering splicing factors (Biamonti and Vourc'h, 2010).

An important facet of understanding the role of nuclear domains in the control of splicing and other steps in gene regulation is to determine how they are formed. Much in the way nucleoli form around tandem repeats of rRNA genes, formation of nuclear domains with connections to the splicing process may be nucleated by, or depend on for integrity, specific DNA or RNA sequences, including long (intergenic) noncoding RNAs (lnc/lincRNAs). CBs have been detected at U1 and U2 snRNA gene loci (Smith et al., 1995), although they may assemble via the association of multiple different protein and nucleic acid components (Machyna et al., 2013), and stress body formation is dependent on transcriptionally active, pericentric tandem repeats of satellite III sequences bound by heat shock transcription factor 1 (HSF1) (Biamonti and Vourc'h, 2010). Speckle domains concentrate MALAT1, a nuclear lncRNA that appears to participate in controlling the phosphorylation state of SR proteins (Tripathi et al., 2010). Depletion of human MALAT1 was also reported to alter the nuclear distribution of SRSF1 and to lead to changes in SRSF1-dependent AS events (Tripathi et al., 2010), although a more recent study did not observe such effects (Zhang et al., 2012). Moreover, recent studies employing Malat1 knockout mice did not reveal an essential role for this lncRNA under normal laboratory conditions (Eißmann et al., 2012; Nakagawa et al., 2012), whereas another study reported that it is important for metastasis-associated properties of lung cancer cells (Gutschner et al., 2013). NEAT1, another lncRNA, is an integral structural component of paraspeckles (Clemson et al., 2009; Naganuma et al., 2012). A change in the alternative 3′-end processing of NEAT1 lncRNA by hnRNP K affects the formation of these domains (Naganuma et al., 2012). Very recently, a class of sno-lncRNAs transcribed from a genomic region linked to Prader-Willi syndrome was shown to sequester the RBFOX2 splicing regulator and to modulate AS (Yin et al., 2012). As additional ncRNAs are identified and characterized, it can be expected that many other examples of ncRNA-based control of splicing factor availability and functional activity will be discovered.

In addition to the aforementioned roles for DNA and RNA, it has recently emerged that the prevalence of low complexity or disordered protein regions in splicing and other RNA processing factors may play an important role in the formation and regulation of the activity of nuclear domains. Homotypic and heterotypic interactions involving these domains and RNA have been shown to form hydrogel-like structures, and it is intriguing to consider that such structures act as malleable interfaces or matrices with which to dynamically control (i.e., by differential phosphorylation or other posttranslational modifications) the accessibility, assembly, and activity of splicing and other highly integrated regulatory complexes in the cell nucleus (Han et al., 2012; Kato et al., 2012).
Conclusions and Future Perspectives

During the past several years, remarkable strides have been made in our understanding of how splicing is dynamically integrated with other layers of gene regulation and within the context of subnuclear structure and organization. Advancements in high-throughput technologies and computational approaches, together with focused biochemical, molecular, and cell biological methods, have powered the discovery and characterization of the global principles by which splicing forms a nexus of extensive crosstalk between gene expression processes. This crosstalk temporally coordinates and enhances, and in some cases represses, the kinetics of physically coupled steps in RNA metabolism, but it also serves to coordinately regulate different steps in the transcription, processing, export, stability, and translation of mRNA.

Of key importance in future studies will be to determine the specific conditions and mechanisms by which chromatin- and transcription-associated components control splicing outcomes, and vice versa. Current models often propose networks of physical interactions between these processes. However, it is unclear to what extent regulatory mechanisms may rely on increased local concentrations of factors (i.e., through associations with chromatin and/or other nuclear domains) that provide kinetic advantages, which in turn promote coupled effects. Regardless of the specific mechanisms by which crosstalk impacts splicing and coupled processes, it is exciting to consider that entirely new functional connections await discovery. For example, the role of splicing in the deposition of specific chromatin marks such as H3K36me3 could impact additional chromatin mark-regulated functions, such as DNA replication, repair, and methylation (Wagner and Carpenter, 2012). The plethora of poorly characterized histone lysine methylation readers such as the tudor, chromodomain, PWWP, and other royal family domain-containing proteins are candidates for mediating possible new splicing-dependent regulation involving chromatin marks and their binding to reader proteins (Yap and Zhou, 2010).

Another important area of future investigation is to establish the extent to which nucleic-acid-binding proteins multitask to coordinate different aspects of biology. Although this review focuses on a few examples of multitasking RBPs, it is telling that almost every recent study employing in vivo mapping of binding sites of splicing regulators or other RBPs has uncovered previously unknown, additional functions of these proteins. Moreover, other in vivo crosslinking studies using polyadenylated RNA as bait to comprehensively identify RBPs point to a much more extensive multitasking world in which transcription factors and proteins associated with other diverse cellular functions, including metabolism, may have unsuspected functions in association with RNA (Baltz et al., 2012; Castello et al., 2012). In this regard, it should be noted that among the largest groups of uncharacterized nucleic-acid-binding factors are C2H2 and other zinc-finger domain proteins, defined examples of which can regulate gene expression through binding RNA.
An increasing number of examples of pivotal roles for switch-like AS events is providing a perspective in which a relatively small number of regulated exons can act to rewire entire programs of gene regulation by modifying core domains of proteins that dictate the activities of regulators of chromatin, transcription, and other steps in gene regulation (Irimia and Blencowe, 2012). Numerous other AS events remodel protein interaction and signaling networks that are important for establishing cell type-specific functions (Babu et al., 2011; Ellis et al., 2012; Weatheritt and Gibson, 2012). Such AS events are often found in disordered domains of proteins that are subject to phosphorylation and other types of posttranslational modifications. Interestingly, these domains are often found in splicing factors and other nuclear gene expression regulators, with the RS-repeat domains of SR proteins and the CTD of Pol II representing notable examples. A very important area of future investigation will be to understand how these and other protein domains contribute to the assembly and disassembly of higher-order nuclear structures that function to organize and possibly catalyze splicing and other nuclear reactions (Han et al., 2012; Kato et al., 2012). Also central to this understanding will be to discover and characterize ncRNAs that participate in the dynamic integration of splicing with other nuclear processes.
ACKNOWLEDGMENTS

We thank members of the Graveley and Blencowe laboratories for helpful discussions. B.R.G. acknowledges support from NIH grants R01 GM067842, R01 GM095296, U54 HG007005, and U54 HG006994. B.J.B. acknowledges funding from the Canadian Institutes of Health Research, Canadian Cancer Society, Natural Sciences and Engineering Research Council of Canada (NSERC), and the Ontario Research Fund. U.B. was supported by European Molecular Biology Organization and Human Frontier Science Program Fellowships, S.G. was supported by an NSERC Studentship, and A.P. was supported by an NRSA Fellowship.

REFERENCES

Ameur, A., Zaghlool, A., Halvardson, J., Wetterbom, A., Gyllensten, U., Cavelier, L., and Feuk, L. (2011). Total RNA sequencing reveals nascent transcription and widespread co-transcriptional splicing in the human brain. Nat. Struct. Mol. Biol. 18, 1435–1440.
Ameyar-Zazoua, M., Rachez, C., Souidi, M., Robin, P., Fritsch, L., Young, R., Morozova, N., Fenouil, R., Descostes, N., Andrau, J.-C., et al. (2012). Argonaute proteins couple chromatin silencing to alternative splicing. Nat. Struct. Mol. Biol. 19, 998–1004.
Amit, M., Donyo, M., Hollander, D., Goren, A., Kim, E., Gelfman, S., Lev-Maor, G., Burstein, D., Schwartz, S., Postolsky, B., et al. (2012). Differential GC content between exons and introns establishes distinct strategies of splice-site recognition. Cell Rep 1, 543–556.
Andersson, R., Enroth, S., Rada-Iglesias, A., Wadelius, C., and Komorowski, J. (2009). Nucleosomes are well positioned in exons and carry characteristic histone modifications. Genome Res. 19, 1732–1741.
Auboeuf, D., Hönig, A., Berget, S.M., and O'Malley, B.W. (2002). Coordinate regulation of transcription and splicing by steroid receptor coregulators. Science 298, 416–419.
Babu, M.M., van der Lee, R., de Groot, N.S., and Gsponer, J. (2011). Intrinsically disordered proteins: regulation and disease. Curr. Opin. Struct. Biol. 21, 432–440.
Baltz, A.G., Munschauer, M., Schwanhäusser, B., Vasile, A., Murakawa, Y., Schueler, M., Youngs, N., Penfold-Brown, D., Drew, K., Milek, M., et al. (2012). The mRNA-bound proteome and its global occupancy profile on protein-coding transcripts. Mol. Cell 46, 674–690.
Bannister, A.J., Zegerman, P., Partridge, J.F., Miska, E.A., Thomas, J.O., Allshire, R.C., and Kouzarides, T. (2001). Selective recognition of methylated lysine 9 on histone H3 by the HP1 chromo domain. Nature 410, 120–124.
Barash, Y., Calarco, J.A., Gao, W., Pan, Q., Wang, X., Shai, O., Blencowe, B.J., and Frey, B.J. (2010). Deciphering the splicing code. Nature 465, 53–59.
Barbosa-Morais, N.L., Irimia, M., Pan, Q., Xiong, H.Y., Gueroussov, S., Lee, L.J., Slobodeniuc, V., Kutter, C., Watt, S., Colak, R., et al. (2012). The evolutionary landscape of alternative splicing in vertebrate species. Science 338, 1587–1593.
Batsché, E., Yaniv, M., and Muchardt, C. (2006). The human SWI/SNF subunit Brm is a regulator of alternative splicing. Nat. Struct. Mol. Biol. 13, 22–29.
Berg, M.G., Singh, L.N., Younis, I., Liu, Q., Pinto, A.M., Kaida, D., Zhang, Z., Cho, S., Sherrill-Mix, S., Wan, L., and Dreyfuss, G. (2012). U1 snRNP determines mRNA length and regulates isoform expression. Cell 150, 53–64.
Bhatt, D.M., Pandya-Jones, A., Tong, A.-J., Barozzi, I., Lissner, M.M., Natoli, G., Black, D.L., and Smale, S.T. (2012). Transcript dynamics of proinflammatory genes revealed by sequence analysis of subcellular RNA fractions. Cell 150, 279–290.
Biamonti, G., and Vourc'h, C. (2010). Nuclear stress bodies. Cold Spring Harb. Perspect. Biol. 2, a000695.
Bieberstein, N.I., Carrillo Oesterreich, F., Straube, K., and Neugebauer, K.M. (2012). First exon length controls active chromatin signatures and transcription. Cell Rep 2, 62–68.
Brinster, R.L., Allen, J.M., Behringer, R.R., Gelinas, R.E., and Palmiter, R.D. (1988). Introns increase transcriptional efficiency in transgenic mice. Proc. Natl. Acad. Sci. USA 85, 836–840.
Buljan, M., Chalancon, G., Eustermann, S., Wagner, G.P., Fuxreiter, M., Bateman, A., and Babu, M.M. (2012). Tissue-specific splicing of disordered segments that embed binding motifs rewires protein interaction networks. Mol. Cell 46, 871–883.


Castello, A., Fischer, B., Eichelbaum, K., Horos, R., Beckmann, B.M., Strein, C., Davey, N.E., Humphreys, D.T., Preiss, T., Steinmetz, L.M., et al. (2012). Insights into RNA biology from an atlas of mammalian mRNA-binding proteins. Cell 149, 1393–1406.
Charizanis, K., Lee, K.Y., Batra, R., Goodwin, M., Zhang, C., Yuan, Y., Shiue, L., Cline, M., Scotti, M.M., Xia, G., et al. (2012). Muscleblind-like 2-mediated alternative splicing in the developing brain and dysregulation in myotonic dystrophy. Neuron 75, 437–450.
Chen, M., and Manley, J.L. (2009). Mechanisms of alternative splicing regulation: insights from molecular and genomics approaches. Nat. Rev. Mol. Cell Biol. 10, 741–754.
Chen, Y.C., Milliman, E.J., Goulet, I., Côté, J., Jackson, C.A., Vollbracht, J.A., and Yu, M.C. (2010). Protein arginine methylation facilitates cotranscriptional recruitment of pre-mRNA splicing factors. Mol. Cell. Biol. 30, 5245–5256.
Cheng, H., Dufu, K., Lee, C.-S., Hsu, J.L., Dias, A., and Reed, R. (2006). Human mRNA export machinery recruited to the 5′ end of mRNA. Cell 127, 1389–1400.
Chern, T.-M., Paul, N., van Nimwegen, E., and Zavolan, M. (2008). Computational analysis of full-length cDNAs reveals frequent coupling between transcriptional and splicing programs. DNA Res. 15, 63–72.
Chodavarapu, R.K., Feng, S., Bernatavichute, Y.V., Chen, P.-Y., Stroud, H., Yu, Y., Hetzel, J.A., Kuo, F., Kim, J., Cokus, S.J., et al. (2010). Relationship between nucleosome positioning and DNA methylation. Nature 466, 388–392.
Clemson, C.M., Hutchinson, J.N., Sara, S.A., Ensminger, A.W., Fox, A.H., Chess, A., and Lawrence, J.B. (2009). An architectural role for a nuclear noncoding RNA: NEAT1 RNA is essential for the structure of paraspeckles. Mol. Cell 33, 717–726.
Close, P., East, P., Dirac-Svejstrup, A.B., Hartmann, H., Heron, M., Maslen, S., Chariot, A., Söding, J., Skehel, M., and Svejstrup, J.Q. (2012). DBIRD complex integrates alternative mRNA splicing with RNA polymerase II transcript elongation. Nature 484, 386–389.
Cramer, P., Pesce, C.G., Baralle, F.E., and Kornblihtt, A.R. (1997). Functional association between promoter structure and transcript alternative splicing. Proc. Natl. Acad. Sci. USA 94, 11456–11460.
Cuccurese, M., Russo, G., Russo, A., and Pietropaolo, C. (2005). Alternative splicing and nonsense-mediated mRNA decay regulate mammalian ribosomal gene expression. Nucleic Acids Res. 33, 5965–5977.
Cuenca-Bono, B., García-Molinero, V., Pascual-García, P., Dopazo, H., Llopis, A., Vilardell, J., and Rodríguez-Navarro, S. (2011). SUS1 introns are required for efficient mRNA nuclear export in yeast. Nucleic Acids Res. 39, 8599–8611.
Damgaard, C.K., Kahns, S., Lykke-Andersen, S., Nielsen, A.L., Jensen, T.H., and Kjems, J. (2008). A 5′ splice site enhances the recruitment of basal transcription initiation factors in vivo. Mol. Cell 29, 271–278.
David, C.J., and Manley, J.L. (2011). The RNA polymerase C-terminal domain: a new role in spliceosome assembly. Transcription 2, 221–225.
David, C.J., Boyne, A.R., Millhouse, S.R., and Manley, J.L. (2011). The RNA polymerase II C-terminal domain promotes splicing activation through recruitment of a U2AF65-Prp19 complex. Genes Dev. 25, 972–983.
de Almeida, S.F., and Carmo-Fonseca, M. (2012). Design principles of interconnections between chromatin and pre-mRNA splicing. Trends Biochem. Sci. 37, 248–253.
de Almeida, S.F., Grosso, A.R., Koch, F., Fenouil, R., Carvalho, S., Andrade, J., Levezinho, H., Gut, M., Eick, D., Gut, I., et al. (2011). Splicing enhances recruitment of methyltransferase HYPB/Setd2 and methylation of histone H3 Lys36. Nat. Struct. Mol. Biol. 18, 977–983.
de la Mata, M., and Kornblihtt, A.R. (2006). RNA polymerase II C-terminal domain mediates regulation of alternative splicing by SRp20. Nat. Struct. Mol. Biol. 13, 973–980.
Dermody, J.L., Dreyfuss, J.M., Villén, J., Ogundipe, B., Gygi, S.P., Park, P.J., Ponticelli, A.S., Moore, C.L., Buratowski, S., and Bucheli, M.E. (2008). Unphosphorylated SR-like protein Npl3 stimulates RNA polymerase II elongation. PLoS ONE 3, e3273.
Dettwiler, S., Aringhieri, C., Cardinale, S., Keller, W., and Barabino, S.M. (2004). Distinct sequence motifs within the 68-kDa subunit of cleavage factor Im mediate RNA binding, protein-protein interactions, and subcellular localization. J. Biol. Chem. 279, 35788–35797.
Dhami, P., Saffrey, P., Bruce, A.W., Dillon, S.C., Chiang, K., Bonhoure, N., Koch, C.M., Bye, J., James, K., Foad, N.S., et al. (2010). Complex exon-intron marking by histone modifications is not determined solely by nucleosome distribution. PLoS ONE 5, e12339.
Di Giammartino, D.C., Nishida, K., and Manley, J.L. (2011). Mechanisms and consequences of alternative polyadenylation. Mol. Cell 43, 853–866.
Dye, M.J., Gromak, N., and Proudfoot, N.J. (2006). Exon tethering in transcription by RNA polymerase II. Mol. Cell 21, 849–859.
Eißmann, M., Gutschner, T., Hämmerle, M., Günther, S., Caudron-Herger, M., Groß, M., Schirmacher, P., Rippe, K., Braun, T., Zörnig, M., and Diederichs, S. (2012). Loss of the abundant nuclear non-coding RNA MALAT1 is compatible with life and development. RNA Biol. 9, 1076–1087.
Ellis, J.D., Barrios-Rodiles, M., Colak, R., Irimia, M., Kim, T., Calarco, J.A., Wang, X., Pan, Q., O'Hanlon, D., Kim, P.M., et al. (2012). Tissue-specific alternative splicing remodels protein-protein interaction networks. Mol. Cell 46, 884–892.
ENCODE Project Consortium, Dunham, I., Kundaje, A., Aldred, S.F., Collins, P.J., Davis, C.A., Doyle, F., Epstein, C.B., Frietze, S., Harrow, J., Kaul, R., et al. (2012). An integrated encyclopedia of DNA elements in the human genome. Nature 489, 57–74.
Evsyukova, I., Bradrick, S.S., Gregory, S.G., and Garcia-Blanco, M.A. (2013). Cleavage and polyadenylation specificity factor 1 (CPSF1) regulates alternative splicing of interleukin 7 receptor (IL7R) exon 6. RNA 19, 103–115.
Feng, S., Cokus, S.J., Zhang, X., Chen, P.Y., Bostick, M., Goll, M.G., Hetzel, J., Jain, J., Strauss, S.H., Halpern, M.E., et al. (2010). Conservation and divergence of methylation patterning in plants and animals. Proc. Natl. Acad. Sci. USA 107, 8689–8694.
Fong, Y.W., and Zhou, Q. (2001). Stimulatory effect of splicing factors on transcriptional elongation. Nature 414, 929–933.
Fox, A.H., and Lamond, A.I. (2010). Paraspeckles. Cold Spring Harb. Perspect. Biol. 2, a000687.
Furger, A., O'Sullivan, J.M., Binnie, A., Lee, B.A., and Proudfoot, N.J. (2002). Promoter proximal splice sites enhance transcription. Genes Dev. 16, 2792–2799.
Ghosh, S., Marchand, V., Gáspár, I., and Ephrussi, A. (2012). Control of RNP motility and localization by a splicing-dependent structure in oskar mRNA. Nat. Struct. Mol. Biol. 19, 441–449.
Girard, C., Will, C.L., Peng, J., Makarov, E.M., Kastner, B., Lemm, I., Urlaub, H., Hartmuth, K., and Lührmann, R. (2012). Post-transcriptional spliceosomes are retained in nuclear speckles until splicing completion. Nat Commun 3, 994.
Görnemann, J., Kotovic, K.M., Hujer, K., and Neugebauer, K.M. (2005). Cotranscriptional spliceosome assembly occurs in a stepwise fashion and requires the cap binding complex. Mol. Cell 19, 53–63.
Graveley, B.R. (2005). Mutually exclusive splicing of the insect Dscam pre-mRNA directed by competing intronic RNA secondary structures. Cell 123, 65–73.
Gutschner, T., Hämmerle, M., Eißmann, M., Hsu, J., Kim, Y., Hung, G., Revenko, A.S., Arun, G., Stentrup, M., Groß, M., et al. (2013). The noncoding RNA MALAT1 is a critical regulator of the metastasis phenotype of lung cancer cells. Cancer Res. 73, 1180–1189.
Han, T.W., Kato, M., Xie, S., Wu, L.C., Mirzaei, H., Pei, J., Chen, M., Xie, Y., Allen, J., Xiao, G., and McKnight, S.L. (2012). Cell-free formation of RNA granules: bound RNAs identify features and components of cellular assemblies. Cell 149, 768–779.
Hirose, Y., Tacke, R., and Manley, J.L. (1999). Phosphorylated RNA polymerase II stimulates pre-mRNA splicing. Genes Dev. 13, 1234–1239.
Hon, G., Wang, W., and Ren, B. (2009). Discovery and annotation of functional chromatin signatures in the human genome. PLoS Comput. Biol. 5, e1000566.
Hoskins, A.A., and Moore, M.J. (2012). The spliceosome: a flexible, reversible macromolecular machine. Trends Biochem. Sci. 37, 179–188.


Hoskins, A.A., Friedman, L.J., Gallagher, S.S., Crawford, D.J., Anderson, E.G., Wombacher, R., Ramirez, N., Cornish, V.W., Gelles, J., and Moore, M.J. (2011). Ordered and dynamic assembly of single spliceosomes. Science 331, 1289–1295.
Hsin, J.P., and Manley, J.L. (2012). The RNA polymerase II CTD coordinates transcription and RNA processing. Genes Dev. 26, 2119–2137.
Huang, Y., Li, W., Yao, X., Lin, Q.-J., Yin, J.-W., Liang, Y., Heiner, M., Tian, B., Hui, J., and Wang, G. (2012). Mediator complex regulates alternative mRNA processing via the MED23 subunit. Mol. Cell 45, 459–469.
Huff, J.T., Plocik, A.M., Guthrie, C., and Yamamoto, K.R. (2010). Reciprocal intronic and exonic histone modification regions in humans. Nat. Struct. Mol. Biol. 17, 1495–1499.
Ip, J.Y., Schmidt, D., Pan, Q., Ramani, A.K., Fraser, A.G., Odom, D.T., and Blencowe, B.J. (2011). Global impact of RNA polymerase II elongation inhibition on alternative splicing regulation. Genome Res. 21, 390–401.
Irimia, M., and Blencowe, B.J. (2012). Alternative splicing: decoding an expansive regulatory layer. Curr. Opin. Cell Biol. 24, 323–332.
Isken, O., Kim, Y.K., Hosoda, N., Mayeur, G.L., Hershey, J.W., and Maquat, L.E. (2008). Upf1 phosphorylation triggers translational repression during nonsense-mediated mRNA decay. Cell 133, 314–327.
Kalsotra, A., and Cooper, T.A. (2011). Functional consequences of developmentally regulated alternative splicing. Nat. Rev. Genet. 12, 715–729.
Katahira, J., Inoue, H., Hurt, E., and Yoneda, Y. (2009). Adaptor Aly and co-adaptor Thoc5 function in the Tap-p15-mediated nuclear export of HSP70 mRNA. EMBO J. 28, 556–567.
Kato, M., Han, T.W., Xie, S., Shi, K., Du, X., Wu, L.C., Mirzaei, H., Goldsmith, E.J., Longgood, J., Pei, J., et al. (2012). Cell-free formation of RNA granules: low complexity sequence domains form dynamic fibers within hydrogels. Cell 149, 753–767.
Katz, Y., Wang, E.T., Airoldi, E.M., and Burge, C.B. (2010). Analysis and design of RNA sequencing experiments for identifying isoform regulation. Nat. Methods 7, 1009–1015.
Khare, T., Pai, S., Koncevicius, K., Pal, M., Kriukiene, E., Liutkeviciute, Z., Irimia, M., Jia, P., Ptak, C., Xia, M., et al. (2012). 5-hmC in the brain is abundant in synaptic genes and shows differences at the exon-intron boundary. Nat. Struct. Mol. Biol. 19, 1037–1043.
Khodor, Y.L., Rodriguez, J., Abruzzi, K.C., Tang, C.-H.A., Marr, M.T., 2nd, and Rosbash, M. (2011). Nascent-seq indicates widespread cotranscriptional pre-mRNA splicing in Drosophila. Genes Dev. 25, 2502–2512.
Khodor, Y.L., Menet, J.S., Tolan, M., and Rosbash, M. (2012). Cotranscriptional splicing efficiency differs dramatically between Drosophila and mouse. RNA 18, 2174–2186.
Kim, S., Kim, H., Fong, N., Erickson, B., and Bentley, D.L. (2011). Pre-mRNA splicing is a determinant of histone H3K36 methylation. Proc. Natl. Acad. Sci. USA 108, 13564–13569.
Kolasinska-Zwierz, P., Down, T., Latorre, I., Liu, T., Liu, X.S., and Ahringer, J. (2009). Differential chromatin marking of introns and expressed exons by H3K36me3. Nat. Genet. 41, 376–381.
Kornblihtt, A.R. (2007). Coupling transcription and alternative splicing. Adv. Exp. Med. Biol. 623, 175–189.
Kouzarides, T. (2007). Chromatin modifications and their function. Cell 128, 693–705.
Kress, T.L., Krogan, N.J., and Guthrie, C. (2008). A single SR-like protein, Npl3, promotes pre-mRNA splicing in budding yeast. Mol. Cell 32, 727–734.
La Via, L., Bonini, D., Russo, I., Orlandi, C., Barlati, S., and Barbon, A. (2013). Modulation of dendritic AMPA receptor mRNA trafficking by RNA splicing and editing. Nucleic Acids Res. 41, 617–631.
Lacadie, S.A., and Rosbash, M. (2005). Cotranscriptional spliceosome assembly dynamics and the role of U1 snRNA:5′ss base pairing in yeast. Mol. Cell 19, 65–75.

Lachner, M., O'Carroll, D., Rea, S., Mechtler, K., and Jenuwein, T. (2001). Methylation of histone H3 lysine 9 creates a binding site for HP1 proteins. Nature 410, 116–120.
Langemeier, J., Schrom, E.-M., Rabner, A., Radtke, M., Zychlinski, D., Saborowski, A., Bohn, G., Mandel-Gutfreund, Y., Bodem, J., Klein, C., and Bohne, J. (2012). A complex immunodeficiency is based on U1 snRNP-mediated poly(A) site suppression. EMBO J. 31, 4035–4044.
Lareau, L.F., Brooks, A.N., Soergel, D.A., Meng, Q., and Brenner, S.E. (2007a). The coupling of alternative splicing and nonsense-mediated mRNA decay. Adv. Exp. Med. Biol. 623, 190–211.
Lareau, L.F., Inada, M., Green, R.E., Wengrod, J.C., and Brenner, S.E. (2007b). Unproductive splicing of SR genes associated with highly conserved and ultraconserved DNA elements. Nature 446, 926–929.
Laurent, L., Wong, E., Li, G., Huynh, T., Tsirigos, A., Ong, C.T., Low, H.M., Kin Sung, K.W., Rigoutsos, I., Loring, J., and Wei, C.L. (2010). Dynamic changes in the human methylome during differentiation. Genome Res. 20, 320–331.
Lawrence, J.B., Carter, K.C., and Xing, X. (1993). Probing functional organization within the nucleus: is genome structure integrated with RNA metabolism? Cold Spring Harb. Symp. Quant. Biol. 58, 807–818.
Lei, E.P., Krebber, H., and Silver, P.A. (2001). Messenger RNAs are recruited for nuclear export during transcription. Genes Dev. 15, 1771–1782.
Lei, H., Dias, A.P., and Reed, R. (2011). Export and stability of naturally intronless mRNAs require specific coding region sequences and the TREX mRNA export complex. Proc. Natl. Acad. Sci. USA 108, 17985–17990.
Licatalosi, D.D., and Darnell, R.B. (2010). RNA processing and its regulation: global insights into biological networks. Nat. Rev. Genet. 11, 75–87.
Licatalosi, D.D., Mele, A., Fak, J.J., Ule, J., Kayikci, M., Chi, S.W., Clark, T.A., Schweitzer, A.C., Blume, J.E., Wang, X., et al. (2008). HITS-CLIP yields genome-wide insights into brain alternative RNA processing. Nature 456, 464–469.
Lin, S., Coutinho-Mansfield, G., Wang, D., Pandit, S., and Fu, X.D. (2008). The splicing factor SC35 has an active role in transcriptional elongation. Nat. Struct. Mol. Biol. 15, 819–826.
Listerman, I., Sapra, A.K., and Neugebauer, K.M. (2006). Cotranscriptional coupling of splicing factor recruitment and precursor messenger RNA splicing in mammalian cells. Nat. Struct. Mol. Biol. 13, 815–822.
Long, J.C., and Caceres, J.F. (2009). The SR protein family of splicing factors: master regulators of gene expression. Biochem. J. 417, 15–27.
Luco, R.F., Pan, Q., Tominaga, K., Blencowe, B.J., Pereira-Smith, O.M., and Misteli, T. (2010). Regulation of alternative splicing by histone modifications. Science 327, 996–1000.
Luco, R.F., Allo, M., Schor, I.E., Kornblihtt, A.R., and Misteli, T. (2011). Epigenetics in alternative pre-mRNA splicing. Cell 144, 16–26.
Luger, K., Mäder, A.W., Richmond, R.K., Sargent, D.F., and Richmond, T.J. (1997). Crystal structure of the nucleosome core particle at 2.8 Å resolution. Nature 389, 251–260.
Lyko, F., Foret, S., Kucharski, R., Wolf, S., Falckenhayn, C., and Maleszka, R. (2010). The honey bee epigenomes: differential methylation of brain DNA in queens and workers. PLoS Biol. 8, e1000506.
Machyna, M., Heyn, P., and Neugebauer, K.M. (2013). Cajal bodies: where form meets function. Wiley Interdiscip Rev RNA 4, 17–34.
Malik, S., and Roeder, R.G. (2010). The metazoan Mediator co-activator complex as an integrative hub for transcriptional regulation. Nat. Rev. Genet. 11, 761–772.
Martinez-Contreras, R., Cloutier, P., Shkreta, L., Fisette, J.F., Revil, T., and Chabot, B. (2007). hnRNP proteins and splicing control. Adv. Exp. Med. Biol. 623, 123–147.
Masuda, S., Das, R., Cheng, H., Hurt, E., Dorman, N., and Reed, R. (2005). Recruitment of the human TREX complex to mRNA during splicing. Genes Dev. 19, 1512–1517.


Mayr, C., and Bartel, D.P. (2009). Widespread shortening of 3′UTRs by alternative cleavage and polyadenylation activates oncogenes in cancer cells. Cell 138, 673–684.
McCracken, S., Fong, N., Yankulov, K., Ballantyne, S., Pan, G., Greenblatt, J., Patterson, S.D., Wickens, M., and Bentley, D.L. (1997a). The C-terminal domain of RNA polymerase II couples mRNA processing to transcription. Nature 385, 357–361.
McCracken, S., Fong, N., Rosonina, E., Yankulov, K., Brothers, G., Siderovski, D., Hessel, A., Foster, S., Shuman, S., and Bentley, D.L. (1997b). 5′-Capping enzymes are targeted to pre-mRNA by binding to the phosphorylated carboxy-terminal domain of RNA polymerase II. Genes Dev. 11, 3306–3318.
McCracken, S., Lambermon, M., and Blencowe, B.J. (2002). SRm160 splicing coactivator promotes transcript 3′-end cleavage. Mol. Cell. Biol. 22, 148–160.
McManus, C.J., and Graveley, B.R. (2011). RNA structure and the mechanisms of alternative splicing. Curr. Opin. Genet. Dev. 21, 373–379.
Millevoi, S., Geraghty, F., Idowu, B., Tam, J.L., Antoniou, M., and Vagner, S. (2002). A novel function for the U2AF 65 splicing factor in promoting pre-mRNA 3′-end processing. EMBO Rep. 3, 869–874.
Millevoi, S., Loulergue, C., Dettwiler, S., Karaa, S.Z., Keller, W., Antoniou, M., and Vagner, S. (2006). An interaction between U2AF 65 and CF I(m) links the splicing and 3′ end processing machineries. EMBO J. 25, 4854–4864.
Mitrovich, Q.M., and Anderson, P. (2000). Unproductively spliced ribosomal protein mRNAs are natural targets of mRNA surveillance in C. elegans. Genes Dev. 14, 2173–2184.
Moehle, E.A., Ryan, C.J., Krogan, N.J., Kress, T.L., and Guthrie, C. (2012). The yeast SR-like protein Npl3 links chromatin modification to mRNA processing. PLoS Genet. 8, e1003101.
Monsalve, M., Wu, Z., Adelmant, G., Puigserver, P., Fan, M., and Spiegelman, B.M. (2000). Direct coupling of transcription and mRNA processing through the thermogenic coactivator PGC-1. Mol. Cell 6, 307–316.
Moore, M.J., and Proudfoot, N.J. (2009). Pre-mRNA processing reaches back to transcription and ahead to translation. Cell 136, 688700. Motta-Mena, L.B., Heyd, F., and Lynch, K.W. (2010). Context-dependent regulatory mechanism of the splicing factor hnRNP L. Mol. Cell 37, 223234. Mukherjee, N., Corcoran, D.L., Nusbaum, J.D., Reid, D.W., Georgiev, S., Hafner, M., Ascano, M., Jr., Tuschl, T., Ohler, U., and Keene, J.D. (2011). Integrative regulatory mapping indicates that the RNA-binding protein HuR couples pre-mRNA processing and mRNA stability. Mol. Cell 43, 327339. rez Santangelo, M.S., Paronetto, M.P., de la Mata, M., Pelisch, oz, M.J., Pe Mun F., Boireau, S., Glover-Cutter, K., Ben-Dov, C., Blaustein, M., Lozano, J.J., et al. (2009). DNA damage regulates alternative splicing through inhibition of RNA polymerase II elongation. Cell 137, 708720. Naganuma, T., Nakagawa, S., Tanigawa, A., Sasaki, Y.F., Goshima, N., and Hirose, T. (2012). Alternative 30 -end processing of long noncoding RNA initiates construction of nuclear paraspeckles. EMBO J. 31, 40204034. Nakagawa, S., and Hirose, T. (2012). Paraspeckle nuclear bodiesuseful uselessness? Cell. Mol. Life Sci. 69, 30273036. Nakagawa, S., Ip, J.Y., Shioi, G., Tripathi, V., Zong, X., Hirose, T., and Prasanth, K.V. (2012). Malat1 is not an essential component of nuclear speckles in mice. RNA 18, 14871499. Neish, A.S., Anderson, S.F., Schlegel, B.P., Wei, W., and Parvin, J.D. (1998). Factors associated with the mammalian RNA polymerase II holoenzyme. Nucleic Acids Res. 26, 847853. Ni, J.Z., Grate, L., Donohue, J.P., Preston, C., Nobida, N., OBrien, G., Shiue, L., Clark, T.A., Blume, J.E., and Ares, M., Jr. (2007). Ultraconserved elements are associated with homeostatic control of splicing regulators by alternative splicing and nonsense-mediated decay. Genes Dev. 21, 708718. Nilsen, T.W., and Graveley, B.R. (2010). Expansion of the eukaryotic proteome by alternative splicing. Nature 463, 457463. 
Ohlsson, R., Bartkuhn, M., and Renkawitz, R. (2010). CTCF shapes chromatin by multiple mechanisms: the impact of 20 years of CTCF research on understanding the workings of chromatin. Chromosoma 119, 351360.

Pan, Q., Saltzman, A.L., Kim, Y.K., Misquitta, C., Shai, O., Maquat, L.E., Frey, B.J., and Blencowe, B.J. (2006). Quantitative microarray proling provides evidence against widespread coupling of alternative splicing with nonsensemediated mRNA decay to control gene expression. Genes Dev. 20, 153158. Pan, Q., Shai, O., Lee, L.J., Frey, B.J., and Blencowe, B.J. (2008). Deep surveying of alternative splicing complexity in the human transcriptome by high-throughput sequencing. Nat. Genet. 40, 14131415. Pandit, S., Wang, D., and Fu, X.D. (2008). Functional integration of transcriptional and RNA processing machineries. Curr. Opin. Cell Biol. 20, 260265. Park, J.W., Parisky, K., Celotto, A.M., Reenan, R.A., and Graveley, B.R. (2004). Identication of alternative splicing regulators by RNA interference in Drosophila. Proc. Natl. Acad. Sci. USA 101, 1597415979. Pavri, R., Zhu, B., Li, G., Trojer, P., Mandal, S., Shilatifard, A., and Reinberg, D. (2006). Histone H2B monoubiquitination functions cooperatively with FACT to regulate elongation by RNA polymerase II. Cell 125, 703717. Plocik, A.M., and Guthrie, C. (2012). Diverse forms of RPS9 splicing are part of an evolving autoregulatory circuit. PLoS Genet. 8, e1002620. Pradeepa, M.M., Sutherland, H.G., Ule, J., Grimes, G.R., and Bickmore, W.A. (2012). Psip1/Ledgf p52 binds methylated histone H3K36 and splicing factors and contributes to the regulation of alternative splicing. PLoS Genet. 8, e1002717. Proudfoot, N.J. (2011). Ending the message: poly(A) signals then and now. Genes Dev. 25, 17701782. Rebbapragada, I., and Lykke-Andersen, J. (2009). Execution of nonsensemediated mRNA decay: what denes a substrate? Curr. Opin. Cell Biol. 21, 394402. Rosonina, E., and Blencowe, B.J. (2004). Analysis of the requirement for RNA polymerase II CTD heptapeptide repeats in pre-mRNA splicing and 30 -end cleavage. RNA 10, 581589. 
Rosonina, E., Ip, J.Y., Calarco, J.A., Bakowski, M.A., Emili, A., McCracken, S., Tucker, P., Ingles, C.J., and Blencowe, B.J. (2005). Role for PSF in mediating transcriptional activator-dependent stimulation of pre-mRNA processing in vivo. Mol. Cell. Biol. 25, 67346746. , V., Batsche , E., Rachez, C., and Muchardt, C. (2011). Histone H3 Saint-Andre lysine 9 trimethylation and HP1g favor inclusion of alternative exons. Nat. Struct. Mol. Biol. 18, 337344. Saltzman, A.L., Kim, Y.K., Pan, Q., Fagnani, M.M., Maquat, L.E., and Blencowe, B.J. (2008). Regulation of multiple core spliceosomal proteins by alternative splicing-coupled nonsense-mediated mRNA decay. Mol. Cell. Biol. 28, 43204330. Saltzman, A.L., Pan, Q., and Blencowe, B.J. (2011). Regulation of alternative splicing by the core spliceosomal machinery. Genes Dev. 25, 373384. Sandberg, R., Neilson, J.R., Sarma, A., Sharp, P.A., and Burge, C.B. (2008). Proliferating cells express mRNAs with shortened 30 untranslated regions and fewer microRNA target sites. Science 320, 16431647. hrmann, R. Schneider, M., Will, C.L., Anokhina, M., Tazi, J., Urlaub, H., and Lu (2010). Exon denition complexes contain the tri-snRNP and can be directly converted into B-like precatalytic splicing complexes. Mol. Cell 38, 223235. Schoenberg, D.R., and Maquat, L.E. (2012). Regulation of cytoplasmic mRNA decay. Nat. Rev. Genet. 13, 246259. ` res, D., Risso, G.J., Pawellek, A., Ule, J., Lamond, A.I., and Schor, I.E., Lle Kornblihtt, A.R. (2012). Perturbation of chromatin structure globally affects localization and recruitment of splicing factors. PLoS ONE 7, e48084. Schwartz, S., Meshorer, E., and Ast, G. (2009). Chromatin organization marks exon-intron structure. Nat. Struct. Mol. Biol. 16, 990995. Shukla, S., Kavak, E., Gregory, M., Imashimizu, M., Shutinoski, B., Kashlev, M., Oberdoerffer, P., Sandberg, R., and Oberdoerffer, S. (2011). CTCFpromoted RNA polymerase II pausing links DNA methylation to splicing. Nature 479, 7479. 
Sims, R.J., 3rd, Millhouse, S., Chen, C.-F., Lewis, B.A., Erdjument-Bromage, H., Tempst, P., Manley, J.L., and Reinberg, D. (2007). Recognition of

1268 Cell 152, March 14, 2013 2013 Elsevier Inc.

trimethylated histone H3 lysine 4 facilitates the recruitment of transcription postinitiation factors and pre-mRNA splicing. Mol. Cell 28, 665676. Singh, G., Kucukural, A., Cenik, C., Leszyk, J.D., Shaffer, S.A., Weng, Z., and Moore, M.J. (2012). The cellular EJC interactome reveals higher-order mRNP structure and an EJC-SR protein nexus. Cell 151, 750764. Smallwood, A., Hon, G.C., Jin, F., Henry, R.E., Espinosa, J.M., and Ren, B. (2012). CBX3 regulates efcient RNA processing genome-wide. Genome Res. 22, 14261436. Smith, K.P., Carter, K.C., Johnson, C.V., and Lawrence, J.B. (1995). U2 and U1 snRNA gene loci associate with coiled bodies. J. Cell. Biochem. 59, 473485. Smolle, M., and Workman, J.L. (2013). Transcription-associated histone modications and cryptic transcription. Biochim. Biophys. Acta 1829, 8497. Spector, D.L., and Lamond, A.I. (2011). Nuclear speckles. Cold Spring Harb. Perspect. Biol. 3. Spies, N., Nielsen, C.B., Padgett, R.A., and Burge, C.B. (2009). Biased chromatin signatures around polyadenylation sites and exons. Mol. Cell 36, 245254. sser, K., Masuda, S., Mason, P., Pfannstiel, J., Oppizzi, M., RodriguezStra n, A.G., Aguilera, A., Struhl, K., Reed, R., and Hurt, E. Navarro, S., Rondo (2002). TREX is a conserved complex coupling transcription with messenger RNA export. Nature 417, 304308. raphin, B., Wilm, M., Bork, P., Stutz, F., Bachi, A., Doerks, T., Braun, I.C., Se and Izaurralde, E. (2000). REF, an evolutionary conserved family of hnRNPlike proteins, interacts with TAP/Mex67p and participates in mRNA nuclear export. RNA 6, 638650. Talbert, P.B., and Henikoff, S. (2010). Histone variantsancient wrap artists of the epigenome. Nat. Rev. Mol. Cell Biol. 11, 264275. Tange, T.O., Shibuya, T., Jurica, M.S., and Moore, M.J. (2005). Biochemical analysis of the EJC reveals two new factors and a stable tetrameric protein core. RNA 11, 18691883. Tardiff, D.F., Lacadie, S.A., and Rosbash, M. (2006). 
A genome-wide analysis indicates that yeast pre-mRNA splicing is predominantly posttranscriptional. Mol. Cell 24, 917929. Terenzi, F., and Ladd, A.N. (2010). Conserved developmental alternative splicing of muscleblind-like (MBNL) transcripts regulates MBNL localization and activity. RNA Biol. 7, 4355. Tian, B., Hu, J., Zhang, H., and Lutz, C.S. (2005). A large-scale analysis of mRNA polyadenylation of human and mouse genes. Nucleic Acids Res. 33, 201212. rcel, J., Tilgner, H., Nikolaou, C., Althammer, S., Sammeth, M., Beato, M., Valca , R. (2009). Nucleosome positioning as a determinant of exon and Guigo recognition. Nat. Struct. Mol. Biol. 16, 9961001. Tilgner, H., Knowles, D.G., Johnson, R., Davis, C.A., Chakrabortty, S., Djebali, , R. (2012). Deep S., Curado, J., Snyder, M., Gingeras, T.R., and Guigo sequencing of subcellular RNA fractions shows splicing to be predominantly co-transcriptional in the human genome but inefcient for lncRNAs. Genome Res. 22, 16161625. Tillo, D., and Hughes, T.R. (2009). G+C content dominates intrinsic nucleosome occupancy. BMC Bioinformatics 10, 442. Tolstorukov, M.Y., Goldman, J.A., Gilbert, C., Ogryzko, V., Kingston, R.E., and Park, P.J. (2012). Histone variant H2A.Bbd is associated with active transcription and mRNA processing in human cells. Mol. Cell 47, 596607. Tripathi, V., Ellis, J.D., Shen, Z., Song, D.Y., Pan, Q., Watt, A.T., Freier, S.M., Bennett, C.F., Sharma, A., Bubulya, P.A., et al. (2010). The nuclear-retained noncoding RNA MALAT1 regulates alternative splicing by modulating SR splicing factor phosphorylation. Mol. Cell 39, 925938. Vargas, D.Y., Shah, K., Batish, M., Levandoski, M., Sinha, S., Marras, S.A., Schedl, P., and Tyagi, S. (2011). Single-molecule imaging of transcriptionally coupled and uncoupled splicing. Cell 147, 10541065.

Wagner, E.J., and Carpenter, P.B. (2012). Understanding the language of Lys36 methylation at histone H3. Nat. Rev. Mol. Cell Biol. 13, 115126. hrmann, R. (2009). The spliceosome: design Wahl, M.C., Will, C.L., and Lu principles of a dynamic RNP machine. Cell 136, 701718. Waldholm, J., Wang, Z., Brodin, D., Tyagi, A., Yu, S., Theopold, U., Farrants, A.K., and Visa, N. (2011). SWI/SNF regulates the alternative processing of a specic subset of pre-mRNAs in Drosophila melanogaster. BMC Mol. Biol. 12, 46. Wang, Z., and Burge, C.B. (2008). Splicing regulation: from a parts list of regulatory elements to an integrated splicing code. RNA 14, 802813. Wang, E.T., Sandberg, R., Luo, S., Khrebtukova, I., Zhang, L., Mayr, C., Kingsmore, S.F., Schroth, G.P., and Burge, C.B. (2008). Alternative isoform regulation in human tissue transcriptomes. Nature 456, 470476. Wang, E.T., Cody, N.A.L., Jog, S., Biancolella, M., Wang, T.T., Treacy, D.J., Luo, S., Schroth, G.P., Housman, D.E., Reddy, S., et al. (2012). Transcriptome-wide regulation of pre-mRNA splicing and mRNA localization by muscleblind proteins. Cell 150, 710724. Weatheritt, R.J., and Gibson, T.J. (2012). Linear motifs: lost in (pre)translation. Trends Biochem. Sci. 37, 333341. hler, J. Wilhelm, B.T., Marguerat, S., Aligianni, S., Codlin, S., Watt, S., and Ba (2011). Differential patterns of intronic and exonic DNA regions with respect to RNA polymerase II occupancy, nucleosome density and H3K36me3 marking in ssion yeast. Genome Biol. 12, R82. Wilson, C.J., Chao, D.M., Imbalzano, A.N., Schnitzler, G.R., Kingston, R.E., and Young, R.A. (1996). RNA polymerase II holoenzyme contains SWI/SNF regulators involved in chromatin remodeling. Cell 84, 235244. Witten, J.T., and Ule, J. (2011). Understanding splicing regulation through RNA splicing maps. Trends Genet. 27, 8997. Wu, H., and Zhang, Y. (2011). Mechanisms and functions of Tet proteinmediated 5-methylcytosine oxidation. Genes Dev. 25, 24362452. Yap, K.L., and Zhou, M.M. 
(2010). Keeping it in the family: diverse histone recognition by conserved structural folds. Crit. Rev. Biochem. Mol. Biol. 45, 488505. Yap, K., Lim, Z.Q., Khandelia, P., Friedman, B., and Makeyev, E.V. (2012). Coordinated regulation of neuronal mRNA steady-state levels through developmentally controlled intron retention. Genes Dev. 26, 12091223. Yin, Q.-F., Yang, L., Zhang, Y., Xiang, J.-F., Wu, Y.-W., Carmichael, G.G., and Chen, L.-L. (2012). Long noncoding RNAs with snoRNA ends. Mol. Cell 48, 219230. Zeng, C., and Berget, S.M. (2000). Participation of the C-terminal domain of RNA polymerase II in exon denition during pre-mRNA splicing. Mol. Cell. Biol. 20, 82908301. Zhang, C., Frias, M.A., Mele, A., Ruggiu, M., Eom, T., Marney, C.B., Wang, H., Licatalosi, D.D., Fak, J.J., and Darnell, R.B. (2010). Integrative modeling denes the Nova splicing-regulatory network and its combinatorial controls. Science 329, 439443. Zhang, B., Arun, G., Mao, Y.S., Lazar, Z., Hung, G., Bhattacharjee, G., Xiao, X., Booth, C.J., Wu, J., Zhang, C., and Spector, D.L. (2012). The lncRNA Malat1 is dispensable for mouse development but its transcription plays a cis-regulatory role in the adult. Cell Rep 2, 111123. Zhou, Z., Luo, M.J., Straesser, K., Katahira, J., Hurt, E., and Reed, R. (2000). The protein Aly links pre-messenger-RNA splicing to nuclear export in metazoans. Nature 407, 401405. Zhou, H.L., Hinman, M.N., Barron, V.A., Geng, C., Zhou, G., Luo, G., Siegel, R.E., and Lou, H. (2011). Hu proteins regulate alternative splicing by inducing localized histone hyperacetylation in an RNA-dependent manner. Proc. Natl. Acad. Sci. USA 108, E627E635. Zraly, C.B., and Dingwall, A.K. (2012). The chromatin remodeling and mRNA splicing functions of the Brahma (SWI/SNF) complex are mediated by the SNR1/SNF5 regulatory subunit. Nucleic Acids Res. 40, 59755987.


Review
Genome Architecture: Domain Organization of Interphase Chromosomes
Wendy A. Bickmore1,* and Bas van Steensel2,*
1MRC Human Genetics Unit, MRC Institute of Genetics and Molecular Medicine, University of Edinburgh, Edinburgh EH4 2XU, UK
2Division of Gene Regulation, Netherlands Cancer Institute, 1066 CX Amsterdam, the Netherlands
*Correspondence: wendy.bickmore@igmm.ed.ac.uk (W.A.B.), b.v.steensel@nki.nl (B.v.S.)
http://dx.doi.org/10.1016/j.cell.2013.02.001


The architecture of interphase chromosomes is important for the regulation of gene expression and genome maintenance. Chromosomes are linearly segmented into hundreds of domains with different protein compositions. Furthermore, the spatial organization of chromosomes is nonrandom and is characterized by many local and long-range contacts among genes and other sequence elements. A variety of genome-wide mapping techniques have made it possible to chart these properties at high resolution. Combined with microscopy and computational modeling, the results begin to yield a more coherent picture that integrates linear and three-dimensional (3D) views of chromosome organization in relation to gene regulation and other nuclear functions.
Introduction

The idea that chromosomes are segmented into domains with distinct functional properties goes back to the initial microscopy observations of heterochromatin and euchromatin and the banding patterns of mitotic chromosomes and polytene interphase chromosomes upon staining with particular dyes. Later, immunofluorescence microscopy with antibodies against specific chromatin proteins led to the notion that differential protein composition may underlie this segmentation. The development of chromatin immunoprecipitation (ChIP) subsequently enabled the locations of specific proteins or histone modifications to be mapped along the genome with much higher resolution. This provided direct evidence that some proteins can associate with long genomic regions, such as the Polycomb protein at the homeotic bithorax locus in Drosophila (Orlando and Paro, 1993). Further refinement of ChIP and the development of the complementary mapping technique DamID (van Steensel et al., 2001) and of methods for mapping various other properties of chromatin (see below) have led to the generation of numerous high-resolution genome-wide maps that identify various chromatin features with a domain-like organization.

Parallel to these developments, the three-dimensional (3D) folding of chromosomes has been investigated by fluorescence in situ hybridization (FISH). Extensive microscopy studies have revealed a high degree of nonrandom positioning of loci within the nucleus and within chromosomal territories (Cremer et al., 2006). More recently, chromosome conformation capture (3C) techniques have begun to offer detailed views of the associations among linearly distant genomic loci that can be captured by formaldehyde crosslinking in the nucleus, and from which aspects of 3D chromosomal folding have been inferred.

Excitingly, these linear and 3D views of interphase chromosome architecture are now beginning to converge, revealing that the chromatin modifications of genomic regions and the

overall 3D organization are linked. Here, we will discuss current insights into the relationships between the linear and 3D domain architecture of interphase chromosomes, with emphasis on results obtained in mammals and in Drosophila.

Linear Views of Chromatin Domains

DNA-Sequence Organization

The genomes of most species show striking nonrandom patterns in their sequence composition. Mammalian and bird genomes are a patchwork of long DNA stretches (>100 kb, Figure 1A) termed isochores, which differ in A/T content (Costantini et al., 2006). Gene density is also highly nonhomogeneous along the genome and roughly corresponds to isochore patterning, with high gene density overlapping with A/T-poor regions. Interestingly, genes embedded in gene-dense regions tend to be more active than those in gene-poor regions, resulting in megabase-sized domains of alternating high and low transcriptional activity (Caron et al., 2001). Most transposons and virus-derived elements are also unevenly distributed. For example, of the most abundant repetitive elements in the human genome, SINE elements tend to be located in gene-dense regions, whereas LINE elements are more abundant in gene-poor regions. As will become clear below, the nonrandom distribution of DNA-sequence features is closely linked to chromatin-domain organization, although it is unknown how the former helps to establish the latter. Various satellite repeats constitute very prominent domains, but it is beyond the scope of this Review to discuss them here.

Polycomb Domains

Perhaps the best studied chromatin domains are those formed in Drosophila by the Polycomb group (PcG) proteins, known for their role in maintaining silencing of gene clusters during development. These proteins form multisubunit complexes of which the most prominent are Polycomb-repressive complexes 1 and 2 (PRC1 and PRC2). PRC2 contains a histone

Figure 1. Chromatin Types


(A) Size distribution (in base pairs) of some genome features (blue) and various types of chromatin domains in human fibroblasts (red). Numbers on the right-hand side indicate approximate genome-wide counts of the respective domains. Note that the sizes and counts of chromatin domains can vary between cell types and can depend strongly on the algorithm used to define the domains. (Data are from http://genome.ucsc.edu; Costantini et al., 2006; Guelen et al., 2008; Lister et al., 2009; Hawkins et al., 2010; Dixon et al., 2012; Pope et al., 2012.)
(B) Cartoon model of a chromosomal fiber, illustrating its segmentation into domains of distinct chromatin types, each consisting of a specific combination of proteins and histone modifications (indicated by colors).
(C) Classification of chromatin types. This example shows binding maps of 12 chromatin proteins along a part of chromosome 2L in Drosophila Kc cells. Computer algorithms are used to search for recurrent combinations of proteins (chromatin types or states) and to subsequently define linear chromosomal domains covered by these types (highlighted in different colors; the classification in this example was based on 53 protein profiles). Note that some proteins are present in a single chromatin type, whereas others can be shared among multiple types. Adapted from Filion and van Steensel (2010).

methyltransferase that catalyzes trimethylation of H3K27 (H3K27me3), whereas PRC1 includes Polycomb (Pc), which binds to this histone mark (Morey and Helin, 2010). In Drosophila, PcG proteins and H3K27me3 form a few hundred domains, of about 10 to 150 kb in size, that are scattered along the genome. These domains often cover multiple genes (Tolhuis et al., 2006; Schwartz et al., 2010), most of which are transcriptionally inactive. Specialized Polycomb response elements (PREs) of several hundred base pairs in size can act as nucleation sites from which the PcG complexes spread laterally to form the domains (Morey and Helin, 2010). Genes within a PcG domain are generally not coregulated throughout development, although exceptions to this rule may occur (Tolhuis

et al., 2006). Rather, PcG domains are dynamic structures that, depending on the cell type, can be partially or entirely cleared of PcG proteins to accommodate expression of one or several of the underlying genes (Schwartz et al., 2010). In mouse and human, most available evidence so far points to the existence of only a few large PcG domains that cover multiple neighboring genes, primarily at the Hox gene clusters (Bernstein et al., 2006) and on the inactive X chromosome in female cells (Marks et al., 2009). Otherwise, PcG domains are relatively small (10 kb) (Figure 1A) and usually overlap with individual CpG-rich promoter regions (Lee et al., 2006; Ku et al., 2008). However, one ChIP study (Pauler et al., 2009) suggests that besides the small H3K27me3 peaks, there may be hundreds of larger regions

(average 40 kb) with milder but significant H3K27me3 enrichment in mouse embryonic fibroblasts. As in Drosophila, mammalian PcG target genes are highly enriched in genes that have regulatory functions in development (Lee et al., 2006). In embryonic stem cells (ESCs), many of these sites are also marked by H3K4me3, a state that is referred to as bivalent. Like H3K27me3/PRC2, this H3K4me3 is required for differentiation rather than the stem cell state per se (Jiang et al., 2011). Upon ESC differentiation, the bivalent state typically resolves into either an active H3K4me3-only or a repressed H3K27me3-only state (Bernstein et al., 2006). In the latter case, the H3K27me3 domains expand 2- to 3-fold in size once ESCs differentiate (Hawkins et al., 2010). Interestingly, ectopically integrated unmethylated CpG-rich elements can recruit PRC2 (Mendenhall et al., 2010), indicating that they harbor the necessary sequence information for setting up a small PcG domain. However, most CpG islands lack H3K27me3, so additional local cues must modulate this ability. Specific transcription factors seem to have a role in PcG recruitment (Arnold et al., 2013), whereas activating signals might counteract the formation of a PcG domain at CpG islands (Mendenhall et al., 2010).

H3K9me2/3 Chromatin

H3K9me2 and H3K9me3 are abundant histone marks produced by several enzymes; in mammals, these include G9a/GLP, SETDB1, and SUV39H1. Various proteins bind specifically to methylated H3K9, each with its own preference for mono-, di-, or trimethylation. The most extensively studied among these are the HP1 proteins. Together with H3K9me2/3, they cover pericentric and telomeric regions of many species, where they have structural roles, suppress recombination, and silence transposable elements (Zeng et al., 2010). In Drosophila, H3K9me2 and HP1 additionally associate with a few hundred genes throughout the genome, where they are often confined to individual transcription units.
Although H3K9me2 and H3K9me3 are widely believed to be repressive marks, there are many examples of transcriptionally active genes that are covered by these marks and by HP1 (Kwon and Workman, 2011).

In mammals, there is now evidence for extensive formation of chromatin domains by H3K9me2/3 and HP1s. Besides binding to methylated H3K9, HP1α, when ectopically recruited to a specific site, is sufficient in itself to establish an H3K9me3 domain of about 10 kb (Hathaway et al., 2012). Thousands of endogenous H3K9me2 domains were found along the genome in mouse cells and tissues (Wen et al., 2009). These domains, referred to as large organized chromatin K9 modifications (LOCKs), have a size of roughly 100 kb and together cover up to 45% of the genome, depending on the cell type. It was claimed that mouse ESCs (mESCs) mostly lack these domains, but the statistical significance of this observation was questioned (Filion and van Steensel, 2010), and an independent study found only marginal differences in the H3K9me2 domain patterns between mESCs and differentiated cells (Lienert et al., 2011). In human ESCs (hESCs), H3K9me3 forms more than ten thousand small domains (median size 7 kb). These tend to be about 2-fold larger in fibroblasts (Hawkins et al., 2010) (Figure 1A), which could point to the ability of this chromatin type to spread in cis. Analysis of H3K9me2 in human lymphocytes suggests a global pattern of

Mb-sized domains (Ryba et al., 2010). No study so far has carefully compared H3K9me2 and H3K9me3 patterns, so it is unclear to what extent these two marks overlap.

A striking example of the H3K9me3 chromatin type is formed by the clusters of KRAB-ZNF genes, which encode the largest mammalian family of transcription factors. These clusters, some of which are more than 1 Mb in size, are extensively covered by H3K9me3, CBX1/HP1β, SETDB1, and SUV39H1 (Vogel et al., 2006; Frietze et al., 2010). However, these components are not homogeneously distributed within the domains, as they show higher occupancy near the 3′ ends of the KRAB-ZNF genes. Also present at these positions is the repressor protein KAP1, which is known to interact with HP1 proteins and SETDB1 (Frietze et al., 2010; Groner et al., 2010). Artificial targeting of KAP1 to genomic loci can trigger recruitment and cis-spreading of HP1β and H3K9me3, often over several tens of kb (Groner et al., 2010), pointing to a role of KAP1 in the nucleation of at least some H3K9me3 domains.

Replication-Timing Domains

Timing of DNA replication during S phase is tightly controlled. High-resolution genome mapping shows alternating segments of early- and late-replicating DNA, 100 kb to several Mb in size (Figure 1A) (Hiratani et al., 2008; Schwaiger et al., 2009). Such domains are thought to contain clusters of synchronously firing origins. A finer temporal sampling through S phase, however, suggests that a binary early versus late classification may be overly simplistic and should eventually be replaced with a more continuous scale (Hansen et al., 2010). Nevertheless, early-replicating domains generally contain active genes, whereas late-replicating domains are mostly transcriptionally silent (Hiratani et al., 2008; Schwaiger et al., 2009).
For about 50% of the genome, the timing of replication is dependent on cell type and usually linked to the transcription status of the region (Schwaiger et al., 2009; Hansen et al., 2010; Ryba et al., 2010). Human late-replicating regions are depleted of various active histone marks and enriched for H3K9me2, but they are not enriched for H3K27me3, indicating that inactive domains with different characteristics are established by distinct mechanisms (Ryba et al., 2010). Late-replicating domains also correlate with lamina-associated domains (LADs), which are regions that are preferentially located at the nuclear periphery (see below). During differentiation, replication-timing domains consolidate into fewer, larger domains (Ryba et al., 2010).

Intriguing insights into the influence of genome sequence and genomic context on replication-timing domains come from the behavior of a human chromosome 21 in mouse cells (Pope et al., 2012), but how the regional coordination of replication timing is established is still largely unclear. Depletion of HP1 in Drosophila was found to cause both premature and delayed replication of large HP1-bound domains, depending on the genomic location (Schwaiger et al., 2010). In mESCs, deletion of G9a leads to substantial global loss of H3K9me2 but does not detectably affect replication timing (Yokochi et al., 2009), indicating that late replication is not a direct result of the presence of H3K9me2. However, H3K9me3 is still present in these cells and may be redundant with H3K9me2 in the regulation of replication timing. Systematic studies are needed to

determine the roles of chromatin components in the regional timing of replication.

DNA Methylation

DNA cytosine methylation in mammals occurs predominantly at CG dinucleotides, although bisulfite sequencing of the genome from hESCs and mouse brain has also detected substantial cytosine methylation in other sequence contexts (Lister et al., 2009; Xie et al., 2012). CG methylation in hESCs and other cells in vivo is ubiquitous, with nearly 80% of the CG dinucleotides throughout the genome showing a methylation frequency above 80%. This contrasts strongly with the mostly unmethylated state of focal CpG islands (approximately 1 kb in size, Figure 1A). In general, there is a strong anticorrelation between Polycomb and DNA methylation in such cells. In contrast, cultured human somatic cell lines exhibit numerous so-called partially methylated domains (PMDs) that exhibit lower CG methylation frequencies and have a median size of 50 kb (Lister et al., 2009) (Figure 1A). Genes in PMDs tend to be downregulated and, in some cases, show increased levels of H3K27me3. Highly similar hypomethylated PMDs were also observed in human colon cancers, but not in the adjacent normal tissue (Hansen et al., 2011; Berman et al., 2012). Furthermore, it was noted that these PMDs tend to overlap with LADs and domains of H3K9me2 enrichment (Hansen et al., 2011; Berman et al., 2012), although the latter domains were mapped in different cell types. Therefore, in some mammalian cell lines and tumors, but apparently not in normal cells in vivo, the globally high DNA methylation is interrupted by domains of reduced methylation levels that tend to coincide with repressive domains marked by H3K9me2 and nuclear lamina interactions.

Inferring 3D Organization from Linear Maps: LADs and NADs

A special class of chromatin domains is formed by genomic regions that interact with relatively fixed nuclear structures.
In particular, the nuclear lamina (NL) has been implicated in the anchoring of chromosomal domains. The NL covers the nucleoplasmic side of the inner nuclear membrane (INM) and consists of lamin proteins that form long polymers (Prokocimer et al., 2009). Maps of genome-NL interactions provide insight into the overall spatial organization of interphase chromosomes. In mammalian cells, DamID was used to identify about 1,300 LADs that contact lamin B1 (Guelen et al., 2008). These domains are large, about 100 kb to 10 Mb (median size of 0.5 Mb, Figure 1A), and collectively cover nearly 40% of the genome. In part, the NL interaction pattern is cell type specific (Peric-Hupkes et al., 2010). A domain pattern of interactions with the NL was also observed in flies and worms (Pickersgill et al., 2006; Gerstein et al., 2010; van Bemmel et al., 2010). The majority of genes located in LADs are transcriptionally inactive, indicating that the NL constitutes a repressive environment (Pickersgill et al., 2006; Guelen et al., 2008; Gerstein et al., 2010; Peric-Hupkes et al., 2010; van Bemmel et al., 2010). Indeed, artificial tethering of a locus to the NL or INM can lead to reduced gene expression (Finlan et al., 2008; Reddy et al., 2008; Dialynas et al., 2010), although not in all instances (Kumaran and Spector, 2008).

How LADs are targeted to the NL is still poorly understood. In part, it may involve NL-associated DNA-binding proteins that recognize specific sequence features. A recent study suggested

that (GA)n repeats can target certain human LADs to the NL (Zullo et al., 2012). However, a systematic genome-wide survey of repeats did not nd (GA)n repeats to be enriched in LADs (Guelen et al., 2008), so this mechanism must be relatively rare. In mammals, constitutive LADs (i.e., LADs shared among cell types) show a striking overlap with A/T-rich isochores, suggesting a role for long stretches of A/T-rich DNA in NL targeting (Meuleman et al., 2012). In C. elegans, it was found that two H3K9 methyltransferases, MET-2 and SET-25, together promote the peripheral localization and silencing of a transgene repeat (Towbin et al., 2012). A combined knockout of the two enzymes also caused a partial loss of NL interactions genome-wide. Thus, both DNA sequences and chromatin modications can drive NL contacts of LADs. The nucleolus appears to be another platform for organizing the genome. Two groups have identied human DNA sequences meth et al., in the chromatin associated with puried nucleoli (Ne 2010; van Koningsbruggen et al., 2010). Besides the expected ribosomal DNA (rDNA) loci, a domain-like interaction pattern was observed across all chromosomes. These nucleolus-associated domains (NADs) preferentially contain repressed genes and show enrichment for repressive histone marks, in particular H3K9me3. Surprisingly, LAD and NAD patterns in human cells overlap meth et al., 2010; van Koningsbruggen et al., substantially (Ne 2010), although comparisons so far are based on data from different cell types. It is possible that LADs and NADs in part consist of the same type of repressive chromatin that distributes between the NL and nucleoli in a random manner. This model is supported by microscopy observations that some chromosomal regions associated with a nucleolus in a mother cell can be repositioned to the nuclear periphery in the daughter cells after mitosis (Thomson et al., 2004; van Koningsbruggen et al., 2010). 
Nuclear pore proteins (Nups) also interact with specific genomic loci. One study in Drosophila found many 10-500 kb domains to be bound by Nups (Vaquerizas et al., 2010), but a parallel study reported mostly narrow peaks of binding (Kalverda et al., 2010). Because Nups freely roam through the nucleoplasm, most of these binding events occur in the nuclear interior (Capelson et al., 2010; Kalverda et al., 2010) and thus provide only limited information on the spatial organization of chromosomes.

Integrative Approaches to Classify Chromatin Domains

The overview above illustrates that chromatin domains are prevalent along metazoan genomes. In addition, the distribution patterns of different markers often correlate, indicating that multiple chromatin components work together in the same genomic regions. This raises a number of important questions. Is the genome segmented into domains of a limited number of types (or states), defined by recurrent combinations of DNA sequences, proteins, and histone marks? What are these chromatin types and what are their functions? Several laboratories have begun to address these questions by collecting large sets of genome-wide chromatin maps and using computational approaches to identify chromatin types and to study their domain patterns (Figures 1B and 1C). A survey of 53 broadly selected chromatin proteins in Drosophila Kc cells defined five principal chromatin types that
Cell 152, March 14, 2013 ©2013 Elsevier Inc. 1273

segment the genome into domains that frequently span multiple neighboring genes and that consist of specific combinations of proteins (Filion et al., 2010). Some proteins mark only a single chromatin type, whereas others are shared among multiple types. The five chromatin types are the HP1 and Polycomb chromatin types; two distinct types of transcriptionally active chromatin that harbor different sets of genes; and BLACK chromatin, a strongly repressive chromatin type that lacks HP1 or Polycomb proteins. Even though BLACK chromatin covers nearly two-thirds of all repressed genes, it is still largely uncharacterized. Among the proteins that make up BLACK chromatin are histone H1, the AT-hook protein D1, and SU(UR), which is a regulator of late replication. Also present is LAM, the sole B-type Lamin in Drosophila, indicating that BLACK chromatin is preferentially located at the nuclear periphery. In another study in male Drosophila S2 cells, integration of 18 histone modification maps yielded a segmentation into nine chromatin types (Kharchenko et al., 2011). One of these is linked to the dosage-compensation complex that is specific to the male X chromosome, whereas the remaining eight states appear to represent a somewhat finer subdivision of the five states mentioned above. It is noteworthy that the prevailing state in S2 cells lacks all of the mapped histone marks and may largely correspond to the BLACK state in Kc cells. For a complete picture, it is thus essential to include nonhistone proteins in integrative approaches such as these. Similar systematic mapping efforts in human cell lines have led to classifications of a highly variable number of chromatin states, ranging from 6 to as many as 51 (Ernst and Kellis, 2010; Ram et al., 2011; Dunham et al., 2012). However, mostly small, subgenic domains were identified, not the large LADs, NADs, LOCKs, or replication-timing domains, because good markers of these latter domain types were mostly missing from these studies.
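The idea of chromatin types as recurrent combinations of marks that tile the genome in domains can be made concrete with a deliberately simplified sketch. It is not the procedure of the cited studies, which relied on more elaborate statistical models; here each genomic bin is simply labeled by the set of factors detected in it, and runs of adjacent bins with the same label are merged into domains. The track names and binary values are hypothetical placeholders, not data from any of the cited maps.

```python
from collections import Counter
from itertools import groupby

# Hypothetical binarized tracks: 1 means the factor was detected in that bin.
# Names and values are illustrative only.
tracks = {
    "HP1":      [1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
    "Polycomb": [0, 0, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0],
    "RNAPII":   [0, 0, 0, 0, 0, 1, 1, 0, 0, 0, 0, 0],
    "Lamin":    [0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 0, 0],
}


def classify_bins(tracks):
    """Label every bin with the tuple of factors present in it."""
    names = sorted(tracks)
    n_bins = len(next(iter(tracks.values())))
    return [tuple(name for name in names if tracks[name][i])
            for i in range(n_bins)]


def segment(combos):
    """Merge runs of identical labels into (start, end, label) domains.

    Coordinates are bin indices; end is exclusive.
    """
    domains = []
    pos = 0
    for combo, run in groupby(combos):
        length = len(list(run))
        domains.append((pos, pos + length, combo))
        pos += length
    return domains


combos = classify_bins(tracks)
state_counts = Counter(combos)   # how many bins each combination covers
domains = segment(combos)

for start, end, combo in domains:
    print(start, end, "+".join(combo) or "no mapped mark")
```

In this toy input the last two bins carry none of the mapped factors, which mirrors the point made above: a state defined only by the absence of the mapped marks cannot be characterized further until additional components are added to the map.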
Because there are still hundreds of chromatin components that have not been mapped, and algorithms to categorize chromatin types are still being refined, the classification of chromatin domains should presently be regarded as work-in-progress. Nevertheless, the results of these efforts provide a useful framework to think about the diversity of chromatin domains and their functions.

Insulators and Linear Chromatin Domains

Insulator elements were originally identified based on their ability to block activation of a promoter by a distant enhancer when inserted between these elements. This enhancer-blocking activity is mediated by proteins/complexes that bind specifically to insulator element sequences (Vogelmann et al., 2011). Later it was recognized that these proteins may also help to delimit chromatin domains. In Drosophila, the five known insulator-binding proteins (SU(HW), dCTCF, GAF, BEAF32, and ZW-5) each have several thousand partially overlapping focal binding sites in the genome (Nègre et al., 2010; van Bemmel et al., 2010; Schwartz et al., 2012). Interestingly, most exhibit a significant enrichment at the borders of H3K27me3 domains, suggestive of a boundary function to limit this chromatin type (Nègre et al., 2010). Indeed, depletion of dCTCF and its cofactor CP190 causes some expansion of certain H3K27me3 domains (Bartkuhn et al., 2009). SU(HW) binding is enriched at LAD borders, but its depletion has only subtle effects on NL interactions (van Bemmel et al., 2010). A recent extensive survey indicates that most binding sites for insulator-binding proteins in Drosophila may not act as insulators or chromatin-blocking sites but may serve other functions depending on the combination of proteins that is present (Schwartz et al., 2012). The only well-documented mammalian insulator-binding protein to date is CTCF. It has about 15,000-30,000 focal binding sites (Cuddapah et al., 2009; Handoko et al., 2011) and exhibits a sharp enrichment at the borders of LADs (Guelen et al., 2008). It also frequently demarcates domains of specific histone marks such as H3K27me3 and H2AK5ac (Cuddapah et al., 2009; Handoko et al., 2011), and it may also contribute to the boundaries of some topological domains (see below). This suggests that CTCF helps to insulate chromatin domains, but conclusive evidence for this has been difficult to obtain because depletion of CTCF is lethal to many cell types. Furthermore, as will be discussed below, insulator-binding proteins appear to be tightly linked to the 3D organization of chromatin, and this may well hold a clue to their mechanisms of action. Disruption of individual CTCF-binding sites will be needed to directly demonstrate their function.

Domains in 3D

Linear genomic maps of chromatin features provide new insights into the relationships between primary DNA sequence, some chromatin structures, and gene expression, but understanding genome function in vivo requires a consideration of the 3D structure of chromosomes in the nucleus. Historically, insights into this aspect of genome biology have come from visual approaches, mainly immunofluorescence and FISH. But this is now being complemented by high-throughput assays that use crosslinking and intramolecular ligation to query the spatial relationships of different genomic loci. Collectively these techniques have been termed 3C-based approaches (see de Wit and de Laat, 2012 for a recent review of methodologies).
Broadly, these methods fall into two categories: those that look outward from a particular sequence of interest to see what other sequences in the genome can be captured together with the original bait locus, and those that query all possible combinations of ligated fragments, either within a defined genomic region or across a whole genome. The former ("one against all") methodology is exemplified by the 4C technique (Simonis et al., 2006) and has been particularly useful for investigating the associations of specific genes with putative long-range regulatory elements (Noordermeer et al., 2008; Montavon et al., 2011). Of the all-against-all methods, 5C is designed to investigate chromatin conformation at high resolution within a defined genomic region of up to a few Mb in size (Dostie et al., 2006; Nora et al., 2012). This is exactly the size range over which much long-range gene regulation seems to be able to operate in mammalian cells (Williamson et al., 2011). However, probing the spatial associations of entire large genomes of complex metazoans is currently beyond the scope of 5C. Hi-C is the method of choice for this latter kind of analysis (Lieberman-Aiden et al., 2009; Dixon et al., 2012; Kalhor et al., 2012; Sexton et al., 2012; Zhang et al., 2012). The first Hi-C studies of metazoan genomes gave relatively coarse-grained (1 Mb resolution) views of genome topology,

Figure 2. Radial Organization in the Nucleus, within and between Chromosomes


(A) mESC nucleus hybridized by FISH with a paint for mouse chromosome 2 (green) and a probe (red) for just the exome of mouse chromosome 2 (Boyle et al., 2011). This demonstrates the looping-out of the gene-dense regions from the core chromosome territory and their disposition away from the nuclear periphery and toward the interior of the nucleus. (B) Human lymphoblastoid cell nucleus hybridized by FISH with paint for the gene-rich human chromosome 19 (red) and gene-poor chromosome 18 (green) reveals the radial organization of chromosomes in the nucleus. (C) Courtesy of Frank Alber (University of Southern California). A density contour plot for the localization probability (red = max, green = min) of human chromosomes 1, 11, 14, 15, 16, 17, 19, 20, 21, and 22 in lymphoblastoid cell nuclei modeled from Hi-C data recapitulates the clustering of gene-rich chromosomes in the center of the nucleus, along with the rDNA-containing acrocentric chromosomes (Kalhor et al., 2012).

the effective resolution of the analyses being limited not only by the choice of restriction enzyme used to fragment the chromatin (e.g., 4 bp or 6 bp cutter) but also by the library complexity generated after the PCR amplification of the ligated fragments and by the depth of sequencing (Lieberman-Aiden et al., 2009). In an all-versus-all assay such as Hi-C, a 10-fold increase in resolution requires a 10^2-fold (100-fold) increase in the sequencing depth of a library that is of sufficient complexity. Rapid increases in sequence depth are now allowing the development of higher-resolution topological genome maps (Dixon et al., 2012).

The Territory of the Chromosome

The dominant feature apparent in Hi-C analyses of metazoan genomes, irrespective of cell type, is that each chromosome is largely an individual territory: most of the captured associations are in cis rather than in trans (Lieberman-Aiden et al., 2009; Kalhor et al., 2012; Sexton et al., 2012; Zhang et al., 2012), and this is consistent with the appearance of chromosomes in the nucleus as detected by FISH (Figures 2A and 2B). Both 4C (Tolhuis et al., 2011) and Hi-C analyses (Hou et al., 2012; Kalhor et al., 2012; Sexton et al., 2012) suggest that the centromere forms some kind of barrier that attenuates associations between sequences located on the two opposite arms (Figure 3) of the same chromosome, consistent with the visually separate appearance of the p and q arms of human metacentric chromosomes (Dietzel et al., 1998). Even though associations in cis seem to dominate in most 3C studies, robust associations of some sequences in trans are also consistently seen in large-scale studies. Such sequences tend to be in chromosomal regions characterized by high local gene density, high transcriptional activity, and high density of DNase I-hypersensitive sites (DHS) (Simonis et al., 2006; Lieberman-Aiden et al., 2009; Hou et al., 2012; Kalhor et al., 2012; Sexton et al., 2012).
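As a brief aside on the resolution argument made earlier in this section, the quadratic relationship between Hi-C resolution and required sequencing depth follows from counting bin pairs: an all-versus-all map with n bins has roughly n^2/2 pairs to populate. The sketch below works through that arithmetic. The genome size of 3 Gb is a round-number assumption for illustration, and the function names are hypothetical.

```python
def bin_pairs(genome_size_bp, bin_size_bp):
    """Number of distinct bin pairs (including self-pairs) in an all-versus-all map."""
    n_bins = -(-genome_size_bp // bin_size_bp)  # ceiling division
    return n_bins * (n_bins + 1) // 2


def depth_factor(resolution_gain):
    """Fold increase in read pairs needed to keep the same average coverage per bin pair."""
    return resolution_gain ** 2


GENOME = 3_000_000_000  # assumed round-number genome size, in bp

coarse = bin_pairs(GENOME, 1_000_000)  # 1 Mb bins
fine = bin_pairs(GENOME, 100_000)      # 100 kb bins

print(coarse, fine, fine / coarse)     # the ratio is close to 100
print(depth_factor(10))                # 100
```

The ratio is slightly below 100 only because self-pairs along the diagonal grow linearly rather than quadratically. The calculation also assumes uniform coverage, whereas real contact maps are dominated by short-range cis contacts.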
The loci that have the highest probability of making crosslinkable interactions in trans are those that also visibly loop outside of the core of the visible chromosome territory to the greatest extent (Kalhor et al., 2012). The infiltration of a looped-out activated genomic region (the human major histocompatibility complex) into the territories of other chromosomes has been directly visualized by high-resolution FISH (Branco and Pombo, 2006). The whole-scale extent to which gene-dense chromosomal regions decorate the outside of their own chromosome territories, beyond the limits of the core territory detected by FISH with traditional chromosome paints and toward the nuclear interior (Figure 2A), has now been revealed using custom FISH probes composed of high-complexity oligonucleotide pools targeted to the exonic regions of an entire chromosome (Boyle et al., 2011). This calls for a reconsideration of what is understood by the term "chromosome territory" to encapsulate the core condensed part of the territory (that is visible with standard chromosome paints) and the surrounding territory "corona" that is only detectable by FISH with either locus-specific probes or with probes targeted at gene-dense chromatin. In this context, "looping-out" from chromosome territories is actually "looping-in" to the corona of one's own territory and perhaps also "looping-in" to the corona of neighboring territories. The ability of a particular locus to locate outside of its own chromosome territory core, and so have an increased probability of intermingling with sequences from other chromosomes, is not just dependent on that locus's own transcriptional activity or chromatin state but rather is influenced by its local linear chromosome context. This is best exemplified by the different behaviors of the α-globin locus in mouse or human erythroid cells. In human primary erythroid cells, the α-globin gene cluster is within a large decondensed chromatin domain that often lies outside of its own chromosome territory core (Brown et al., 2006) and in spatial proximity to other active erythroid gene loci located on other chromosomes (Brown et al., 2008).
Due to a break in conserved synteny, the mouse α-globin locus is embedded in a genomic context different from that of its human ortholog; its local chromatin environment is more condensed,

Figure 3. Long-Range Interactions among Polycomb Domains in Drosophila


(A) Interaction profile of the Polycomb target gene Ant-B (position indicated by red triangle) as determined by 4C. Data are shown as a domainogram, which visualizes the statistical significance (purple: moderately significant; red: highly significant) of the detected interactions for each chromosomal position. Vertical axis represents the resolution at which the significance was calculated (see Tolhuis et al., 2011 for details). Note that the interactions mostly overlap with Polycomb domains (blue boxes) yet are restricted to one chromosome arm. (B) Effect of a chromosomal inversion that joins the two red segments and the two blue segments. Data are plotted in the same order as in (A) for easy comparison. Note that Ant-B now interacts with several loci that were previously on the other chromosome arm and no longer with loci that are moved toward the other arm. (C) Cartoon interpretation of (A), illustrating the stochastic contacts of Ant-B (red triangle) with various Polycomb domains on the same chromosome arm.

and it makes infrequent associations in trans with other highly expressed genes in murine erythroid cells. Strikingly, when the endogenous mouse α-globin locus was replaced with 120 kb of the human sequence, the nuclear organization properties of the humanized α-globin locus in mouse erythroid cells (i.e., few trans-associations and a localization within the chromosome territory core) were those characteristic of mouse and not human α-globin (Brown et al., 2008). Thus, even for a 120 kb locus, the broader genomic context matters for its spatial organization.

Clustering of Active Regions

Systematic FISH analysis across a multimegabase region of the mouse genome demonstrated that within a single chromosome, multiple gene-rich segments within the linear genome sequence have a tendency to cluster together in the nuclear space (Shopland et al., 2006). This clustering was not seen for the intervening gene-poor domains. 4C and Hi-C techniques have confirmed this tendency for active gene-dense domains to be able to associate with each other on a global scale (Simonis et al., 2006; Lieberman-Aiden et al., 2009; Hakim et al., 2011; Splinter et al., 2011; Yaffe and Tanay, 2011; Hou et al., 2012; Kalhor et al., 2012; Sexton et al., 2012; Zhang et al., 2012). The majority of these associations are intrachromosomal, but some interchromosomal associations are also captured. This has been confirmed by FISH: spatial proximity of loci associating in cis is generally seen in a much higher proportion of nuclei than for loci involved in 3C interchromosomal associations. Although these associations are occurring between active genomic regions, they are not dependent on ongoing transcription (Palstra et al., 2008). It may be that some other chromatin or functional feature of these regions is responsible, or that transcription has a role in the establishment but not the maintenance of chromosomal interaction networks. More detailed analyses of genome-wide chromatin profiles have identified DHS as the most prominent chromatin feature enriched in the active domains with a propensity to form long-range and interchromosomal 3C associations with each other (Hakim et al., 2011; Yaffe and Tanay, 2011). Whether it is the transcription factors and chromatin-binding proteins responsible for generating the DHS or a feature of the underlying DNA topology itself that is critical for the high frequency of formaldehyde crosslinked associations captured between these domains remains to be determined. So what is the functional significance of the clustering of active domains in the nucleus? At one extreme, it could be a manifestation of gene relocalization to specific transcription factories that are specialized in particular transcriptional pathways and driven by discrete transcription factors (Sutherland and Bickmore, 2009; Schoenfelder et al., 2010). However, 4C indicates that associations between active domains are rather similar between tissue types and that the associations captured by the active β-globin locus in erythroid cells are to other generally transcriptionally active regions, rather than to regions with erythroid- or otherwise tissue-specific expression (Simonis et al., 2006). Moreover, the transcriptional changes that are rapidly induced in cells in response to ligand-activated glucocorticoid receptor (GR) occur without any detectable dramatic changes in nuclear organization (Hakim et al., 2011). One possibility is that the clustering of active genes is a reflection of their congregation around splicing factor-enriched nuclear speckles (Brown et al., 2006).
Consistent with the view of transcriptional responses playing out against the background of a pre-established generic spatial organization, recent Hi-C analysis has suggested that the associations between active genomic regions are largely indiscriminate, i.e., there are no preferred pairs of associated domains, beyond those that are due to sharing of the same chromosome territory or due to the known spatial clustering of small gene-rich chromosomes to the center of the nucleus as a consequence of radial nuclear organization (Boyle et al., 2001; Cremer et al., 2001; Kalhor et al., 2012) (Figures 2B and 2C). This stochastic self-association of very active genomic regions may in turn contribute toward radial nuclear organization. The functional consequences of spatial encounters between loci from different chromosomes are revealed by the behavior of a human β-globin enhancer (the LCR) integrated into an ectopic site in the mouse genome. Even in places of erythropoiesis (fetal liver) where the LCR normally acts to enhance β-globin expression, this element did not change the spectrum of 4C associations made by the site of integration. However, it did increase the efficiency of pre-existing contacts, likely by enhancing looping out of the surrounding region from the chromosome territory core (Noordermeer et al., 2011a). There was no functional consequence of the integrated LCR on the expression of other mouse genes, except for those immediately in cis to the integrated LCR and one other gene located elsewhere. The endogenous mouse β-globin gene Hbb-bh1 is normally expressed only at earlier (embryonic) stages of development dependent on its own LCR, but it was upregulated in fetal livers of human LCR transgenic animals. Almost a third of the cells with detectable cytoplasmic Hbb-bh1 messenger RNA (mRNA) showed spatial colocalization of Hbb-bh1 with the ectopic human LCR in trans, far above the frequency seen in the cell population as a whole.
This elegant experiment demonstrates that there can be functional consequences on gene expression for colocalization in trans, but that due to the stochastic nature of these interactions and the constraint placed on them by their surrounding genomic context, they are unlikely to have a deterministic role in pathways of developmental gene regulation. However, they could contribute to variation in gene expression levels between cells of a population that could then be acted upon and exploited, for example, by external signaling pathways or environmental cues.

Clustering of Inactive Regions

Whereas active regions tend to associate with other active regions, do inactive regions then preferentially keep company with other inactive regions, and for what purpose? Although FISH analysis along a mouse chromosome did not find clustering of inactive domains with each other in the same way that was seen for the active regions (Shopland et al., 2006), both 4C (Simonis et al., 2006) and Hi-C (Lieberman-Aiden et al., 2009; Sexton et al., 2012) studies revealed the preferential capture of inactive loci with other inactive regions of the genome. A refinement of the computational analysis of Hi-C data from mammalian cells unmasked further subtleties in this pattern, with clusters of inactive regions partitioning into those that are close to centromeres or located on relatively short chromosome arms and those that are more distally located on larger chromosomes (Yaffe and Tanay, 2011; Imakaev et al., 2012). This seems unlikely to reflect a fundamental difference in the nature of these inactive chromatin domains themselves but rather some restriction that is placed on their ability to encounter each other, dictated by overall chromosome topology or centromere behavior. A Hi-C analysis of the Drosophila embryo was also able to detect the known spatial clustering of telomeres and of centromeres with each other and with the heterochromatic 4th chromosome (Sexton et al., 2012). This is consistent with the known ability of heterochromatin clustering to drive changes in nuclear organization (Csink and Henikoff, 1996) and indicates that H3K9-methylated genomic regions have a tendency to self-associate. Inactive regions are more constrained from associating over long genomic distances than are active regions, their Hi-C contacts being restricted to regions from the same chromosome and, moreover, the same chromosome arm. This probably reflects the fact that inactive regions have less freedom of motion in the nucleus than active domains and are restricted to life within their own core chromosome territories and at the NL (Kalhor et al., 2012). This has been directly visualized for one particular example. Hox loci are maintained in a silent and compact chromatin state in mammalian ESCs by the Polycomb PRC2 and PRC1 complexes (Eskeland et al., 2010), and they are entirely located within their host chromosome territories. Upon their transcriptional activation, Hox loci acquire a greater freedom of movement in the nucleus, and the active alleles can then be seen to adopt positions either inside or outside of their chromosome territory cores (Morey et al., 2009). This coincides with an enhanced ability of Hox loci to be captured together with sequences from other chromosomes in 3C-type experiments (Würtele and Chartrand, 2006). Genome-wide Hi-C studies do not reveal any spatial segregation of different categories of inactive chromatin (e.g., H3K9me3 versus H3K27me3). However, in Drosophila, long-range 4C contacts and spatial colocalization of silent Polycomb targets, including Hox loci, have been demonstrated in embryos and larvae (Bantignies et al., 2011; Tolhuis et al., 2011) (Figure 3A).
Silent, non-Polycomb target loci do not cluster with the Polycomb sites, indicating that this association is not simply driven by lack of transcriptional activity, and indeed associations of Polycomb target loci were shown to be dependent on PcG proteins themselves. This indicates that the spatial associations of some inactive regions are directly linked to their particular mechanism of epigenetic silencing. The phenotypic consequences of spatial associations between Polycomb targets are less clear. Some altered gene expression, and a modest phenotypic consequence, could be measured when associations were perturbed in one study (Bantignies et al., 2011). However, major changes in spatial associations of PcG targets, caused by a chromosomal inversion (Figure 3B), were not accompanied by any detectable alteration of gene expression or any gross phenotypic consequence in flies homozygous for the inversion (Tolhuis et al., 2011).

Topologically Associating Domains

Recent Hi-C and 5C studies in fly and mammalian cells have yielded data sets of unprecedented resolution and coverage (Dixon et al., 2012; Hou et al., 2012; Nora et al., 2012; Sexton et al., 2012). These studies indicate that crosslinked associations are enriched locally within discrete domains that are about 100 kb in size in Drosophila and an average of 900 kb in mammalian cells (Figures 1A and 4A). Further increases in sequencing depth may lead to refinements of these estimates. In mammalian cells, about 2,000 of these domains collectively tile most of the genome (Dixon et al., 2012). High-resolution FISH was used to

Figure 4. Topologically Associating Domains


(A) Matrix plot showing all pairwise interaction frequencies captured by 5C (color scale white-blue-black) among loci in an approximately 4.5 Mb region on the X chromosome in mESCs. TADs are the large, discrete blocks within which the pairwise contacts are relatively frequent. Gray area corresponds to a repetitive region that could not be probed. (B) H3K27me3 distribution (Marks et al., 2009) and Lamin B1 interactions (Peric-Hupkes et al., 2010) in the same region. Note the partial overlap with TADs. Images in (A) and (B) are courtesy of Elphège Nora (Institut Curie).

show that two loci within a single domain are visibly closer together in the nucleus than loci separated by the same genomic distance but located in adjacent domains. Additionally, hybridization signals from complex probe pools entirely located within one domain intermingle with each other to a greater extent than probe pools that span across domain boundaries (Nora et al., 2012). Hence these sub-megabase-sized self-associating domains have been called topologically associating domains (TADs), or simply topological domains. These domains show a remarkable degree of alignment to the distribution of some active and repressive (H3K9me3 and H3K27me3) histone modifications along the genome and also to LADs (Figure 4B). However, the stability of these local self-interacting domains in cell types with very different patterns of gene expression and epigenetic modifications (e.g., ESCs, adult fibroblasts, and brain) suggests that it is not transcription or histone modifications that dictate the domain structure but rather some inherent property linked to the underlying genome sequence, for example, binding sites for ubiquitous sequence-specific DNA-binding factors. That histone modifications act downstream of topological domain structure is most graphically illustrated by persistence of the domains in ESCs that lack G9a or Eed, the histone-modifying activities responsible for H3K9me2 and H3K27me3, respectively (Nora et al., 2012). Some differences in Hi-C interactions are seen between cell types, and these correspond to differentially regulated genes. Interestingly, these facultative focal interactions tend to occur within the confines of the stable domains. This suggests that Hi-C- and 5C-defined domains may reflect the normal sphere of influence over which long-range gene regulation can occur. Similarly, expression changes during differentiation of X-linked genes within the same 5C-defined domain are more correlated than those of genes located in separate domains or for random gene sets. Moreover, genetic deletion of a specific topological boundary on the mouse X resulted in transcriptional misregulation and the acquisition of new ectopic long-range 5C contacts (Nora et al., 2012). To define what might constitute the boundaries of TADs, their positions have been aligned to maps of insulator proteins and other epigenomic features. In Drosophila, domain borders were enriched in DHS and particularly in binding sites of the insulator component CP190 and of Chromator, a known regulator of chromosome structure (Sexton et al., 2012). In mammalian cells, domain boundaries were especially enriched in the promoters of housekeeping genes and binding sites for CTCF. However, only a minority of CTCF sites are associated with TAD boundaries.
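The notion that contacts are enriched within discrete blocks and depleted across their borders can be illustrated with a toy calculation. The sketch below is a generic insulation-style score, not the boundary-calling procedure of the cited studies: for each potential boundary it averages the contacts that span it within a small window and reports positions where that average dips to a local minimum. The contact matrix is synthetic, and all function names and parameter values are hypothetical.

```python
def toy_contact_matrix(domain_sizes, within=10, between=1):
    """Build a symmetric block matrix: frequent contacts inside domains, rare across."""
    labels = [d for d, size in enumerate(domain_sizes) for _ in range(size)]
    return [[within if a == b else between for b in labels] for a in labels]


def insulation_scores(matrix, window):
    """Mean contact frequency spanning each position b.

    Position b denotes the border between bin b-1 and bin b. The score averages
    contacts between the `window` bins upstream and the `window` bins downstream.
    """
    n = len(matrix)
    scores = {}
    for b in range(window, n - window + 1):
        total = sum(matrix[i][j]
                    for i in range(b - window, b)
                    for j in range(b, b + window))
        scores[b] = total / (window * window)
    return scores


def call_boundaries(scores):
    """Positions whose score is strictly lower than both neighboring positions."""
    return [b for b in sorted(scores)
            if b - 1 in scores and b + 1 in scores
            and scores[b] < scores[b - 1] and scores[b] < scores[b + 1]]


matrix = toy_contact_matrix([5, 5, 5])   # three domains of five bins each
scores = insulation_scores(matrix, window=2)
boundaries = call_boundaries(scores)
print(boundaries)                        # [5, 10]
```

Real contact maps are noisy and decay with genomic distance, so practical boundary calling requires normalization and smoothing that this sketch omits.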
Other genomic elements enriched at mammalian TAD boundaries include transfer RNA (tRNA) genes and interspersed repeats of the SINE family, both of which are capable of conferring insulator activity (Lunyak et al., 2007; Raab et al., 2012). One feature that may unify these various elements is the presence of nucleosome-free regions and disrupted chromatin fiber structure. Together with the binding of specific protein factors, this might result in a rigid chromatin structure that constrains regions on either side from intermingling. That elements located between TADs are important for restraining associations between adjacent domains is demonstrated by the consequence of a deletion that removes one of the TAD boundaries on the mouse X chromosome (Nora et al., 2012). Loci from the two flanking TADs gained interactions with each other, and the structure of one of the original TADs was reconfigured as a result of this deletion. However, the fact that the two TADs flanking the deletion did not completely merge into one also indicates that TADs are governed by factors in addition to domain boundaries.

Chromatin Compaction and Chromatin Domains

It has long been appreciated that levels of chromatin compaction vary across the genome. The appearance of puffs on Drosophila polytene chromosomes is a graphic manifestation of the chromatin decompaction of active genomic regions, as is the bloated appearance of the hyperactive X chromosome in male flies. In mammalian cells, FISH has also revealed differences in compaction between different regions of the genome. For example, hybridization signal from a gene-rich chromosome territory occupies a larger proportion of the nucleus than does the signal from an equivalently (in Mb) sized gene-poor chromosome (Croft et al., 1999), and different degrees of compaction have been inferred from the relationships of interprobe nuclear versus genomic distances at G-band and R-band regions of the genome (Yokota et al., 1997). The first genome-wide attempt to document chromatin compaction in mammalian cells used sucrose-gradient sedimentation to assay the frictional properties of long chromatin fragments (Gilbert et al., 2004). Slowly sedimenting fibers (high frictional coefficient) were inferred to be in a more open structure than fast sedimenting fibers of the same length, and this was confirmed by FISH. Domains of open chromatin fiber structure correspond to the most gene-dense active R-band regions of the genome. Other studies have confirmed a more compact and spherical chromatin structure of gene-poor regions compared to gene-dense domains, and this structure is independent of transcriptional activity (Goetze et al., 2007). In the same way that interphase distances measured by FISH scale with genomic distance, so all 3C-type data decay with increasing genomic distance. Consistent with the idea of a less compact chromatin structure at active genomic regions, the calculated scaling factor for Hi-C data from active regions of the Drosophila genome is higher than that for repressed domains, and this was suggested to also reflect a lower level of chromatin compaction at active domains (Sexton et al., 2012). There will be very many factors involved in modulating higher-order chromatin compaction, but the PRCs are one specific determinant known to be important for maintaining inactive domains in a compact state. In mESCs, the PRC1 complex was shown by FISH to be essential for maintaining a compact chromatin state at silent Hox loci. The PRC2 complex and the associated H3K27me3 are insufficient to maintain chromatin compaction; PRC1 was required (Eskeland et al., 2010).
These observations are consistent with the very compact appearance of Hox loci in polytene chromosomes from anterior parts of Drosophila larvae where Hox genes are repressed and bound by Polycomb (Marchetti et al., 2003). This view of Polycomb-repressed regions as compact chromatin domains is consistent with some (Ferraiuolo et al., 2010; Noordermeer et al., 2011b), but not all (Wang et al., 2011) interpretations of 3C-type studies. Moreover, the compact structure of active Hox loci inferred from 4C and 5C analysis of human fibroblasts is at odds with the decompacted appearance of active Hox loci in mammalian cells and embryos as visualized by FISH (Chambeyron et al., 2005; Morey et al., 2007), and with the puffed appearance of active Hox loci from the posterior of Drosophila larvae (Marchetti et al., 2003). More work needs to be done to determine to what extent 3C profiles across a locus can be translated into 3D chromatin conformation (Dostie and Bickmore, 2012).

Computational Models of Genome Topology

Interpreting the rich data from large-scale 3C-type studies in terms of chromosome folding requires advanced computer modeling. Methods for this are still being developed and come in two broad classes.

The first class of methods begins with theoretical analysis or in silico modeling of idealized polymers in a confined space as a model of chromosomes in a nucleus. By altering specific variables such as polymer stiffness, repulsion or stickiness of fibers, the presence of looping interactions, etc., one can then calculate defining parameters that describe the spatial folding of chromosomes, such as the overall relationship between the linear distance (in kb) of two loci and their contact frequency. These theoretical parameters are then matched against those determined from 5C or Hi-C data sets as well as FISH distance measurements, and the model yielding the best match is considered to be the most likely model to describe chromosome architecture. Taking this approach, Lieberman-Aiden et al. (2009) concluded from their Hi-C data that human interphase chromosomes may best be described by a model of a fractal globule, a polymer model in which one region is topologically constrained from passing across and entangling with another region and in which the polymer crumples into globules on all scales (Mirny, 2011). However, this model is inconsistent with FISH data that show that the linear relationship of physical distance (mean-square interphase distance) to genomic separation plateaus at distances > 1–2 Mb (Yokota et al., 1995; Gilbert et al., 2004; Mateos-Langerak et al., 2009). FISH data seem more compatible with ~1 Mb equilibrium globule models in which the chromatin fiber has a random-walk configuration but is confined within a defined volume. The nature of those boundary walls is not known, but they may be compatible with the boundaries between TADs identified by high-resolution Hi-C and 5C (Dixon et al., 2012; Nora et al., 2012). Another testable feature that is fundamentally different between the fractal and equilibrium globule models is the entanglement of the polymer chain in the latter.
However, the action of topoisomerase II should be sufficient to deal with this, and indeed, simulations have suggested that topoII activity may be sufficient to rapidly drive a fractal globule into an equilibrium globule state during interphase (Mirny, 2011). A recent strings and binders switch model combines features of a random-walk polymer model with the effects of interactions mediated by diffusible factors (e.g., proteins) bound to the chromatin (Barbieri et al., 2012). This model nicely reproduces the plateauing of mean-squared interphase distances at larger genomic separations, suggests that high concentrations of bound factors can collapse the chromatin into a compact state, and shows how interactions between different types of bound factors can reproduce the spatial segregation of genomic domains with different characteristics (e.g., active versus inactive).

The second class of computational methods aims to reconstruct the actual trajectory of a chromosomal fiber inside the nucleus, based on 5C or Hi-C data. Here, the major challenge is that chromosome folding is stochastic and variable from cell to cell. Appropriate modeling is hence done in a probabilistic manner that yields ensembles of possible trajectories, rather than a single average path. One such study adapted a method for solving protein structures from NMR spectroscopy data to model the most likely folding trajectories of the human 0.5 Mb α-globin locus and a bacterial chromosome (Baù et al., 2011; Umbarger et al., 2011). Another study modeled the coarse-grained organization and positions of entire chromosomes in thousands of cell nuclei based on Hi-C data (Kalhor et al., 2012). This faithfully reproduced the statistical clustering of gene-rich chromosomes in the center of the nucleus and the looping out of highly transcribed regions from the bulk of their chromosome territories, allowing contact between chromosomes (Figure 2). Further development of such computational models will be indispensable to interpret the large amounts of 4C, 5C, and Hi-C data sets that are to be expected.

Challenges and Future Directions

Genome-wide methodologies provide an unprecedented opportunity to describe both the linear and 3D compartmentalization of genomes. However, presently available mapping techniques require thousands to millions of cells and thus only provide population averages. Most Hi-C and microscopy evidence indicates that specific long-range contacts occur in a small fraction of cells at any given moment (Simonis et al., 2006; Lieberman-Aiden et al., 2009; Schoenfelder et al., 2010). Accordingly, our understanding of the true nature of chromatin domains is probably blurred by population-averaged data, and cartoon models as in Figure 1B should be viewed only as basic working models. For example, is a given chromatin domain occupied by its cognate proteins in every cell of a population, or only in a subset of cells? And in any single cell, is a chromatin domain covered in its entirety by proteins, or only in part? An important future challenge will be to devise strategies that can generate genome-wide data sets from single cells and so directly capture the stochastic behavior of chromosomes. Also lagging behind are experimental approaches that can efficiently manipulate linear and 3D domains. These types of interventionist approaches are key to determining the functional significance of genome organization or whether the structures are just reflective of genome functions.
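The population-averaging caveat raised above can be made concrete with a toy calculation. The sketch below is purely illustrative: the population size, the 10% subset, and the per-cell contact values are hypothetical numbers chosen for the example, not measurements from any of the cited studies. It shows two invented cell populations that yield the same averaged contact signal even though the contact is fully formed in a subset of cells in one case and in no cell in the other.

```python
# Hypothetical illustration: two cell populations with the same
# population-averaged contact signal but different single-cell behavior.
# All numbers below are assumptions made for this example.

def population_average(cells):
    """Mean contact signal over a list of per-cell values (0.0 to 1.0)."""
    return sum(cells) / len(cells)

n_cells = 1000  # assumed population size

# Scenario A: the contact is fully formed in 10% of cells, absent elsewhere.
subset_cells = [1.0] * 100 + [0.0] * 900

# Scenario B: every cell carries the same weak, partial contact.
uniform_cells = [0.1] * n_cells

avg_subset = population_average(subset_cells)
avg_uniform = population_average(uniform_cells)

# Fraction of cells in which the contact is fully formed.
frac_full_subset = sum(1 for c in subset_cells if c == 1.0) / n_cells
frac_full_uniform = sum(1 for c in uniform_cells if c == 1.0) / n_cells

print(f"average signal, subset scenario:  {avg_subset:.3f}")
print(f"average signal, uniform scenario: {avg_uniform:.3f}")
print(f"cells with full contact: {frac_full_subset:.2f} vs {frac_full_uniform:.2f}")
```

An averaged map alone cannot separate these two scenarios, which is why single-cell data sets would be needed to answer the questions posed above.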
The overall similarity of Hi-C maps generated from mammalian cell populations as diverse as rapidly dividing pluripotent ESCs, terminally differentiated cells (fibroblasts and lymphoblastoid cell lines), and nondividing pro-B cells (Lieberman-Aiden et al., 2009; Dixon et al., 2012; Zhang et al., 2012) suggests that the overall spatial organization of the mammalian genome is a fundamental state upon which genome function is then largely played out. One well-established consequence of 3D organization is its influence on the spectrum of chromosomal translocations that occur as a consequence of double-strand break repair by nonhomologous end-joining (Zhang et al., 2012). Conversely, translocations perturb the normal spatial context of the participating chromosomes (Tolhuis et al., 2011) and in some instances do appear to impact on gene expression from these chromosomes (Harewood et al., 2010). Using mouse genetics to tailor-make specific translocations might be a productive way to better explore the functional consequences of some aspects of 3D organization. Similarly, deleting specific boundaries between chromatin domains and binding sites for proteins thought to be important determinants of such domains provides a way to explore the functional relevance of linear and 3D compartments (Nora et al., 2012). New methods for the efficient editing of the genome (e.g., Wood et al., 2011) will facilitate such approaches. Artificial DNA-binding proteins can also be harnessed to create new topological structures and study the functional consequences (Deng et al., 2012).

Complementing these approaches is the integration of reporter constructs into different types of chromatin domains, which offers a direct readout of the effects of distinct local environments (Gierman et al., 2007). Because the interplay between genomic elements is expected to be complex, such strategies need to be scaled up in order to explore the wide range of combinatorial possibilities and to obtain unbiased and broadly interpretable results. Despite these challenges, it appears that the pieces of the chromosome puzzle are now coming together, and it will be exciting to dissect the underlying molecular mechanisms and elucidate how chromosome-domain organization contributes to gene regulation and other nuclear functions.
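The model-matching step described under Computational Models of Genome Topology, in which the relationship between genomic separation and contact frequency is compared with polymer predictions, can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the procedure used in any cited study: the function names are hypothetical, the contact values are synthetic, and the reference exponents (about -1 for a fractal globule and about -3/2 for an equilibrium globule) are the values commonly quoted for these idealized polymers rather than figures taken from the text above.

```python
import math

# Commonly quoted contact-probability exponents for idealized polymer states.
# These are assumptions for the sketch; they are not stated in the text above.
REFERENCE_EXPONENTS = {"fractal globule": -1.0, "equilibrium globule": -1.5}

def fit_power_law_exponent(separations, contacts):
    """Least-squares slope of log(contact) versus log(separation)."""
    xs = [math.log(s) for s in separations]
    ys = [math.log(c) for c in contacts]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return sxy / sxx

def closest_model(exponent):
    """Name of the reference model whose exponent is nearest the fitted one."""
    return min(REFERENCE_EXPONENTS,
               key=lambda name: abs(REFERENCE_EXPONENTS[name] - exponent))

# Synthetic contact frequencies that decay as 1/s (hypothetical data).
separations_kb = [500, 1000, 2000, 4000, 7000]
synthetic_contacts = [1000.0 / s for s in separations_kb]

exponent = fit_power_law_exponent(separations_kb, synthetic_contacts)
best = closest_model(exponent)
print(f"fitted exponent: {exponent:.2f}, closest idealized model: {best}")
```

A real analysis would also need to normalize the contact map and restrict the fit to a defined range of separations, and, as the FISH comparisons discussed above show, agreement on one scaling relationship does not by itself establish a folding model.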
ACKNOWLEDGMENTS

We thank members of the van Steensel lab; Duncan Sproul, Nick Gilbert, and Colin Semple of IGMM for critical reading of the manuscript; an anonymous reviewer for many thoughtful suggestions; and Frank Alber, Elphège Nora, and Tyrone Ryba for help with figures. We are supported by ERC Advanced grant 293662 and NWO-VICI (B.v.S.) and by the MRC and ERC Advanced grant 249956 (W.A.B.).

REFERENCES

Arnold, P., Schöler, A., Pachkov, M., Balwierz, P.J., Jørgensen, H., Stadler, M.B., van Nimwegen, E., and Schübeler, D. (2013). Modeling of epigenome dynamics identifies transcription factors that mediate Polycomb targeting. Genome Res. 23, 60–73.
Bantignies, F., Roure, V., Comet, I., Leblanc, B., Schuettengruber, B., Bonnet, J., Tixier, V., Mas, A., and Cavalli, G. (2011). Polycomb-dependent regulatory contacts between distant Hox loci in Drosophila. Cell 144, 214–226.
Barbieri, M., Chotalia, M., Fraser, J., Lavitas, L.M., Dostie, J., Pombo, A., and Nicodemi, M. (2012). Complexity of chromatin folding is captured by the strings and binders switch model. Proc. Natl. Acad. Sci. USA 109, 16173–16178.
Bartkuhn, M., Straub, T., Herold, M., Herrmann, M., Rathke, C., Saumweber, H., Gilfillan, G.D., Becker, P.B., and Renkawitz, R. (2009). Active promoters and insulators are marked by the centrosomal protein 190. EMBO J. 28, 877–888.
Baù, D., Sanyal, A., Lajoie, B.R., Capriotti, E., Byron, M., Lawrence, J.B., Dekker, J., and Marti-Renom, M.A. (2011). The three-dimensional folding of the α-globin gene domain reveals formation of chromatin globules. Nat. Struct. Mol. Biol. 18, 107–114.
Berman, B.P., Weisenberger, D.J., Aman, J.F., Hinoue, T., Ramjan, Z., Liu, Y., Noushmehr, H., Lange, C.P., van Dijk, C.M., Tollenaar, R.A., et al. (2012). Regions of focal DNA hypermethylation and long-range hypomethylation in colorectal cancer coincide with nuclear lamina-associated domains. Nat. Genet. 44, 40–46.
Bernstein, B.E., Mikkelsen, T.S., Xie, X., Kamal, M., Huebert, D.J., Cuff, J., Fry, B., Meissner, A., Wernig, M., Plath, K., et al. (2006). A bivalent chromatin structure marks key developmental genes in embryonic stem cells. Cell 125, 315–326.
Boyle, S., Gilchrist, S., Bridger, J.M., Mahy, N.L., Ellis, J.A., and Bickmore, W.A. (2001). The spatial organization of human chromosomes within the nuclei of normal and emerin-mutant cells. Hum. Mol. Genet. 10, 211–219.
Boyle, S., Rodesch, M.J., Halvensleben, H.A., Jeddeloh, J.A., and Bickmore, W.A. (2011). Fluorescence in situ hybridization with high-complexity repeat-free oligonucleotide probes generated by massively parallel synthesis. Chromosome Res. 19, 901–909.
Branco, M.R., and Pombo, A. (2006). Intermingling of chromosome territories in interphase suggests role in translocations and transcription-dependent associations. PLoS Biol. 4, e138.

Brown, J.M., Leach, J., Reittie, J.E., Atzberger, A., Lee-Prudhoe, J., Wood, W.G., Higgs, D.R., Iborra, F.J., and Buckle, V.J. (2006). Coregulated human globin genes are frequently in spatial proximity when active. J. Cell Biol. 172, 177–187.
Brown, J.M., Green, J., das Neves, R.P., Wallace, H.A., Smith, A.J., Hughes, J., Gray, N., Taylor, S., Wood, W.G., Higgs, D.R., et al. (2008). Association between active genes occurs at nuclear speckles and is modulated by chromatin environment. J. Cell Biol. 182, 1083–1097.
Capelson, M., Liang, Y., Schulte, R., Mair, W., Wagner, U., and Hetzer, M.W. (2010). Chromatin-bound nuclear pore components regulate gene expression in higher eukaryotes. Cell 140, 372–383.
Caron, H., van Schaik, B., van der Mee, M., Baas, F., Riggins, G., van Sluis, P., Hermus, M.C., van Asperen, R., Boon, K., Voûte, P.A., et al. (2001). The human transcriptome map: clustering of highly expressed genes in chromosomal domains. Science 291, 1289–1292.
Chambeyron, S., Da Silva, N.R., Lawson, K.A., and Bickmore, W.A. (2005). Nuclear re-organisation of the Hoxb complex during mouse embryonic development. Development 132, 2215–2223.
Costantini, M., Clay, O., Auletta, F., and Bernardi, G. (2006). An isochore map of human chromosomes. Genome Res. 16, 536–541.
Cremer, M., von Hase, J., Volm, T., Brero, A., Kreth, G., Walter, J., Fischer, C., Solovei, I., Cremer, C., and Cremer, T. (2001). Non-random radial higher-order chromatin arrangements in nuclei of diploid human cells. Chromosome Res. 9, 541–567.
Cremer, T., Cremer, M., Dietzel, S., Müller, S., Solovei, I., and Fakan, S. (2006). Chromosome territories - a functional nuclear landscape. Curr. Opin. Cell Biol. 18, 307–316.
Croft, J.A., Bridger, J.M., Boyle, S., Perry, P., Teague, P., and Bickmore, W.A. (1999). Differences in the localization and morphology of chromosomes in the human nucleus. J. Cell Biol. 145, 1119–1131.
Csink, A.K., and Henikoff, S. (1996).
Genetic modification of heterochromatic association and nuclear organization in Drosophila. Nature 381, 529–531.
Cuddapah, S., Jothi, R., Schones, D.E., Roh, T.Y., Cui, K., and Zhao, K. (2009). Global analysis of the insulator binding protein CTCF in chromatin barrier regions reveals demarcation of active and repressive domains. Genome Res. 19, 24–32.
de Wit, E., and de Laat, W. (2012). A decade of 3C technologies: insights into nuclear organization. Genes Dev. 26, 11–24.
Deng, W., Lee, J., Wang, H., Miller, J., Reik, A., Gregory, P.D., Dean, A., and Blobel, G.A. (2012). Controlling long-range genomic interactions at a native locus by targeted tethering of a looping factor. Cell 149, 1233–1244.
Dialynas, G., Speese, S., Budnik, V., Geyer, P.K., and Wallrath, L.L. (2010). The role of Drosophila Lamin C in muscle function and gene expression. Development 137, 3067–3077.
Dietzel, S., Jauch, A., Kienle, D., Qu, G., Holtgreve-Grez, H., Eils, R., Münkel, C., Bittner, M., Meltzer, P.S., Trent, J.M., and Cremer, T. (1998). Separate and variably shaped chromosome arm domains are disclosed by chromosome arm painting in human cell nuclei. Chromosome Res. 6, 25–33.
Dixon, J.R., Selvaraj, S., Yue, F., Kim, A., Li, Y., Shen, Y., Hu, M., Liu, J.S., and Ren, B. (2012). Topological domains in mammalian genomes identified by analysis of chromatin interactions. Nature 485, 376–380.
Dostie, J., and Bickmore, W.A. (2012). Chromosome organization in the nucleus - charting new territory across the Hi-Cs. Curr. Opin. Genet. Dev. 22, 125–131.
Dostie, J., Richmond, T.A., Arnaout, R.A., Selzer, R.R., Lee, W.L., Honan, T.A., Rubio, E.D., Krumm, A., Lamb, J., Nusbaum, C., et al. (2006). Chromosome Conformation Capture Carbon Copy (5C): a massively parallel solution for mapping interactions between genomic elements. Genome Res. 16, 1299–1309.
Dunham, I., Kundaje, A., Aldred, S.F., Collins, P.J., Davis, C.A., Doyle, F., Epstein, C.B., Frietze, S., Harrow, J., Kaul, R., et al.; ENCODE Project Consortium.
(2012). An integrated encyclopedia of DNA elements in the human genome. Nature 489, 57–74.

Ernst, J., and Kellis, M. (2010). Discovery and characterization of chromatin states for systematic annotation of the human genome. Nat. Biotechnol. 28, 817–825.
Eskeland, R., Leeb, M., Grimes, G.R., Kress, C., Boyle, S., Sproul, D., Gilbert, N., Fan, Y., Skoultchi, A.I., Wutz, A., and Bickmore, W.A. (2010). Ring1B compacts chromatin structure and represses gene expression independent of histone ubiquitination. Mol. Cell 38, 452–464.
Ferraiuolo, M.A., Rousseau, M., Miyamoto, C., Shenker, S., Wang, X.Q., Nadler, M., Blanchette, M., and Dostie, J. (2010). The three-dimensional architecture of Hox cluster silencing. Nucleic Acids Res. 38, 7472–7484.
Filion, G.J., and van Steensel, B. (2010). Reassessing the abundance of H3K9me2 chromatin domains in embryonic stem cells. Nat. Genet. 42, 4, author reply 5–6.
Filion, G.J., van Bemmel, J.G., Braunschweig, U., Talhout, W., Kind, J., Ward, L.D., Brugman, W., de Castro, I.J., Kerkhoven, R.M., Bussemaker, H.J., and van Steensel, B. (2010). Systematic protein location mapping reveals five principal chromatin types in Drosophila cells. Cell 143, 212–224.
Finlan, L.E., Sproul, D., Thomson, I., Boyle, S., Kerr, E., Perry, P., Ylstra, B., Chubb, J.R., and Bickmore, W.A. (2008). Recruitment to the nuclear periphery can alter expression of genes in human cells. PLoS Genet. 4, e1000039.
Frietze, S., O'Geen, H., Blahnik, K.R., Jin, V.X., and Farnham, P.J. (2010). ZNF274 recruits the histone methyltransferase SETDB1 to the 3' ends of ZNF genes. PLoS ONE 5, e15082.
Gerstein, M.B., Lu, Z.J., Van Nostrand, E.L., Cheng, C., Arshinoff, B.I., Liu, T., Yip, K.Y., Robilotto, R., Rechtsteiner, A., Ikegami, K., et al.; modENCODE Consortium. (2010). Integrative analysis of the Caenorhabditis elegans genome by the modENCODE project. Science 330, 1775–1787.
Gierman, H.J., Indemans, M.H., Koster, J., Goetze, S., Seppen, J., Geerts, D., van Driel, R., and Versteeg, R. (2007). Domain-wide regulation of gene expression in the human genome. Genome Res.
17, 1286–1295.
Gilbert, N., Boyle, S., Fiegler, H., Woodfine, K., Carter, N.P., and Bickmore, W.A. (2004). Chromatin architecture of the human genome: gene-rich domains are enriched in open chromatin fibers. Cell 118, 555–566.
Goetze, S., Mateos-Langerak, J., Gierman, H.J., de Leeuw, W., Giromus, O., Indemans, M.H., Koster, J., Ondrej, V., Versteeg, R., and van Driel, R. (2007). The three-dimensional structure of human interphase chromosomes is related to the transcriptome map. Mol. Cell. Biol. 27, 4475–4487.
Groner, A.C., Meylan, S., Ciuffi, A., Zangger, N., Ambrosini, G., Dénervaud, N., Bucher, P., and Trono, D. (2010). KRAB-zinc finger proteins and KAP1 can mediate long-range transcriptional repression through heterochromatin spreading. PLoS Genet. 6, e1000869.
Guelen, L., Pagie, L., Brasset, E., Meuleman, W., Faza, M.B., Talhout, W., Eussen, B.H., de Klein, A., Wessels, L., de Laat, W., and van Steensel, B. (2008). Domain organization of human chromosomes revealed by mapping of nuclear lamina interactions. Nature 453, 948–951.
Hakim, O., Sung, M.H., Voss, T.C., Splinter, E., John, S., Sabo, P.J., Thurman, R.E., Stamatoyannopoulos, J.A., de Laat, W., and Hager, G.L. (2011). Diverse gene reprogramming events occur in the same spatial clusters of distal regulatory elements. Genome Res. 21, 697–706.
Handoko, L., Xu, H., Li, G., Ngan, C.Y., Chew, E., Schnapp, M., Lee, C.W., Ye, C., Ping, J.L., Mulawadi, F., et al. (2011). CTCF-mediated functional chromatin interactome in pluripotent cells. Nat. Genet. 43, 630–638.
Hansen, K.D., Timp, W., Bravo, H.C., Sabunciyan, S., Langmead, B., McDonald, O.G., Wen, B., Wu, H., Liu, Y., Diep, D., et al. (2011). Increased methylation variation in epigenetic domains across cancer types. Nat. Genet. 43, 768–775.
Hansen, R.S., Thomas, S., Sandstrom, R., Canfield, T.K., Thurman, R.E., Weaver, M., Dorschner, M.O., Gartler, S.M., and Stamatoyannopoulos, J.A. (2010).
Sequencing newly replicated DNA reveals widespread plasticity in human replication timing. Proc. Natl. Acad. Sci. USA 107, 139–144.
Harewood, L., Schütz, F., Boyle, S., Perry, P., Delorenzi, M., Bickmore, W.A., and Reymond, A. (2010). The effect of translocation-induced nuclear reorganization on gene expression. Genome Res. 20, 554–564.


Hathaway, N.A., Bell, O., Hodges, C., Miller, E.L., Neel, D.S., and Crabtree, G.R. (2012). Dynamics and memory of heterochromatin in living cells. Cell 149, 1447–1460.
Hawkins, R.D., Hon, G.C., Lee, L.K., Ngo, Q., Lister, R., Pelizzola, M., Edsall, L.E., Kuan, S., Luu, Y., Klugman, S., et al. (2010). Distinct epigenomic landscapes of pluripotent and lineage-committed human cells. Cell Stem Cell 6, 479–491.
Hiratani, I., Ryba, T., Itoh, M., Yokochi, T., Schwaiger, M., Chang, C.W., Lyou, Y., Townes, T.M., Schübeler, D., and Gilbert, D.M. (2008). Global reorganization of replication domains during embryonic stem cell differentiation. PLoS Biol. 6, e245.
Hou, C., Li, L., Qin, Z.S., and Corces, V.G. (2012). Gene density, transcription, and insulators contribute to the partition of the Drosophila genome into physical domains. Mol. Cell 48, 471–484.
Imakaev, M., Fudenberg, G., McCord, R.P., Naumova, N., Goloborodko, A., Lajoie, B.R., Dekker, J., and Mirny, L.A. (2012). Iterative correction of Hi-C data reveals hallmarks of chromosome organization. Nat. Methods 9, 999–1003.
Jiang, H., Shukla, A., Wang, X., Chen, W.Y., Bernstein, B.E., and Roeder, R.G. (2011). Role for Dpy-30 in ES cell-fate specification by regulation of H3K4 methylation within bivalent domains. Cell 144, 513–525.
Kalhor, R., Tjong, H., Jayathilaka, N., Alber, F., and Chen, L. (2012). Genome architectures revealed by tethered chromosome conformation capture and population-based modeling. Nat. Biotechnol. 30, 90–98.
Kalverda, B., Pickersgill, H., Shloma, V.V., and Fornerod, M. (2010). Nucleoporins directly stimulate expression of developmental and cell-cycle genes inside the nucleoplasm. Cell 140, 360–371.
Kharchenko, P.V., Alekseyenko, A.A., Schwartz, Y.B., Minoda, A., Riddle, N.C., Ernst, J., Sabo, P.J., Larschan, E., Gorchakov, A.A., Gu, T., et al. (2011). Comprehensive analysis of the chromatin landscape in Drosophila melanogaster. Nature 471, 480–485.
Ku, M., Koche, R.P., Rheinbay, E., Mendenhall, E.M., Endoh, M., Mikkelsen, T.S., Presser, A., Nusbaum, C., Xie, X., Chi, A.S., et al. (2008). Genomewide analysis of PRC1 and PRC2 occupancy identifies two classes of bivalent domains. PLoS Genet. 4, e1000242.
Kumaran, R.I., and Spector, D.L. (2008). A genetic locus targeted to the nuclear periphery in living cells maintains its transcriptional competence. J. Cell Biol. 180, 51–65.
Kwon, S.H., and Workman, J.L. (2011). The changing faces of HP1: From heterochromatin formation and gene silencing to euchromatic gene expression: HP1 acts as a positive regulator of transcription. Bioessays 33, 280–289.
Lee, T.I., Jenner, R.G., Boyer, L.A., Guenther, M.G., Levine, S.S., Kumar, R.M., Chevalier, B., Johnstone, S.E., Cole, M.F., Isono, K., et al. (2006). Control of developmental regulators by Polycomb in human embryonic stem cells. Cell 125, 301–313.
Lieberman-Aiden, E., van Berkum, N.L., Williams, L., Imakaev, M., Ragoczy, T., Telling, A., Amit, I., Lajoie, B.R., Sabo, P.J., Dorschner, M.O., et al. (2009). Comprehensive mapping of long-range interactions reveals folding principles of the human genome. Science 326, 289–293.
Lienert, F., Mohn, F., Tiwari, V.K., Baubec, T., Roloff, T.C., Gaidatzis, D., Stadler, M.B., and Schübeler, D. (2011). Genomic prevalence of heterochromatic H3K9me2 and transcription do not discriminate pluripotent from terminally differentiated cells. PLoS Genet. 7, e1002090.
Lister, R., Pelizzola, M., Dowen, R.H., Hawkins, R.D., Hon, G., Tonti-Filippini, J., Nery, J.R., Lee, L., Ye, Z., Ngo, Q.M., et al. (2009). Human DNA methylomes at base resolution show widespread epigenomic differences. Nature 462, 315–322.
Lunyak, V.V., Prefontaine, G.G., Núñez, E., Cramer, T., Ju, B.G., Ohgi, K.A., Hutt, K., Roy, R., García-Díaz, A., Zhu, X., et al. (2007). Developmentally regulated activation of a SINE B2 repeat as a domain boundary in organogenesis. Science 317, 248–251.
Marchetti, M., Fanti, L., Berloco, M., and Pimpinelli, S. (2003). Differential expression of the Drosophila BX-C in polytene chromosomes in cells of larval

fat bodies: a cytological approach to identifying in vivo targets of the homeotic Ubx, Abd-A and Abd-B proteins. Development 130, 3683–3689.
Marks, H., Chow, J.C., Denissov, S., Françoijs, K.J., Brockdorff, N., Heard, E., and Stunnenberg, H.G. (2009). High-resolution analysis of epigenetic changes associated with X inactivation. Genome Res. 19, 1361–1373.
Mateos-Langerak, J., Bohn, M., de Leeuw, W., Giromus, O., Manders, E.M., Verschure, P.J., Indemans, M.H., Gierman, H.J., Heermann, D.W., van Driel, R., and Goetze, S. (2009). Spatially confined folding of chromatin in the interphase nucleus. Proc. Natl. Acad. Sci. USA 106, 3812–3817.
Mendenhall, E.M., Koche, R.P., Truong, T., Zhou, V.W., Issac, B., Chi, A.S., Ku, M., and Bernstein, B.E. (2010). GC-rich sequence elements recruit PRC2 in mammalian ES cells. PLoS Genet. 6, e1001244.
Meuleman, W., Peric-Hupkes, D., Kind, J., Beaudry, J.B., Pagie, L., Kellis, M., Reinders, M., Wessels, L., and van Steensel, B. (2012). Constitutive nuclear lamina-genome interactions are highly conserved and associated with A/T-rich sequence. Genome Res. 23, 270–280.
Mirny, L.A. (2011). The fractal globule as a model of chromatin architecture in the cell. Chromosome Res. 19, 37–51.
Montavon, T., Soshnikova, N., Mascrez, B., Joye, E., Thevenet, L., Splinter, E., de Laat, W., Spitz, F., and Duboule, D. (2011). A regulatory archipelago controls Hox genes transcription in digits. Cell 147, 1132–1145.
Morey, L., and Helin, K. (2010). Polycomb group protein-mediated repression of transcription. Trends Biochem. Sci. 35, 323–332.
Morey, C., Da Silva, N.R., Perry, P., and Bickmore, W.A. (2007). Nuclear reorganisation and chromatin decondensation are conserved, but distinct, mechanisms linked to Hox gene activation. Development 134, 909–919.
Morey, C., Kress, C., and Bickmore, W.A. (2009). Lack of bystander activation shows that localization exterior to chromosome territories is not sufficient to up-regulate gene expression. Genome Res. 19, 1184–1194.
Nègre, N., Brown, C.D., Shah, P.K., Kheradpour, P., Morrison, C.A., Henikoff, J.G., Feng, X., Ahmad, K., Russell, S., White, R.A., et al. (2010). A comprehensive map of insulator elements for the Drosophila genome. PLoS Genet. 6, e1000814.
Németh, A., Conesa, A., Santoyo-Lopez, J., Medina, I., Montaner, D., Péterfia, B., Solovei, I., Cremer, T., Dopazo, J., and Längst, G. (2010). Initial genomics of the human nucleolus. PLoS Genet. 6, e1000889.
Noordermeer, D., Branco, M.R., Splinter, E., Klous, P., van Ijcken, W., Swagemakers, S., Koutsourakis, M., van der Spek, P., Pombo, A., and de Laat, W. (2008). Transcription and chromatin organization of a housekeeping gene cluster containing an integrated beta-globin locus control region. PLoS Genet. 4, e1000016.
Noordermeer, D., de Wit, E., Klous, P., van de Werken, H., Simonis, M., Lopez-Jones, M., Eussen, B., de Klein, A., Singer, R.H., and de Laat, W. (2011a). Variegated gene expression caused by cell-specific long-range DNA interactions. Nat. Cell Biol. 13, 944–951.
Noordermeer, D., Leleu, M., Splinter, E., Rougemont, J., De Laat, W., and Duboule, D. (2011b). The dynamic architecture of Hox gene clusters. Science 334, 222–225.
Nora, E.P., Lajoie, B.R., Schulz, E.G., Giorgetti, L., Okamoto, I., Servant, N., Piolot, T., van Berkum, N.L., Meisig, J., Sedat, J., et al. (2012). Spatial partitioning of the regulatory landscape of the X-inactivation centre. Nature 485, 381–385.
Orlando, V., and Paro, R. (1993). Mapping Polycomb-repressed domains in the bithorax complex using in vivo formaldehyde cross-linked chromatin. Cell 75, 1187–1198.
Palstra, R.J., Simonis, M., Klous, P., Brasset, E., Eijkelkamp, B., and de Laat, W. (2008). Maintenance of long-range DNA interactions after inhibition of ongoing RNA polymerase II transcription. PLoS ONE 3, e1661.
Pauler, F.M., Sloane, M.A., Huang, R., Regha, K., Koerner, M.V., Tamir, I., Sommer, A., Aszodi, A., Jenuwein, T., and Barlow, D.P. (2009).
H3K27me3 forms BLOCs over silent genes and intergenic regions and specifies a histone banding pattern on a mouse autosomal chromosome. Genome Res. 19, 221–233.


Peric-Hupkes, D., Meuleman, W., Pagie, L., Bruggeman, S.W., Solovei, I., Brugman, W., Gräf, S., Flicek, P., Kerkhoven, R.M., van Lohuizen, M., et al. (2010). Molecular maps of the reorganization of genome-nuclear lamina interactions during differentiation. Mol. Cell 38, 603–613.
Pickersgill, H., Kalverda, B., de Wit, E., Talhout, W., Fornerod, M., and van Steensel, B. (2006). Characterization of the Drosophila melanogaster genome at the nuclear lamina. Nat. Genet. 38, 1005–1014.
Pope, B.D., Chandra, T., Buckley, Q., Hoare, M., Ryba, T., Wiseman, F.K., Kuta, A., Wilson, M.D., Odom, D.T., and Gilbert, D.M. (2012). Replication-timing boundaries facilitate cell-type and species-specific regulation of a rearranged human chromosome in mouse. Hum. Mol. Genet. 21, 4162–4170.
Prokocimer, M., Davidovich, M., Nissim-Rania, M., Wiesel-Motiuk, N., Bar, D.Z., Barkan, R., Meshorer, E., and Gruenbaum, Y. (2009). Nuclear lamins: key regulators of nuclear structure and activities. J. Cell. Mol. Med. 13, 1059–1085.
Raab, J.R., Chiu, J., Zhu, J., Katzman, S., Kurukuti, S., Wade, P.A., Haussler, D., and Kamakaka, R.T. (2012). Human tRNA genes function as chromatin insulators. EMBO J. 31, 330–350.
Ram, O., Goren, A., Amit, I., Shoresh, N., Yosef, N., Ernst, J., Kellis, M., Gymrek, M., Issner, R., Coyne, M., et al. (2011). Combinatorial patterning of chromatin regulators uncovered by genome-wide location analysis in human cells. Cell 147, 1628–1639.
Reddy, K.L., Zullo, J.M., Bertolino, E., and Singh, H. (2008). Transcriptional repression mediated by repositioning of genes to the nuclear lamina. Nature 452, 243–247.
Ryba, T., Hiratani, I., Lu, J., Itoh, M., Kulik, M., Zhang, J., Schulz, T.C., Robins, A.J., Dalton, S., and Gilbert, D.M. (2010). Evolutionarily conserved replication timing profiles predict long-range chromatin interactions and distinguish closely related cell types. Genome Res. 20, 761–770.
Schoenfelder, S., Sexton, T., Chakalova, L., Cope, N.F., Horton, A., Andrews, S., Kurukuti, S., Mitchell, J.A., Umlauf, D., Dimitrova, D.S., et al. (2010). Preferential associations between co-regulated genes reveal a transcriptional interactome in erythroid cells. Nat. Genet. 42, 5361. bSchwaiger, M., Stadler, M.B., Bell, O., Kohler, H., Oakeley, E.J., and Schu eler, D. (2009). Chromatin state marks cell-type- and gender-specic replication of the Drosophila genome. Genes Dev. 23, 589601. beler, D. Schwaiger, M., Kohler, H., Oakeley, E.J., Stadler, M.B., and Schu (2010). Heterochromatin protein 1 (HP1) modulates replication timing of the Drosophila genome. Genome Res. 20, 771780. Schwartz, Y.B., Kahn, T.G., Stenberg, P., Ohno, K., Bourgon, R., and Pirrotta, V. (2010). Alternative epigenetic chromatin states of polycomb target genes. PLoS Genet. 6, e1000805. Schwartz, Y.B., Linder-Basso, D., Kharchenko, P.V., Tolstorukov, M.Y., Kim, M., Li, H.B., Gorchakov, A.A., Minoda, A., Shanower, G., Alekseyenko, A.A., et al. (2012). Nature and function of insulator protein binding sites in the Drosophila genome. Genome Res. 22, 21882198. Sexton, T., Yaffe, E., Kenigsberg, E., Bantignies, F., Leblanc, B., Hoichman, M., Parrinello, H., Tanay, A., and Cavalli, G. (2012). Three-dimensional folding and functional organization principles of the Drosophila genome. Cell 148, 458472. Shopland, L.S., Lynch, C.R., Peterson, K.A., Thornton, K., Kepper, N., Hase, Jv., Stein, S., Vincent, S., Molloy, K.R., Kreth, G., et al. (2006). Folding and organization of a contiguous chromosome region according to the gene distribution pattern in primary genomic sequence. J. Cell Biol. 174, 2738. Simonis, M., Klous, P., Splinter, E., Moshkin, Y., Willemsen, R., de Wit, E., van Steensel, B., and de Laat, W. (2006). Nuclear organization of active and inactive chromatin domains uncovered by chromosome conformation capture-onchip (4C). Nat. Genet. 38, 13481354. 
Splinter, E., de Wit, E., Nora, E.P., Klous, P., van de Werken, H.J., Zhu, Y., Kaaij, L.J., van Ijcken, W., Gribnau, J., Heard, E., and de Laat, W. (2011). The inactive X chromosome adopts a unique three-dimensional conformation that is dependent on Xist RNA. Genes Dev. 25, 13711383.

Sutherland, H., and Bickmore, W.A. (2009). Transcription factories: gene expression in unions? Nat. Rev. Genet. 10, 457466. Thomson, I., Gilchrist, S., Bickmore, W.A., and Chubb, J.R. (2004). The radial positioning of chromatin is not inherited through mitosis but is established de novo in early G1. Curr. Biol. 14, 166172. Tolhuis, B., de Wit, E., Muijrers, I., Teunissen, H., Talhout, W., van Steensel, B., and van Lohuizen, M. (2006). Genome-wide proling of PRC1 and PRC2 Polycomb chromatin binding in Drosophila melanogaster. Nat. Genet. 38, 694699. Tolhuis, B., Blom, M., Kerkhoven, R.M., Pagie, L., Teunissen, H., Nieuwland, M., Simonis, M., de Laat, W., van Lohuizen, M., and van Steensel, B. (2011). Interactions among Polycomb domains are guided by chromosome architecture. PLoS Genet. 7, e1001343. lez-Aguilera, C., Sack, R., Gaidatzis, D., Kalck, V., Towbin, B.D., Gonza Meister, P., Askjaer, P., and Gasser, S.M. (2012). Step-wise methylation of histone H3K9 positions heterochromatin at the nuclear periphery. Cell 150, 934947. ` , D., Hong, S.H., Umbarger, M.A., Toro, E., Wright, M.A., Porreca, G.J., Bau Fero, M.J., Zhu, L.J., Marti-Renom, M.A., McAdams, H.H., et al. (2011). The three-dimensional architecture of a bacterial genome and its alteration by genetic perturbation. Mol. Cell 44, 252264. van Bemmel, J.G., Pagie, L., Braunschweig, U., Brugman, W., Meuleman, W., Kerkhoven, R.M., and van Steensel, B. (2010). The insulator protein SU(HW) ne-tunes nuclear lamina interactions of the Drosophila genome. PLoS ONE 5, e15013. van Koningsbruggen, S., Gierlinski, M., Schoeld, P., Martin, D., Barton, G.J., Ariyurek, Y., den Dunnen, J.T., and Lamond, A.I. (2010). High-resolution whole-genome sequencing reveals that specic chromatin domains from most human chromosomes associate with nucleoli. Mol. Biol. Cell 21, 3735 3748. van Steensel, B., Delrow, J., and Henikoff, S. (2001). Chromatin proling using targeted DNA adenine methyltransferase. Nat. Genet. 27, 304308. 
Vaquerizas, J.M., Suyama, R., Kind, J., Miura, K., Luscombe, N.M., and Akhtar, A. (2010). Nuclear pore proteins nup153 and megator dene transcriptionally active regions in the Drosophila genome. PLoS Genet. 6, e1000846. n, M., Talhout, W., Vogel, M.J., Guelen, L., de Wit, E., Peric-Hupkes, D., Lode Feenstra, M., Abbas, B., Classen, A.K., and van Steensel, B. (2006). Human heterochromatin proteins form large domains containing KRAB-ZNF genes. Genome Res. 16, 14931504. Vogelmann, J., Valeri, A., Guillou, E., Cuvier, O., and Nollmann, M. (2011). Roles of chromatin insulator proteins in higher-order chromatin organization and transcription regulation. Nucleus 2, 358369. Wang, K.C., Yang, Y.W., Liu, B., Sanyal, A., Corces-Zimmerman, R., Chen, Y., Lajoie, B.R., Protacio, A., Flynn, R.A., Gupta, R.A., et al. (2011). A long noncoding RNA maintains active chromatin to coordinate homeotic gene expression. Nature 472, 120124. Wen, B., Wu, H., Shinkai, Y., Irizarry, R.A., and Feinberg, A.P. (2009). Large histone H3 lysine 9 dimethylated chromatin blocks distinguish differentiated from embryonic stem cells. Nat. Genet. 41, 246250. Williamson, I., Hill, R.E., and Bickmore, W.A. (2011). Enhancers: from developmental genetics to the genetics of common human disease. Dev. Cell 21, 1719. Wood, A.J., Lo, T.W., Zeitler, B., Pickle, C.S., Ralston, E.J., Lee, A.H., Amora, R., Miller, J.C., Leung, E., Meng, X., et al. (2011). Targeted genome editing across species using ZFNs and TALENs. Science 333, 307. rtele, H., and Chartrand, P. (2006). Genome-wide scanning of HoxB1-assoWu ciated loci in mouse ES cells using an open-ended Chromosome Conformation Capture methodology. Chromosome Res. 14, 477495. Xie, W., Barr, C.L., Kim, A., Yue, F., Lee, A.Y., Eubanks, J., Dempster, E.L., and Ren, B. (2012). Base-resolution analyses of sequence and parent-of-origin dependent DNA methylation in the mouse genome. Cell 148, 816831. Yaffe, E., and Tanay, A. (2011). 
Probabilistic modeling of Hi-C contact maps eliminates systematic biases to characterize global chromosomal architecture. Nat. Genet. 43, 10591065.

Cell 152, March 14, 2013 2013 Elsevier Inc. 1283

Yokochi, T., Poduch, K., Ryba, T., Lu, J., Hiratani, I., Tachibana, M., Shinkai, Y., and Gilbert, D.M. (2009). G9a selectively represses a class of late-replicating genes at the nuclear periphery. Proc. Natl. Acad. Sci. USA 106, 1936319368. Yokota, H., Singer, M.J., van den Engh, G.J., and Trask, B.J. (1997). Regional differences in the compaction of chromatin in human G0/G1 interphase nuclei. Chromosome Res. 5, 157166. Yokota, H., van den Engh, G., Hearst, J.E., Sachs, R.K., and Trask, B.J. (1995). Evidence for the organization of chromatin in megabase pair-sized loops arranged along a random walk path in the human G0/G1 interphase nucleus. J. Cell Biol. 130, 12391249.

Zeng, W., Ball, A.R., Jr., and Yokomori, K. (2010). HP1: heterochromatin binding proteins working the genome. Epigenetics 5, 287292. Zhang, Y., McCord, R.P., Ho, Y.J., Lajoie, B.R., Hildebrand, D.G., Simon, A.C., Becker, M.S., Alt, F.W., and Dekker, J. (2012). Spatial organization of the mouse genome and its role in recurrent chromosomal translocations. Cell 148, 908921. -Regi, R., Gaffney, D.J., Epstein, C.B., SpooZullo, J.M., Demarco, I.A., Pique ner, C.J., Luperchio, T.R., Bernstein, B.E., Pritchard, J.K., Reddy, K.L., and Singh, H. (2012). DNA sequence-dependent compartmentalization and silencing of chromatin at the nuclear lamina. Cell 149, 14741487.

1284 Cell 152, March 14, 2013 2013 Elsevier Inc.

Leading Edge

Review
CTCF and Cohesin: Linking Gene Regulatory Elements with Their Targets
Matthias Merkenschlager1,* and Duncan T. Odom2,*
1Lymphocyte Development Group, MRC Clinical Sciences Centre, Imperial College London, Du Cane Road, London W12 0NN, UK
2Cancer Research UK Cambridge Institute, University of Cambridge, Robinson Way, Cambridge CB2 0RE, UK
*Correspondence: matthias.merkenschlager@csc.mrc.ac.uk (M.M.), duncan.odom@cruk.cam.ac.uk (D.T.O.)
http://dx.doi.org/10.1016/j.cell.2013.02.029

Current epigenomics approaches have facilitated the genome-wide identification of regulatory elements based on chromatin features and transcriptional regulator binding and have begun to map long-range interactions between regulatory elements and their targets. Here, we focus on the emerging roles of CTCF and cohesin in coordinating long-range interactions between regulatory elements. We discuss how species-specific transposable elements may influence such interactions by remodeling the CTCF binding repertoire and suggest that cohesin's association with enhancers, promoters, and sites defined by CTCF binding has the potential to form developmentally regulated networks of long-range interactions that reflect and promote cell-type-specific transcriptional programs.
Introduction
Mammalian genomes are vast and are composed of billions of bases of DNA containing sufficient regulatory information to create complex organisms with thousands of cell types and considerable behavioral repertoires. Each of the twenty-odd thousand genes in the human genome likely has many distinct regulatory regions spread across tens to hundreds of kilobases that operate in concert to accurately instruct when, where, and how much of each gene to transcribe. Here, we focus on the role of the sequence-specific DNA-binding protein CTCF (CCCTC-binding factor) and the multiprotein cohesin complex in orchestrating tissue-specific gene regulation in an evolutionary context.

The Interplay between Regulatory Elements and Chromatin Directs Tissue-Specific Transcription
Tissue-specific transcription of protein-coding genes is controlled by one or more small regulatory regions that contain sets of DNA-binding proteins, which occupy DNA in a combinatorial fashion (Lee et al., 2002; Odom et al., 2006; see Box 1 for a brief overview of regulatory elements). The DNA itself is coiled around nucleosomes that are composed of histone octamers and convey regulatory information by their position and in the form of posttranslational histone modifications (Segal and Widom, 2009; Campos and Reinberg, 2009). Regulatory regions are combined by through-space interactions to finalize both assembly and control of basal transcriptional machineries (Lee et al., 2002; Sanyal et al., 2012; Handoko et al., 2011; Li et al., 2012; Dixon et al., 2012; Noordermeer et al., 2011).

How to Recognize Regulatory Elements
Regulatory elements can be identified genome wide using indirect and direct means. The successful application of sequence conservation among mammals has been instrumental in identifying the complete protein-coding complement of mammals (Church et al., 2009).
More recently, the same strategy has been taken to identify the sequences in the genome under selective pressure, presumably by DNA-binding proteins and other noncovalent regulators (Lindblad-Toh et al., 2011). One notable success of this strategy includes the identification of thousands of highly conserved CTCF binding locations that appear shared among most mammalian species. However, a considerable fraction of the regulation of the genome seems to occur in a highly species-specific manner due to the rapid evolution of tissue-specific transcription factor binding, indicating that direct comparison of genome sequences has inherent limitations (Kunarso et al., 2010; Schmidt et al., 2010b). Recent advances in sequencing technology have facilitated the genome-wide identification of regulatory elements based on chromatin accessibility, posttranslational histone modifications, and the binding of regulatory factors (Johnson et al., 2007; reviewed in Noonan and McCallion, 2010; see Box 2 for a brief overview). These approaches have been integrated across the human genome to generate a first-pass encyclopedia of the regulatory features that are functional in the human genome (reviewed in Noonan and McCallion, 2010).

How Regulatory Elements Work
How do histone marks, chromatin state, and genome architecture conspire to create gene expression programs? Local chromatin accessibility, transcription factor binding, and specific chromatin modifications such as acetylation, methylation, phosphorylation, or ubiquitylation not only mark regulatory elements, but they also actively contribute to the control of gene expression (Figure 1). Specifically, DNA methylation, histone modifications, and the proteins that interact with them affect the accessibility of chromatin. These factors link chromatin marks to the general transcription machinery (Thomas and Chiang, 2006), including TBP-associated factors (TAFs) and Mediator, as well

Box 1. Types of Regulatory Elements in Mammalian Genomes
1. Promoters are located near genes and directly regulate transcription.
2. Enhancers are located distal to protein-coding genes and may require chromatin conformational placement near regulated loci.
3. Insulators for either gene expression or chromatin state divide heterochromatin from euchromatin and/or active from inactive gene expression domains.
4. Repressors can decrease the expression of nearby genes.

Box 2. Mapping Regulatory Elements and Their Interactions
DNase hypersensitivity reveals accessible sites that are not protected by nucleosomes or DNA-binding proteins (Hesselberth et al., 2009) and can be analyzed for transcription factor binding motifs, sequence conservation, and tissue-specific gene expression (Neph et al., 2012). The chromatin state and the binding of regulatory proteins can be mapped by chromatin immunoprecipitation (ChIP) using antibodies against covalent histone modifications, site-specific transcription factors, and transcriptional cofactors (Bernstein et al., 2005; Suganuma and Workman, 2011; Visel et al., 2009; Odom et al., 2004). A recent refinement is ChIP-exo, which can identify the exact bases to which a factor is bound, as exemplified for CTCF (Rhee and Pugh, 2011). Chromosome conformation capture interrogates chromatin interactions based on crosslinked and ligated DNA (Dekker, 2008) on a gene-specific, regional, or genome-wide scale (Dostie et al., 2006; Lieberman-Aiden et al., 2009; Simonis et al., 2009). The output of current chromosome conformation capture experiments represents probability distributions within cell populations (Nora et al., 2012; Sanyal et al., 2012). FISH visualizes the proximity of specific sequences in individual cells on a gene-specific level (Dostie and Bickmore, 2012) and on a genomic scale (Boyle et al., 2011).
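To illustrate what it means for chromosome conformation capture output to represent a probability distribution within a cell population, the following toy sketch bins ligated position pairs and normalizes the counts to contact frequencies. The function name, the bin size, and the example coordinates are hypothetical and chosen only for illustration; real analyses additionally involve read mapping, filtering, and bias correction, none of which is modeled here.

```python
from collections import Counter


def contact_frequencies(pairs, bin_size):
    """Bin ligated position pairs and normalize counts to frequencies.

    pairs    -- iterable of (pos_a, pos_b) coordinates on one chromosome
                (hypothetical input format, assumed for this sketch)
    bin_size -- width of each genomic bin in base pairs
    """
    counts = Counter()
    for pos_a, pos_b in pairs:
        # Contacts are symmetric, so store each bin pair in sorted order.
        bin_a, bin_b = sorted((pos_a // bin_size, pos_b // bin_size))
        counts[(bin_a, bin_b)] += 1
    total = sum(counts.values())
    # Each value is the fraction of observed ligation events in that bin pair.
    return {bins: n / total for bins, n in counts.items()}


# Invented example: four ligation events pooled from a cell population.
pairs = [(1_200, 48_000), (3_500, 41_000), (52_000, 2_000), (10_000, 12_000)]
freqs = contact_frequencies(pairs, bin_size=10_000)
```

Because the frequencies are pooled over many cells, a high value for a bin pair indicates that the contact is common in the population, not that it is present in every cell; this is why single-cell methods such as FISH are a useful complement.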

as RNA polymerase II (RNAPII) to regulate transcription initiation and elongation (see Figure 1 and Box 3). Classical models of transcriptional control begin with transcription factors binding to DNA, recruiting nucleosome remodeling complexes and histone modifying enzymes whose products can then interact with basal machinery to drive transcription. Although this is often true, in reality, these events are mutually interdependent (Figure 2). In summary, chromatin marks facilitate not only the cataloguing of genomic features, but more importantly, they also link regulatory elements to downstream effectors of transcriptional activation or repression.

Evolution and Gene Regulatory Strategies
In simple eukaryotes such as yeast, gene regulation is largely controlled by elements immediately proximal to their target genes (Borneman et al., 2006; Harbison et al., 2004; Lee et al., 2002). More complex model organisms like Drosophila and C. elegans have vastly larger genomes than yeast, yet most regulatory regions remain relatively close to their target genes (He et al., 2011; MacArthur et al., 2009). Population genetics analyses suggest that large breeding population sizes drive the condensation of regulatory and genic sequences (Lusk and Eisen, 2010; Lynch, 2007). In effect, competitive pressures can select for more efficient genome organization and utilization, leading to subtle growth advantages that become dominant only over thousands of generations in large breeding populations. In contrast, vertebrate species often have small breeding populations and thus lack these genome-compressing effects. Therefore, their genomes must have mechanisms to adapt to the constant onslaught of selfish element expansion and genetic drift. The resulting fragmentation of the genome has been exploited by mammals to increase the possible regulatory combinations of control elements in the interest of cell-type- and tissue-specific gene expression (Dunham et al., 2012).
Indeed, the organismal complexity found in mammals, and more generally in vertebrates, may be a direct result of this fragmentation (Figure 3A). The challenge that had to be overcome to create this opportunity to expand organismal complexity, however, was how to efficiently connect the now-diffuse regulatory sequences with their targets.

Linking Regulatory Elements with Their Targets
Eukaryotes appear to have evolved molecular systems from preexisting cellular machineries to connect remote enhancers and promoters. The components of these systems have been investigated to different degrees. The proteins that directly occupy

DNA, such as CTCF and clusters of tissue-specific transcription factors, create a protein landscape that can be more readily handled biophysically than the substrate nucleic acid. Connecting directly with these regulators is the multicomponent cohesin complex, including the associated loading complex NIPBL-MAU2. An interesting feature of the cohesin complex is its ring-like shape with an internal diameter of approximately 40 nm (Nasmyth and Haering, 2009). This arrangement enables cohesin to handle the molecular dimensions of chromatin fibers, as illustrated by its ability to mediate the interactions of sister chromatids (reviewed in Nasmyth and Haering, 2009; Skibbens, 2009). Additional players in the process of organizing regulatory elements include polycomb, which associates with repressive histone marks, SATB1 and -2, and the nuclear lamina (Galande et al., 2007; Morey and Helin, 2010; van Steensel, 2011). Methods to interrogate chromatin interactions, including chromosome conformation capture (Dekker, 2008) and fluorescence in situ hybridization (FISH) (Dostie and Bickmore, 2012) (see Box 2), have shown that particular enhancers and promoters interact in a nonrandom fashion. A depletion of contacts can be used to suggest locations for insulating elements that divide gene expression domains.

CTCF: The CCCTC-Binding Factor
CTCF and cohesin are central players in regulating long-range interactions. We briefly describe the scientific history of each, the recent discovery that they colocalize and interact to control long-range regulatory interactions, and what models may fit our current understanding of their roles in tissue-specific transcription.

Identification of CTCF as a Transcriptional Regulator
CTCF was originally identified as a transcriptional regulator of the c-myc oncogene (Baniahmad et al., 1990; Filippova et al., 1996;

Figure 1. Chromatin Modifications at Regulatory Elements: from Marks to Function


Models for gene regulation have moved from an early focus on transcription factors and DNA to encompass the full context of chromatin (left). Regulatory elements are marked by patterns of DNA methylation, histone marks, and interacting proteins that link chromatin modifications to the regulation of transcription (center). Regulatory elements are often separated by considerable distances in the linear sequence of metazoan genomes. Transcriptional control is thought to involve interactions between regulatory elements in three-dimensional nuclear space (right). To illustrate this, the figure depicts regulatory elements of the imprinted IGF2/H19 locus and their interactions as detailed in the section "CTCF and Cohesin Regulate Complex Loci."

Lobanenkov et al., 1990). This widely expressed 11-zinc-finger DNA-binding protein is conserved in most higher eukaryotes (Klenova et al., 1993) and is essential for cellular function (Burcin et al., 1997; Fedoriw et al., 2004). One of the most interesting aspects of CTCF is that it appears to be the unique, major DNA-binding component that establishes vertebrate insulators (Bell et al., 1999). CTCF can block enhancer function when placed between enhancers and promoters in reporter plasmids, and most, if not all, CTCF binding sites can serve as insulators in such constructs (Giles et al., 2010; Phillips and Corces, 2009). Demonstrating enhancer-blocking function of CTCF sites in their native chromatin context is much more difficult. A well-studied case is the IGF2/H19 locus, where CTCF binding controls the functional interaction of the IGF2 and H19 promoters with a distal enhancer, as supported by the analysis of natural (Beygo et al., 2013) and engineered mutations (discussed in detail below). The functions of other CTCF sites have been probed by the deletion of specific CTCF sites from the mouse immunoglobulin heavy chain locus (Guo et al., 2011) and the insertion of ectopic CTCF sites into the T cell receptor β chain locus (Shrimali et al., 2012). Such site-specific experiments are complemented by loss-of-function approaches in which CTCF is genetically deleted (Ribeiro de Almeida et al., 2011; Hirayama et al., 2012). Correlative studies link the position of CTCF binding sites to long-range interactions by chromatin conformation assays

(Dixon et al., 2012; Sanyal et al., 2012) and the analysis of chromatin features. CTCF binding is often found at transitions between distinct chromatin states as marked by histone modifications (Cuddapah et al., 2009) or interactions with the nuclear lamina (van Steensel, 2011), supporting the notion that, in addition to limiting the reach of regulatory elements, CTCF can form boundaries. However, based on these data, only a minor fraction of CTCF sites appears to demarcate chromatin boundaries in their native chromatin context in vivo (Dixon et al., 2012; Schmidt et al., 2012b). This suggests that plasmid-based reporter constructs may not accurately capture the native chromatin environment, which is crucial to integrate regulatory inputs. Remarkably, CTCF may also help regulate viral genomes (Holdorf et al., 2011; Stedman et al., 2008; Tempera et al., 2010). CTCF interacts with specific locations in numerous viral genomes, including EBV and murine and human herpesviruses (Stedman et al., 2008; Stevens et al., 2012). These interactions are functional, and CTCF regulates both individual viral genes and entire programs; for instance, viral latency is influenced by CTCF (Hughes et al., 2012; Kang et al., 2011). In the case of Kaposi's sarcoma-associated herpesvirus, cooperation between CTCF and cohesin has been documented (Chen et al., 2012; Kang et al., 2011; Stedman et al., 2008). As a host protein that can directly control viral gene expression, CTCF links the mammalian host's defenses and gene

Box 3. Chromatin Modifications: From Marks to Function
Enhancers display monomethylated lysine 4 (H3K4me1) together with acetylated lysine 27 on histone H3 (H3K27ac) in the active state and trimethylated lysine 27 on histone H3 (H3K27me3) in the repressed state, whereas promoters are marked by trimethylation of histone H3 at lysine 4 (H3K4me3) (reviewed by Bannister and Kouzarides, 2011). H3K4 methylation marks are established by the SET1 and mixed lineage leukemia (MLL) family of histone methyltransferases. Among the readers of di- and trimethylated H3K4 are PHD (plant homeodomain) finger proteins (Bannister and Kouzarides, 2011), for example, the TAF3 subunit of TFIID, part of the general transcription machinery for RNAPII. The targeting of TFIID connects the promoter mark H3K4me3 to the initiation of transcription (Vermeulen et al., 2007). Other readers of H3K4 methylation include the V(D)J recombinase subunit RAG2, chromatin modifiers, and remodeling factors. In contrast to H3K4, H3K27 is methylated by the polycomb repressive complex 2 (PRC2). This modification recruits PRC1, a polycomb complex that blocks RNA polymerase and mediates transcriptional repression (Morey and Helin, 2010). Trimethylated H3K9 is a mark of inaccessible chromatin, or heterochromatin. Among the readers of H3K9me3 is the chromodomain protein heterochromatin protein 1 (HP1), which propagates the formation of inaccessible chromatin (Bannister and Kouzarides, 2011). Most CG dinucleotides in mammalian genomes are targeted by DNA methyltransferases that modify cytosine residues. CG-rich promoter-proximal sequences (CpG islands) are specifically protected from DNA methylation by the binding of Cfp1 (Thomson et al., 2010). Readers of DNA methylation include methyl-binding proteins such as MECP2 (mutated in Rett syndrome; Guy et al., 2007).
Histone acetylation is linked to the transcriptional machinery by bromodomain proteins such as the BET protein BRD4, which interact with the Mediator complex and transcription elongation factors. Mediator regulates transcription by bridging sequence-specific DNA-binding proteins with RNAPII (Conaway and Conaway, 2011; Soutourina et al., 2011), and elongation factors facilitate transcription by promoting Pol II processivity (Yang et al., 2005; Jang et al., 2005).
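The correspondence between histone marks and element states summarized in Box 3 can be expressed as a small rule-based lookup. The sketch below is purely illustrative: the function name, the labels, and in particular the order in which the rules are tested are assumptions made for this example, and real chromatin-state annotation relies on quantitative signal and statistical models rather than the presence or absence of a few marks.

```python
def classify_element(marks):
    """Assign a coarse label to a region from the set of marks detected there.

    The rules follow the mark combinations listed in Box 3; the precedence
    among them (heterochromatin first, then promoter, then enhancer states)
    is an assumption of this sketch, not a statement from the text.
    """
    marks = set(marks)
    if "H3K9me3" in marks:
        return "heterochromatin"
    if "H3K4me3" in marks:
        return "promoter"
    if {"H3K4me1", "H3K27ac"} <= marks:
        return "active enhancer"
    if {"H3K4me1", "H3K27me3"} <= marks:
        return "repressed enhancer"
    return "unclassified"


# Invented example regions, each described by the marks observed there.
examples = {
    "region_a": ["H3K4me1", "H3K27ac"],
    "region_b": ["H3K4me1", "H3K27me3"],
    "region_c": ["H3K4me3"],
    "region_d": ["H3K9me3"],
}
labels = {name: classify_element(marks) for name, marks in examples.items()}
```

A lookup of this kind captures only the cataloguing role of chromatin marks; it says nothing about the readers (PHD finger proteins, HP1, bromodomain proteins) through which the marks act on transcription.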

regulatory elements with the regulation, function, and pathogenicity of the virus genome. In the next section, we explore how CTCF interacts with and potentially controls another form of parasite: transposable elements.

Insights from the Genome-wide Analysis of CTCF Binding
Early genome-scale mapping of CTCF binding in mammalian cells revealed a large, information-rich motif and mostly tissue-independent binding preferentially to gene-dense regions but with little or no enrichment in promoters (Kim et al., 2007). A substantial minority of CTCF sites may be cell-type specific, particularly in cancer-derived cell lines in which differential binding correlates with differential DNA methylation (Wang et al., 2012; see legend of Figure 2 for discussion). High conservation of CTCF binding was predicted and later demonstrated by both comparative (Lindblad-Toh et al., 2011) and experimental (Kunarso et al., 2010; Schmidt et al., 2012a) approaches. A series of simultaneous papers reported that CTCF and cohesin co-occupy the genome (Parelho et al., 2008; Rubio et al., 2008; Stedman et al., 2008; Wendt et al., 2008). This observation provided a functional link between an extremely high-affinity DNA-binding protein (CTCF) and cohesin, a key component of

chromatin. Cohesin had long been known to connect sister chromatids, a function that strongly suggested that cohesin may play a similar role in connecting chromatin domain loops within one chromosome, which might also be anchored by CTCF. The interaction of cohesin and CTCF may explain how CTCF acts in specific locations as a domain boundary for chromatin states.

Evolution of CTCF Binding
A link between CTCF and SINE repeat elements was reported in a seminal paper that dissected transcription factor binding profiles based on their association with repetitive elements (Bourque et al., 2008). Thousands of B2 SINE elements in the mouse genome carry a CTCF binding motif, and a significant proportion of these motifs are bound by CTCF in vivo. It was postulated that embedding a long, complex CTCF motif into such a repeat represented a powerful mechanism for rapidly expanding the CTCF regulatory repertoire; such a mechanism had been previously suggested for REST/NRSF, a repressor that targets a similarly large and complex motif (Mortazavi et al., 2006). Most recently, by comparing the in vivo genomic occupancy of CTCF in six mammalian species, it was discovered that the SINE repeats currently active in at least three of four major mammalian lineages carry a high-affinity CTCF site (Schmidt et al., 2012a). Indeed, hundreds to many thousands of SINE-expanded CTCF sites were identified in dog, gray short-tailed opossum, and rat, as well as in mice (Figure 3A). The comparison of the sequences surrounding the most ancient, highly conserved CTCF binding sites reveals over a hundred fossilized SINE repeat sequences in multiple species of mammals, separated by up to 180 MY of evolution. Thus, the repeat-driven expansion of CTCF binding sites is an ancient mechanism of genome evolution. Interestingly, neither human nor macaque shows evidence of recent repeat-associated expansion of SINEs, suggesting that primates may have escaped this mode of genome remodeling.
In contrast, rodents show massive SINE element expansion; however, the relative transposon activity of the SINE B2 elements that carry CTCF is considerably accelerated in mouse versus rat. In the time since their most recent common ancestor, almost four times more SINE B2 insertions carrying CTCF have occurred in mouse. The comparison of the thousands of CTCF binding events occurring in mouse at SINE B2 repeat elements with the similar number of CTCF binding sites that are deeply shared in mammalian evolution revealed that both types of CTCF binding function as transcriptional and chromatin insulators (Schmidt et al., 2012a).

CTCF Binding as a Potential Survival Strategy for Expanding Repeats
Analysis of the mechanisms by which mammals epigenetically silence repeats such as SINE elements suggests possible benefits that repeat elements might obtain by carrying binding motifs for a transcriptional regulator. The first possible advantage is that CTCF binding may modulate DNA methylation, which could otherwise silence transposons (Wang et al., 2012). It has been suggested that mammalian genomes defend against the large burden of transposons and endogenous retroviruses by methylating cytosines in DNA (Walsh and Bestor, 1999), which leads to comparatively rapid decay of these sequences via a C to T transition (Bird,

Figure 2. Interdependence of Genome Sequence, Chromatin, and Transcription


DNA sequence and local chromatin structure direct transcription factor (TF) binding. Some TFs harness cofactors to remodel chromatin, and others require open chromatin. The pluripotency factors Oct4, Sox2, and Klf4 predominantly engage closed distal regulatory elements during somatic cell reprogramming, whereas Myc prefers open chromatin (Soufi et al., 2012). Lineage-specific TFs often rely on pre-existing permissive chromatin; Foxp3, T-bet, and RORγt are induced in specialized T cell subsets and bind pre-existing regulatory elements (Ciofani et al., 2012; Samstein et al., 2012; Vahedi et al., 2012). TF binding can be indifferent to DNA methylation (Bell et al., 2011), but CTCF prefers hypomethylated DNA (Bell and Felsenfeld, 2000; Hark et al., 2000; Kanduri et al., 2000; Wang et al., 2012). This relationship is reciprocal, as CTCF can influence the methylation status of distal regulatory regions (Stadler et al., 2011), blurring cause and effect of preferential binding to hypomethylated DNA. The differential methylation of CG-rich sites can exclude CTCF, allowing for the parent-of-origin-specific (imprinted) regulation of IGF2/H19 (Bell and Felsenfeld, 2000; Hark et al., 2000; Kanduri et al., 2000). CTCFL, a paralog of CTCF, binds DNA irrespective of methylation (Nguyen et al., 2008). TF binding recruits chromatin remodelers and chromatin-modifying enzymes ("writers" and "erasers") that modify histones or methylate DNA. Chromatin modifications can limit recombination between repetitive regions of the genome and impact the activity of transposable elements that drive genome evolution, including the evolution of regulatory elements. DNA methylation and histone modifications are recognized by chromatin "readers" that link chromatin modifications to the transcription machinery (Box 3).
Transcription alters the chromatin structure of transcribed regions; at lymphocyte receptor loci, transcription drives rearrangement by depositing H3K4me3, which is recognized by the PHD finger of Rag2, a component of the V(D)J recombination machinery. Like the process of transcription itself, RNA transcripts can impact the chromatin landscape. Repeat-associated transcripts activate RNA interference mechanisms that modify chromatin and control transposable elements (Fedoroff, 2012). In mammals, this mechanism appears restricted to germ cells (Siomi and Siomi, 2011). Long noncoding RNAs also regulate chromatin structure and gene expression, as exemplified by Xist, which mediates X chromosome inactivation in female mammals (Brockdorff, 2011). Cohesin is recruited to active genes alongside TFs and the basal transcription machinery (Schmidt et al., 2010a; Kagey et al., 2010) and in turn can facilitate TF binding to low-affinity sites (Faure et al., 2012).

1980). Indeed, a substantial fraction of cytosine methylation in mammalian genomes is found in transposable elements; CTCF binding occurs in hypomethylated regions, thus partially protecting surrounding genetic sequences from methylation (see also Cohen et al., 2011). A second possible advantage is in modulating the RNAi-mediated control of transposable elements in somatic cells or in the germline (Fedoroff, 2012; Siomi and Siomi, 2011) (Figure 2). Duplication either of entire genomes or of genomic regions results in repeated genomic information and the danger of illegitimate recombination, and RNAi may have facilitated the expansion of higher eukaryotic genomes by limiting the danger of illegitimate recombination (Fedoroff, 2012). Once controlled, duplications can diversify to drive the evolution of genes, gene regulatory elements, and the factors that bind them (for example, Boris, a CTCF paralog active in germline cells [Loukinov et al., 2002]). Epigenetic mechanisms such as DNA methylation and RNAi have facilitated the domestication of transposable elements, which in turn has enabled the genomes of higher eukaryotes to

accommodate vast numbers of transposable elements. These transposable elements have been repurposed to build centromeres and telomeres (López-Flores and Garrido-Ramos, 2010), to remodel genome and regulatory architectures (Kunarso et al., 2010; Schmidt et al., 2012b), and to rearrange immune receptor loci (Schatz, 2004).

Cohesin

Cohesin Functions in the Cell Cycle

A strong candidate for mediating long-range interactions between regulatory elements is cohesin, a multiprotein complex that provides cohesion between sister chromatids from the time of DNA replication in S phase until cell division (Nasmyth and Haering, 2009). This function of cohesin enables postreplicative DNA repair and proper chromosome segregation through mitosis and meiosis and hence the integrity of genomic information passed on from mother to daughter cells and from one generation of multicellular organisms to the next. Unsurprisingly, this function of cohesin is essential, and cohesin is highly conserved through evolution (Nasmyth and Haering, 2009). At

Figure 3. The Evolution of CTCF Binding Sites


(A) Dispersal of CTCF binding sites by the activity of transposable elements. Both CTCF binding and the mammalian genome itself are remodeled by repeat elements. The ACSL6 locus with CTCF binding and exon-to-exon homology mapping is shown for mouse and human. Intron sizes have been extensively remodeled due to repeat element expansions; in mouse, this expansion includes the introduction of a CTCF binding site carried within a mouse SINE B3 repeat.
(B) Retention of CTCF binding sites. A conserved CTCF binding site upstream of the Ifng locus is maintained in rodent genomes despite the near-complete deletion of the associated gene, Il26 (red elements), and contributes to long-range interactions (Sekimata et al., 2009; Hadjur et al., 2009).

the heart of cohesin (as well as of the highly related condensin and Smc5/6 complexes) are heterodimers of SMC (structural maintenance of chromosomes) proteins. The V-shaped Smc1-Smc3 heterodimer is complemented by Rad21/Scc1 and Scc3/SA1/SA2 subunits to form a ring-like structure large enough to topologically embrace two chromatin fibers (Nasmyth and Haering, 2009). Consistent with its role in postreplicative DNA repair and chromosome segregation, cohesin is enriched at sites of DNA damage (Ström et al., 2004; Ünal et al., 2004) and at centromeres (Nasmyth and Haering, 2009). In higher eukaryotes, cohesin is also a major component of chromatin in noncycling and even in postmitotic cells. This points to a role for cohesin outside of the cell cycle, and indeed, there is growing evidence that cohesin contributes to the regulation of chromatin structure and gene expression in interphase.

Cohesin's Emerging Roles in Gene Regulation

Initial evidence for a role of cohesin in gene regulation came from genetic studies in Drosophila, in which the expression of specific

homeobox genes is dependent on the dosage of the cohesin loading factor Nipped-B (Rollins et al., 1999). Heterozygous mutations in NIPBL, the human homolog of Nipped-B, were subsequently found to cause the developmental disorder Cornelia de Lange syndrome (Strachan, 2005), and similar developmental abnormalities are associated with mutations in cohesin subunits (Strachan, 2005), cohesin cofactors (Zhang et al., 2009), and cohesin-modifying enzymes (Vega et al., 2005; Deardorff et al., 2012). Although cultured cells derived from NIPBL heterozygous individuals do not show clear defects in chromosome segregation (Strachan, 2005), a distinction between cell-division-related and cell-division-independent cohesin functions is required to support a direct link between cohesin and gene expression. This was first demonstrated by depleting cohesin from postmitotic cells in Drosophila (Pauli et al., 2008, 2010; Schuldiner et al., 2008) and later in noncycling mouse thymocytes (Seitan et al., 2011). Cohesin-depleted Drosophila neurons show defective axon pruning as a result of deregulated ecdysone receptor expression (Pauli et al., 2008, 2010; Schuldiner et al., 2008). Genetic ablation of the Rad21 cohesin subunit in mouse thymocytes impairs the transcription and rearrangement of the developmentally regulated T cell receptor α locus and disrupts thymocyte differentiation (Seitan et al., 2011). Recent studies uncovered two distinct types of cohesin sites that might mediate cohesin's roles in gene regulation. Strong cohesin sites usually coincide with the binding of CTCF (Wendt et al., 2008; Parelho et al., 2008; Stedman et al., 2008; Rubio et al., 2008), whereas numerous and often weaker cohesin sites map to active promoters and enhancers (Schmidt et al., 2010a; Kagey et al., 2010; Faure et al., 2012). Here, cohesin is colocalized with its loading factor Nipbl, with Mediator components, and with tissue-specific transcription factors (Schmidt et al., 2010a; Kagey et al., 2010).

Box 4. Limitations of Current Experimental Approaches to Understanding Cohesin's Role in Gene Expression

Schmidt et al. (2010a) correlated the binding of transcription factors with cohesin recruitment but did not explore the biochemical mechanisms that mediate this colocalization. They found that cohesin depletion affects gene expression, but the interpretation of these data is complicated by global shifts in gene expression. The authors dealt with this issue by focusing on estrogen-responsive genes, but many other gene expression changes remain to be explained. Kagey et al. (2010) deprived ES cells of cohesin, a complex that is essential for successful chromosome segregation in mitosis and for other aspects of chromosome biology in cycling cells such as DNA replication and postreplicative DNA repair. This resulted in the misexpression of most ES-cell-expressed genes. However, ES cells are rapidly cycling, making it difficult to discern whether loss of cohesin brought about changes in gene expression as a result of specific gene regulatory functions or the activation of DNA damage checkpoints. The authors deliberately limited the scope of their analysis by focusing on the effects of knocking down cohesin, cohesin-loading factors, and Mediator subunits and by combining gene expression data with genomic binding data. Nevertheless, it is important to remember that the loss of cohesin from cycling cells can trigger damage responses that may radically alter the pattern of gene expression and antagonize the expression of pluripotency factors (Lin et al., 2005). A study by Seitan et al. (2011) largely avoids cell-cycle-related issues but does make the assumption that cohesin-dependent enhancer-promoter interactions are the cause, rather than a correlate, of defective transcription in cohesin-depleted cells.
Studies on postmitotic cells in Drosophila provide the clearest dissociation to date between cohesin functions in cycling and noncycling cells (Pauli et al., 2008, 2010; Schuldiner et al., 2008) but provide little mechanistic insight into how cohesin affects gene expression.

Figure 4. Cohesin and CTCF Link Regulatory Elements at the Tcra Locus
Cohesin binding sites flank major regulatory elements of Tcra, the TEA promoter, and the Eα enhancer. Cohesin strengthens promoter-enhancer interactions over a genomic distance of 80 kb, facilitating Tcra transcription and rearrangement of coding sequences (Seitan et al., 2011). A CTCF-dependent insulator separates the Eα enhancer from the housekeeping gene Dad1 (Magdinier et al., 2004; Zhong and Krangel, 1999). Cohesin depletion increases the transcription of Dad1 at the expense of Tcra (Seitan et al., 2011).

Cohesin Functions in Gene Regulation and Development

Mediating Chromosomal Long-Range Interactions

The demonstration of long-range interactions between cohesin binding sites (Hadjur et al., 2009; Nativio et al., 2009; Kagey et al., 2010; Seitan et al., 2011) suggested that cohesin may affect gene expression by this mechanism. CTCF had long been thought to contribute to the spatial organization of the genome (Wallace and Felsenfeld, 2007), but a dependence of CTCF-based long-range interactions on cohesin was first demonstrated for the mouse Ifng locus (Hadjur et al., 2009). A CTCF binding site 60–70 kb upstream of the Ifng coding region is conserved in many mammals and is selectively retained in rodent genomes, despite the near-complete deletion of the associated gene, Il26 (Figure 3B). This site is preserved despite the insertion of a long interspersed nuclear element (LINE) at +57–59 kb and a long terminal repeat (LTR)-LINE-LTR at +73–87 kb (Schoenborn et al., 2007) and complex structural rearrangements and segmental duplications that disrupt synteny with human over a region of 50 kb upstream of the Ifng coding region (Schoenborn et al., 2007; She et al., 2008). In both human and mouse, this site contacts two other CTCF sites, one in the first intron of Ifng and the other about 100 kb downstream of the locus (Sekimata et al., 2009; Hadjur et al., 2009). These long-range interactions occur selectively in T helper 1 cells, which inducibly express Ifng. CTCF and cohesin are both required for these interactions. The contribution of the upstream CTCF binding site suggests that the selective retention of this site, despite the deletion of the associated Il26 locus, is functionally relevant for the regulation of Ifng (Sekimata et al., 2009; Hadjur et al., 2009). Cohesin depletion is linked to disrupted promoter-enhancer interactions in embryonic stem (ES) cells (Kagey et al., 2010) and in thymocytes (Seitan et al., 2011). Interactions mapped in

ES cells involve relatively short distances (35 kb; Kagey et al., 2010), whereas deletion of the cohesin subunit Rad21 in noncycling mouse thymocytes distorted the chromatin architecture of the developmentally regulated T cell receptor α locus (Tcra) over at least 80 kb. Interestingly, cohesin binding sites flank major promoter and enhancer elements of Tcra, and cohesin strengthens long-range promoter-enhancer interactions (Figure 4). This correlates with transcription and rearrangement of the locus and, ultimately, thymocyte differentiation (Seitan et al., 2011). In another example, the imprinted H19/IGF2 locus, CTCF-based, cohesin-mediated long-range interactions were shown to disrupt enhancer-promoter contacts (Nativio et al., 2009). It is tempting to think that the impact of cohesin on gene regulation depends on the nature of the gene regulatory elements it connects at a specific locus. Although these examples show correlations between gene expression, long-range interactions, and cohesin binding, the detailed causal relationships remain to be worked out. It also remains to be explored how the mechanism of cohesin-mediated long-range interactions in cis relates to the topological embrace thought to provide sister chromatid cohesion in trans (Nasmyth and Haering, 2009). Limitations of current experimental approaches to understanding cohesin's role in gene expression are discussed in Box 4.

Strengthening Transcription Factor Binding at Low-Affinity Motifs

In addition to its role in supporting long-range interactions, cohesin may facilitate the binding of transcription factors to suboptimal sequence motifs (Faure et al., 2012). A recent study exhaustively compared the genomic binding of a large set of tissue-specific transcription factors, cohesin, and RNAPII with full annotation of chromatin state in mouse liver with the goal of understanding the transcriptional interplay among these elements of regulation. Cohesin was found to stabilize the binding of transcription factors to lower-affinity sequence motifs, a hypothesis confirmed by testing whether specific transcription factor modules (identified based on their motif quality) are destabilized in a mouse haploinsufficient for the cohesin subunit RAD21. In summary, mounting evidence argues for multiple roles of cohesin in gene regulation. In a few examples (Pauli et al., 2008, 2010; Schuldiner et al., 2008; Seitan et al., 2011), the impact of cohesin on gene expression has been dissociated from cohesin's essential functions in DNA repair, chromosome segregation, and emerging functions in DNA replication (Tittel-Elmer et al., 2012).

CTCF and Cohesin Regulate Complex Loci

The interdependence of chromatin modifications, regulatory elements, transcription factor binding, and promoter-enhancer interactions is illustrated by the imprinted H19/IGF2 locus, which provides a well-documented example of a mammalian insulator (Figure 1). The IGF2/H19 imprinting control region (ICR) comprises a cluster of CTCF sites, and imprinted H19/IGF2 expression is regulated by the selective ICR methylation in sperm, but not in ova. Thus, CTCF selectively binds the unmethylated maternal allele, where it blocks the expression of IGF2 (Bell and Felsenfeld, 2000; Hark et al., 2000; Kanduri et al., 2000).
The insulator function of CTCF at the maternal IGF2/H19 allele is reflected in reduced long-range interactions of a distal enhancer with the maternal IGF2 promoter (Murrell et al., 2004). In contrast, methylation of the paternal ICR precludes CTCF binding and abrogates insulator function so that paternal IGF2 is expressed (Bell and Felsenfeld, 2000; Hark et al., 2000; Kanduri et al., 2000; Figure 1). Maternally inherited ICR microdeletions that remove a subset of CTCF sites can result in the methylation of remaining sites and the loss of imprinting in Beckwith-Wiedemann syndrome (Choufani et al., 2010). The impact of such deletions correlates with the spatial arrangement rather than the number of the remaining CTCF sites (Beygo et al., 2013). In addition to H19/IGF2, CTCF and cohesin regulate many other complex loci, including the β-globin locus (Splinter et al., 2006), protocadherin loci (Hirayama et al., 2012; Remeseiro et al., 2012), lymphocyte receptor loci (Seitan et al., 2012), and the X chromosome inactivation region (Spencer et al., 2011). It is possible that complex loci are particularly dependent on CTCF and cohesin.

Regulation of Multigene Cluster Loci

Conditional deletion of CTCF from postmitotic projection neurons results in the misexpression of several hundred transcripts, including the clustered protocadherin genes. Mice lacking CTCF in a subset of their neurons have defects in functional somatosensory mapping and suffer from postnatal growth retardation and abnormal behavior (Hirayama et al., 2012). A different mouse model demonstrates that the cohesin subunit SA1 positively regulates neuronal protocadherin gene expression (Remeseiro et al., 2012). Lymphocyte receptor loci contain hundreds of coding elements arranged over large genomic regions. To make functional lymphocyte receptors, these regions must be rearranged by a somatic recombination process mediated by the transposon-derived recombinases Rag1 and Rag2 (Schatz, 2004). Rag2 links chromatin structure to the somatic rearrangement of lymphocyte receptor gene loci due to its selective interaction with H3K4me3 (Matthews et al., 2007). Recruitment of Rag2 by transcription-associated histone modifications explains why the initiation of recombination requires transcriptional activity. Regulation of this activity in a cell-type- and developmental-stage-specific manner provides a mechanism for rearranging each lymphocyte receptor locus at the appropriate time and in the appropriate cell type (Stanhope-Baker et al., 1996). Interestingly, the coordination of cell-type- and developmental-stage-specific lymphocyte receptor locus transcription requires both CTCF and cohesin (Degner et al., 2011; Seitan et al., 2011; Ribeiro de Almeida et al., 2011; reviewed by Seitan et al., 2012; Bossen et al., 2012). Furthermore, transcription and rearrangement of lymphocyte receptor loci are perturbed by the deletion of endogenous CTCF sites (Guo et al., 2011) or the introduction of ectopic CTCF sites (Shrimali et al., 2012).

CTCF Control of Noncoding RNA Transcription

The impact of CTCF and cohesin on the transcription and rearrangement of lymphocyte receptor gene loci is mediated in part by long-range interactions and in part by antisense transcription (Degner et al., 2011; Featherstone et al., 2010). This theme is reiterated at the locus encoding ataxin-7, which is flanked by a CAG/polyglutamine repeat.
When expanded, this repeat results in the neurodegenerative disorder spinocerebellar ataxia. The ataxin-7 repeat and translation start site are flanked by binding sites for CTCF, and CTCF promotes the transcription of a noncoding, convergently transcribed antisense RNA, which determines ataxin-7 promoter usage (Sopher et al., 2011). The ribosomal DNA (rDNA) locus contains hundreds of copies of rDNA genes, only some of which are actively transcribed. In addition to rDNA gene promoters, rDNA transcription is regulated by spacer promoters that give rise to noncoding RNAs and are regulated by CTCF (van de Nobelen et al., 2010). In mammals, X chromosome inactivation equalizes X-linked gene expression between XY male and XX female cells and is controlled by a genomic region designated the X-inactivation center. This region harbors two distinct chromatin segments, each centered around noncoding genes transcribed in opposite directions, Xist and Tsix. A conserved CTCF binding element positioned between these regions facilitates Xist induction and X chromosome inactivation in female cells (Spencer et al., 2011).

Transcriptional Regulation Linked to CTCF Eviction or Recruitment

Inducible noncoding RNA transcription has been reported to evict CTCF from a site upstream of the chicken lysozyme promoter (Lefevre et al., 2008). The RARβ2 gene displays an intriguing mechanism for regulated CTCF recruitment (Le May

et al., 2012). It starts with the introduction of DNA breaks by the XPG endonuclease and is followed by DNA repair, which replaces methylated with unmethylated DNA. This allows CTCF to bind and to form chromatin loops that correlate with locus transcription (Le May et al., 2012).

Regulation of RNA Polymerase Elongation and Alternative Splicing

Fay et al. (2011) have shown that local cohesin binding can impact the processivity of RNAPII. The rate of transcriptional elongation is known to affect alternative splicing (Ip et al., 2011), and CTCF can promote the inclusion of weak exons by mediating local RNAPII pausing at the alternatively spliced CD45 locus as well as genome wide (Shukla et al., 2011). Both CTCF binding and exon inclusion are sensitive to DNA methylation, linking the developmental regulation of splicing with epigenetic marks. The mechanisms described in this section are the result of detailed locus-specific studies, and their general significance remains to be tested on a genome-wide level.

Perspective

CTCF binding is often associated with constitutive DNaseI hypersensitive sites (Parelho et al., 2008). Within one species, some CTCF sites can reflect cell-type-specific chromatin states (Wang et al., 2012), but most CTCF sites are shared among different cell types (Kim et al., 2007). Most, but not all, CTCF sites attract cohesin and, although the mechanisms of selective cohesin recruitment by CTCF remain to be defined, it is clear that, in isolation, CTCF-associated cohesin sites are relatively static among diverse cell types and tissues. On the scale of evolutionary time, the ancient and ongoing remodeling of the mammalian genome by repeat elements that carry CTCF ensures that even these stable CTCF-cohesin anchorages diverge between species.
In contrast, cohesin binding at enhancers and promoters is often cell-type specific and thus reflects the dynamic transcriptional state of different cell types (Kagey et al., 2010; Schmidt et al., 2010a; Kim et al., 2005; Rada-Iglesias et al., 2011; Heintzman et al., 2009; Visel et al., 2009; Shen et al., 2012). The interaction of cohesin with both CTCF and active enhancers and promoters can be thought of as a unifying mechanism that links the rapidly evolving binding of tissue-specific transcription factors with the more developmentally and evolutionarily stable binding of CTCF into networks of long-range interactions that reflect and promote the transcriptional programs of specific cell types.
ACKNOWLEDGMENTS

We thank Vlad Seitan and other lab members for suggestions and Anthony Lewis (MRC Clinical Sciences Centre) and Michelle Ward (University of Cambridge) for help with graphics, and we apologize to our colleagues whose work could not be cited due to space constraints. This work was supported by the Medical Research Council, UK (M.M.), the Wellcome Trust (M.M.), Cancer Research UK (D.T.O.), European Research Council (D.T.O.), and EMBO Young Investigators Programme (D.T.O.).

REFERENCES

Baniahmad, A., Steiner, C., Köhne, A.C., and Renkawitz, R. (1990). Modular structure of a chicken lysozyme silencer: involvement of an unusual thyroid hormone receptor binding site. Cell 61, 505–514.

Bannister, A.J., and Kouzarides, T. (2011). Regulation of chromatin by histone modifications. Cell Res. 21, 381–395.
Bell, A.C., and Felsenfeld, G. (2000). Methylation of a CTCF-dependent boundary controls imprinted expression of the Igf2 gene. Nature 405, 482–485.
Bell, A.C., West, A.G., and Felsenfeld, G. (1999). The protein CTCF is required for the enhancer blocking activity of vertebrate insulators. Cell 98, 387–396.
Bell, O., Tiwari, V.K., Thomä, N.H., and Schübeler, D. (2011). Determinants and dynamics of genome accessibility. Nat. Rev. Genet. 12, 554–564.
Bernstein, B.E., Kamal, M., Lindblad-Toh, K., Bekiranov, S., Bailey, D.K., Huebert, D.J., McMahon, S., Karlsson, E.K., Kulbokas, E.J., 3rd, Gingeras, T.R., et al. (2005). Genomic maps and comparative analysis of histone modifications in human and mouse. Cell 120, 169–181.
Beygo, J., Citro, V., Sparago, A., De Crescenzo, A., Cerrato, F., Heitmann, M., Rademacher, K., Guala, A., Enklaar, T., Anichini, C., et al. (2013). The molecular function and clinical phenotype of partial deletions of the IGF2/H19 imprinting control region depends on the spatial arrangement of the remaining CTCF-binding sites. Hum. Mol. Genet. 22, 544–557.
Bird, A.P. (1980). DNA methylation and the frequency of CpG in animal DNA. Nucleic Acids Res. 8, 1499–1504.
Borneman, A.R., Leigh-Bell, J.A., Yu, H., Bertone, P., Gerstein, M., and Snyder, M. (2006). Target hub proteins serve as master regulators of development in yeast. Genes Dev. 20, 435–448.
Bossen, C., Mansson, R., and Murre, C. (2012). Chromatin topology and the regulation of antigen receptor assembly. Annu. Rev. Immunol. 30, 337–356.
Bourque, G., Leong, B., Vega, V.B., Chen, X., Lee, Y.L., Srinivasan, K.G., Chew, J.L., Ruan, Y., Wei, C.L., Ng, H.H., and Liu, E.T. (2008). Evolution of the mammalian transcription factor binding repertoire via transposable elements. Genome Res. 18, 1752–1762.
Boyle, S., Rodesch, M.J., Halvensleben, H.A., Jeddeloh, J.A., and Bickmore, W.A. (2011). Fluorescence in situ hybridization with high-complexity repeat-free oligonucleotide probes generated by massively parallel synthesis. Chromosome Res. 19, 901–909.
Brockdorff, N. (2011). Chromosome silencing mechanisms in X-chromosome inactivation: unknown unknowns. Development 138, 5057–5065.
Burcin, M., Arnold, R., Lutz, M., Kaiser, B., Runge, D., Lottspeich, F., Filippova, G.N., Lobanenkov, V.V., and Renkawitz, R. (1997). Negative protein 1, which is required for function of the chicken lysozyme gene silencer in conjunction with hormone receptors, is identical to the multivalent zinc finger repressor CTCF. Mol. Cell. Biol. 17, 1281–1288.
Campos, E.I., and Reinberg, D. (2009). Histones: annotating chromatin. Annu. Rev. Genet. 43, 559–599.
Chen, H.S., Wikramasinghe, P., Showe, L., and Lieberman, P.M. (2012). Cohesins repress Kaposi's sarcoma-associated herpesvirus immediate early gene transcription during latency. J. Virol. 86, 9454–9464.
Choufani, S., Shuman, C., and Weksberg, R. (2010). Beckwith-Wiedemann syndrome. Am. J. Med. Genet. C. Semin. Med. Genet. 154C, 343–354.
Church, D.M., Goodstadt, L., Hillier, L.W., Zody, M.C., Goldstein, S., She, X., Bult, C.J., Agarwala, R., Cherry, J.L., DiCuccio, M., et al.; Mouse Genome Sequencing Consortium. (2009). Lineage-specific biology revealed by a finished genome assembly of the mouse. PLoS Biol. 7, e1000112.
Ciofani, M., Madar, A., Galan, C., Sellars, M., Mace, K., Pauli, F., Agarwal, A., Huang, W., Parkurst, C.N., Muratet, M., et al. (2012). A validated regulatory network for Th17 cell specification. Cell 151, 289–303.
Cohen, N.M., Kenigsberg, E., and Tanay, A. (2011). Primate CpG islands are maintained by heterogeneous evolutionary regimes involving minimal selection. Cell 145, 773–786.
Conaway, R.C., and Conaway, J.W. (2011). Function and regulation of the Mediator complex. Curr. Opin. Genet. Dev. 21, 225–230.
Cuddapah, S., Jothi, R., Schones, D.E., Roh, T.Y., Cui, K., and Zhao, K. (2009). Global analysis of the insulator binding protein CTCF in chromatin barrier regions reveals demarcation of active and repressive domains. Genome Res. 19, 24–32.


Deardorff, M.A., Bando, M., Nakato, R., Watrin, E., Itoh, T., Minamino, M., Saitoh, K., Komata, M., Katou, Y., Clark, D., et al. (2012). HDAC8 mutations in Cornelia de Lange syndrome affect the cohesin acetylation cycle. Nature 489, 313–317.
Degner, S.C., Verma-Gaur, J., Wong, T.P., Bossen, C., Iverson, G.M., Torkamani, A., Vettermann, C., Lin, Y.C., Ju, Z., Schulz, D., et al. (2011). CCCTC-binding factor (CTCF) and cohesin influence the genomic architecture of the Igh locus and antisense transcription in pro-B cells. Proc. Natl. Acad. Sci. USA 108, 9566–9571.
Dekker, J. (2008). Gene regulation in the third dimension. Science 319, 1793–1794.
Dixon, J.R., Selvaraj, S., Yue, F., Kim, A., Li, Y., Shen, Y., Hu, M., Liu, J.S., and Ren, B. (2012). Topological domains in mammalian genomes identified by analysis of chromatin interactions. Nature 485, 376–380.
Dostie, J., and Bickmore, W.A. (2012). Chromosome organization in the nucleus - charting new territory across the Hi-Cs. Curr. Opin. Genet. Dev. 22, 125–131.
Dostie, J., Richmond, T.A., Arnaout, R.A., Selzer, R.R., Lee, W.L., Honan, T.A., Rubio, E.D., Krumm, A., Lamb, J., Nusbaum, C., et al. (2006). Chromosome Conformation Capture Carbon Copy (5C): a massively parallel solution for mapping interactions between genomic elements. Genome Res. 16, 1299–1309.
Dunham, I., Kundaje, A., Aldred, S.F., Collins, P.J., Davis, C.A., Doyle, F., Epstein, C.B., Frietze, S., Harrow, J., Kaul, R., et al.; ENCODE Project Consortium. (2012). An integrated encyclopedia of DNA elements in the human genome. Nature 489, 57–74.
Faure, A.J., Schmidt, D., Watt, S., Schwalie, P.C., Wilson, M.D., Xu, H., Ramsay, R.G., Odom, D.T., and Flicek, P. (2012). Cohesin regulates tissue-specific expression by stabilising highly occupied cis-regulatory modules. Genome Res. 22, 2163–2175.
Fay, A., Misulovin, Z., Li, J., Schaaf, C.A., Gause, M., Gilmour, D.S., and Dorsett, D. (2011). Cohesin selectively binds and regulates genes with paused RNA polymerase. Curr. Biol. 21, 1624–1634.
Featherstone, K., Wood, A.L., Bowen, A.J., and Corcoran, A.E. (2010). The mouse immunoglobulin heavy chain V-D intergenic sequence contains insulators that may regulate ordered V(D)J recombination. J. Biol. Chem. 285, 9327–9338.
Fedoriw, A.M., Stein, P., Svoboda, P., Schultz, R.M., and Bartolomei, M.S. (2004). Transgenic RNAi reveals essential function for CTCF in H19 gene imprinting. Science 303, 238–240.
Fedoroff, N.V. (2012). Presidential address. Transposable elements, epigenetics, and genome evolution. Science 338, 758–767.
Filippova, G.N., Fagerlie, S., Klenova, E.M., Myers, C., Dehner, Y., Goodwin, G., Neiman, P.E., Collins, S.J., and Lobanenkov, V.V. (1996). An exceptionally conserved transcriptional repressor, CTCF, employs different combinations of zinc fingers to bind diverged promoter sequences of avian and mammalian c-myc oncogenes. Mol. Cell. Biol. 16, 2802–2813.
Galande, S., Purbey, P.K., Notani, D., and Kumar, P.P. (2007). The third dimension of gene regulation: organization of dynamic chromatin loopscape by SATB1. Curr. Opin. Genet. Dev. 17, 408–414.
Giles, K.E., Gowher, H., Ghirlando, R., Jin, C., and Felsenfeld, G. (2010). Chromatin boundaries, insulators, and long-range interactions in the nucleus. Cold Spring Harb. Symp. Quant. Biol. 75, 79–85.
Guo, C., Yoon, H.S., Franklin, A., Jain, S., Ebert, A., Cheng, H.L., Hansen, E., Despo, O., Bossen, C., Vettermann, C., et al. (2011). CTCF-binding elements mediate control of V(D)J recombination. Nature 477, 424–430.
Guy, J., Gan, J., Selfridge, J., Cobb, S., and Bird, A. (2007). Reversal of neurological defects in a mouse model of Rett syndrome. Science 315, 1143–1147.
Hadjur, S., Williams, L.M., Ryan, N.K., Cobb, B.S., Sexton, T., Fraser, P., Fisher, A.G., and Merkenschlager, M. (2009). Cohesins form chromosomal cis-interactions at the developmentally regulated IFNG locus. Nature 460, 410–413.

Handoko, L., Xu, H., Li, G., Ngan, C.Y., Chew, E., Schnapp, M., Lee, C.W., Ye, C., Ping, J.L., Mulawadi, F., et al. (2011). CTCF-mediated functional chromatin interactome in pluripotent cells. Nat. Genet. 43, 630–638.
Harbison, C.T., Gordon, D.B., Lee, T.I., Rinaldi, N.J., Macisaac, K.D., Danford, T.W., Hannett, N.M., Tagne, J.B., Reynolds, D.B., Yoo, J., et al. (2004). Transcriptional regulatory code of a eukaryotic genome. Nature 431, 99–104.
Hark, A.T., Schoenherr, C.J., Katz, D.J., Ingram, R.S., Levorse, J.M., and Tilghman, S.M. (2000). CTCF mediates methylation-sensitive enhancer-blocking activity at the H19/Igf2 locus. Nature 405, 486–489.
He, Q., Bardet, A.F., Patton, B., Purvis, J., Johnston, J., Paulson, A., Gogol, M., Stark, A., and Zeitlinger, J. (2011). High conservation of transcription factor binding and evidence for combinatorial regulation across six Drosophila species. Nat. Genet. 43, 414–420.
Heintzman, N.D., Hon, G.C., Hawkins, R.D., Kheradpour, P., Stark, A., Harp, L.F., Ye, Z., Lee, L.K., Stuart, R.K., Ching, C.W., et al. (2009). Histone modifications at human enhancers reflect global cell-type-specific gene expression. Nature 459, 108–112.
Hesselberth, J.R., Chen, X., Zhang, Z., Sabo, P.J., Sandstrom, R., Reynolds, A.P., Thurman, R.E., Neph, S., Kuehn, M.S., Noble, W.S., et al. (2009). Global mapping of protein-DNA interactions in vivo by digital genomic footprinting. Nat. Methods 6, 283–289.
Hirayama, T., Tarusawa, E., Yoshimura, Y., Galjart, N., and Yagi, T. (2012). CTCF is required for neural development and stochastic expression of clustered Pcdh genes in neurons. Cell Rep. 2, 345–357.
Holdorf, M.M., Cooper, S.B., Yamamoto, K.R., and Miranda, J.J. (2011). Occupancy of chromatin organizers in the Epstein-Barr virus genome. Virology 415, 1–5.
Hughes, D.J., Marendy, E.M., Dickerson, C.A., Yetming, K.D., Sample, C.E., and Sample, J.T. (2012). Contributions of CTCF and DNA methyltransferases DNMT1 and DNMT3B to Epstein-Barr virus restricted latency. J. Virol. 86, 1034–1045.
Ip, J., Schmidt, D., Pan, Q., Ramani, A., Fraser, A., Odom, D.T., and Blencowe, B. (2011). Global impact of RNA polymerase II elongation inhibition on alternative splicing regulation. Genome Res. 21, 390–401.
Jang, M.K., Mochizuki, K., Zhou, M., Jeong, H.S., Brady, J.N., and Ozato, K. (2005). The bromodomain protein Brd4 is a positive regulatory component of P-TEFb and stimulates RNA polymerase II-dependent transcription. Mol. Cell 19, 523–534.
Johnson, D.S., Mortazavi, A., Myers, R.M., and Wold, B. (2007). Genome-wide mapping of in vivo protein-DNA interactions. Science 316, 1497–1502.
Kagey, M.H., Newman, J.J., Bilodeau, S., Zhan, Y., Orlando, D.A., van Berkum, N.L., Ebmeier, C.C., Goossens, J., Rahl, P.B., Levine, S.S., et al. (2010). Mediator and cohesin connect gene expression and chromatin architecture. Nature 467, 430–435.
Kanduri, C., Pant, V., Loukinov, D., Pugacheva, E., Qi, C.F., Wolffe, A., Ohlsson, R., and Lobanenkov, V.V. (2000). Functional association of CTCF with the insulator upstream of the H19 gene is parent of origin-specific and methylation-sensitive. Curr. Biol. 10, 853–856.
Kang, H., Wiedmer, A., Yuan, Y., Robertson, E., and Lieberman, P.M. (2011). Coordination of KSHV latent and lytic gene control by CTCF-cohesin mediated chromosome conformation. PLoS Pathog. 7, e1002140.
Kim, T.H., Barrera, L.O., Zheng, M., Qu, C., Singer, M.A., Richmond, T.A., Wu, Y., Green, R.D., and Ren, B. (2005). A high-resolution map of active promoters in the human genome. Nature 436, 876–880.
Kim, T.H., Abdullaev, Z.K., Smith, A.D., Ching, K.A., Loukinov, D.I., Green, R.D., Zhang, M.Q., Lobanenkov, V.V., and Ren, B. (2007). Analysis of the vertebrate insulator protein CTCF-binding sites in the human genome. Cell 128, 1231–1245.
Klenova, E.M., Nicolas, R.H., Paterson, H.F., Carne, A.F., Heath, C.M., Goodwin, G.H., Neiman, P.E., and Lobanenkov, V.V. (1993). CTCF, a conserved nuclear factor required for optimal transcriptional activity of the chicken c-myc gene, is an 11-Zn-finger protein differentially expressed in multiple forms. Mol. Cell. Biol. 13, 7612–7624.

1294 Cell 152, March 14, 2013 ©2013 Elsevier Inc.

Kunarso, G., Chia, N.Y., Jeyakani, J., Hwang, C., Lu, X., Chan, Y.S., Ng, H.H., and Bourque, G. (2010). Transposable elements have rewired the core regulatory network of human embryonic stem cells. Nat. Genet. 42, 631634. ` res, P., and Egly, J.M. (2012). XPG and Le May, N., Fradin, D., Iltis, I., Bougne XPF endonucleases trigger chromatin looping and DNA demethylation for accurate expression of activated genes. Mol. Cell 47, 622632. Lee, T.I., Rinaldi, N.J., Robert, F., Odom, D.T., Bar-Joseph, Z., Gerber, G.K., Hannett, N.M., Harbison, C.T., Thompson, C.M., Simon, I., et al. (2002). Transcriptional regulatory networks in Saccharomyces cerevisiae. Science 298, 799804. Lefevre, P., Witham, J., Lacroix, C.E., Cockerill, P.N., and Bonifer, C. (2008). The LPS-induced transcriptional upregulation of the chicken lysozyme locus involves CTCF eviction and noncoding RNA transcription. Mol. Cell 32, 129139. Li, G., Ruan, X., Auerbach, R.K., Sandhu, K.S., Zheng, M., Wang, P., Poh, H.M., Goh, Y., Lim, J., Zhang, J., et al. (2012). Extensive promoter-centered chromatin interactions provide a topological basis for transcription regulation. Cell 148, 8498. Lieberman-Aiden, E., van Berkum, N.L., Williams, L., Imakaev, M., Ragoczy, T., Telling, A., Amit, I., Lajoie, B.R., Sabo, P.J., Dorschner, M.O., et al. (2009). Comprehensive mapping of long-range interactions reveals folding principles of the human genome. Science 326, 289293. Lin, T., Chao, C., Saito, S., Mazur, S.J., Murphy, M.E., Appella, E., and Xu, Y. (2005). p53 induces differentiation of mouse embryonic stem cells by suppressing Nanog expression. Nat. Cell Biol. 7, 165171. Lindblad-Toh, K., Garber, M., Zuk, O., Lin, M.F., Parker, B.J., Washietl, S., Kheradpour, P., Ernst, J., Jordan, G., Mauceli, E., et al.; Broad Institute Sequencing Platform and Whole Genome Assembly Team; Baylor College of Medicine Human Genome Sequencing Center Sequencing Team; Genome Institute at Washington University. (2011). 
A high-resolution map of human evolutionary constraint using 29 mammals. Nature 478, 476482. Lobanenkov, V.V., Nicolas, R.H., Adler, V.V., Paterson, H., Klenova, E.M., Polotskaja, A.V., and Goodwin, G.H. (1990). A novel sequence-specic DNA binding protein which interacts with three regularly spaced direct repeats of the CCCTC-motif in the 50 -anking sequence of the chicken c-myc gene. Oncogene 5, 17431753. pez-Flores, I., and Garrido-Ramos, M.A. (2010). The repetitive DNA content Lo of eukaryotic genomes. Genome Dyn. 7, 128. Loukinov, D.I., Pugacheva, E., Vatolin, S., Pack, S.D., Moon, H., Chernukhin, I., Mannan, P., Larsson, E., Kanduri, C., Vostrov, A.A., et al. (2002). BORIS, a novel male germ-line-specic protein associated with epigenetic reprogramming events, shares the same 11-zinc-nger domain with CTCF, the insulator protein involved in reading imprinting marks in the soma. Proc. Natl. Acad. Sci. USA 99, 68066811. Lusk, R.W., and Eisen, M.B. (2010). Evolutionary mirages: selection on binding site composition creates the illusion of conserved grammars in Drosophila enhancers. PLoS Genet. 6, e1000829. Lynch, M. (2007). The frailty of adaptive hypotheses for the origins of organismal complexity. Proc. Natl. Acad. Sci. USA 104(Suppl 1), 85978604. MacArthur, S., Li, X.Y., Li, J., Brown, J.B., Chu, H.C., Zeng, L., Grondona, B.P., nen, S.V., et al. (2009). Developmental roles of Hechmer, A., Simirenko, L., Kera 21 Drosophila transcription factors are determined by quantitative differences in binding to an overlapping set of thousands of genomic regions. Genome Biol. 10, R80. Magdinier, F., Yusufzai, T.M., and Felsenfeld, G. (2004). Both CTCF-dependent and -independent insulators are found between the mouse T cell receptor alpha and Dad1 genes. J. Biol. Chem. 279, 2538125389. n-Maiques, S., Han, S., Champagne, K.S., Matthews, A.G., Kuo, A.J., Ramo Ivanov, D., Gallardo, M., Carney, D., Cheung, P., Ciccone, D.N., et al. (2007). 
RAG2 PHD nger couples histone H3 lysine 4 trimethylation with V(D)J recombination. Nature 450, 11061110. Morey, L., and Helin, K. (2010). Polycomb group protein-mediated repression of transcription. Trends Biochem. Sci. 35, 323332.

Mortazavi, A., Leeper Thompson, E.C., Garcia, S.T., Myers, R.M., and Wold, B. (2006). Comparative genomics modeling of the NRSF/REST repressor network: from single conserved sites to genome-wide repertoire. Genome Res. 16, 12081221. Murrell, A., Heeson, S., and Reik, W. (2004). Interaction between differentially methylated regions partitions the imprinted genes Igf2 and H19 into parentspecic chromatin loops. Nat. Genet. 36, 889893. Nasmyth, K., and Haering, C.H. (2009). Cohesin: its roles and mechanisms. Annu. Rev. Genet. 43, 525558. Nativio, R., Wendt, K.S., Ito, Y., Huddleston, J.E., Uribe-Lewis, S., Woodne, K., Krueger, C., Reik, W., Peters, J.M., and Murrell, A. (2009). Cohesin is required for higher-order chromatin conformation at the imprinted IGF2-H19 locus. PLoS Genet. 5, e1000739. Neph, S., Vierstra, J., Stergachis, A.B., Reynolds, A.P., Haugen, E., Vernot, B., Thurman, R.E., John, S., Sandstrom, R., Johnson, A.K., et al. (2012). An expansive human regulatory lexicon encoded in transcription factor footprints. Nature 489, 8390. Nguyen, P., Cui, H., Bisht, K.S., Sun, L., Patel, K., Lee, R.S., Kugoh, H., Oshimura, M., Feinberg, A.P., and Gius, D. (2008). CTCFL/BORIS is a methylation-independent DNA-binding protein that preferentially binds to the paternal H19 differentially methylated region. Cancer Res. 68, 55465551. Noonan, J.P., and McCallion, A.S. (2010). Genomics of long-range regulatory elements. Annu. Rev. Genomics Hum. Genet. 11, 123. Noordermeer, D., Leleu, M., Splinter, E., Rougemont, J., De Laat, W., and Duboule, D. (2011). The dynamic architecture of Hox gene clusters. Science 334, 222225. Nora, E.P., Lajoie, B.R., Schulz, E.G., Giorgetti, L., Okamoto, I., Servant, N., Piolot, T., van Berkum, N.L., Meisig, J., Sedat, J., et al. (2012). Spatial partitioning of the regulatory landscape of the X-inactivation centre. Nature 485, 381385. 
Odom, D.T., Zizlsperger, N., Gordon, D.B., Bell, G.W., Rinaldi, N.J., Murray, H.L., Volkert, T.L., Schreiber, J., Rolfe, P.A., Gifford, D.K., et al. (2004). Control of pancreas and liver gene expression by HNF transcription factors. Science 303, 13781381. Odom, D.T., Dowell, R.D., Jacobsen, E.S., Nekludova, L., Rolfe, P.A., Danford, T.W., Gifford, D.K., Fraenkel, E., Bell, G.I., and Young, R.A. (2006). Core transcriptional regulatory circuitry in human hepatocytes. Mol. Syst. Biol. 2, 2006.0017. Parelho, V., Hadjur, S., Spivakov, M., Leleu, M., Sauer, S., Gregson, H.C., Jarmuz, A., Canzonetta, C., Webster, Z., Nesterova, T., et al. (2008). Cohesins functionally associate with CTCF on mammalian chromosome arms. Cell 132, 422433. Pauli, A., Althoff, F., Oliveira, R.A., Heidmann, S., Schuldiner, O., Lehner, C.F., Dickson, B.J., and Nasmyth, K. (2008). Cell-type-specic TEV protease cleavage reveals cohesin functions in Drosophila neurons. Dev. Cell 14, 239251. Pauli, A., van Bemmel, J.G., Oliveira, R.A., Itoh, T., Shirahige, K., van Steensel, B., and Nasmyth, K. (2010). A direct role for cohesin in gene regulation and ecdysone response in Drosophila salivary glands. Curr. Biol. 20, 17871798. Phillips, J.E., and Corces, V.G. (2009). CTCF: master weaver of the genome. Cell 137, 11941211. Rada-Iglesias, A., Bajpai, R., Swigut, T., Brugmann, S.A., Flynn, R.A., and Wysocka, J. (2011). A unique chromatin signature uncovers early developmental enhancers in humans. Nature 470, 279283. mez-Lo pez, G., Pisano, D.G., and Losada, A. Remeseiro, S., Cuadrado, A., Go (2012). A unique role of cohesin-SA1 in gene regulation and development. EMBO J. 31, 20902102. Rhee, H.S., and Pugh, B.F. (2011). Comprehensive genome-wide protein-DNA interactions detected at single-nucleotide resolution. Cell 147, 14081419. Ribeiro de Almeida, C., Stadhouders, R., de Bruijn, M.J., Bergen, I.M., Thongjuea, S., Lenhard, B., van Ijcken, W., Grosveld, F., Galjart, N., Soler, E., and Hendriks, R.W. (2011). 
The DNA-binding protein CTCF limits proximal Vk

Cell 152, March 14, 2013 2013 Elsevier Inc. 1295

recombination and restricts k enhancer interactions to the immunoglobulin k light chain locus. Immunity 35, 501513. Rollins, R.A., Morcillo, P., and Dorsett, D. (1999). Nipped-B, a Drosophila homologue of chromosomal adherins, participates in activation by remote enhancers in the cut and Ultrabithorax genes. Genetics 152, 577593. Rubio, E.D., Reiss, D.J., Welcsh, P.L., Disteche, C.M., Filippova, G.N., Baliga, N.S., Aebersold, R., Ranish, J.A., and Krumm, A. (2008). CTCF physically links cohesin to chromatin. Proc. Natl. Acad. Sci. USA 105, 83098314. Samstein, R.M., Arvey, A., Josefowicz, S.Z., Peng, X., Reynolds, A., Sandstrom, R., Neph, S., Sabo, P., Kim, J.M., Liao, W., Li, M.O., Leslie, C., Stamatoyannopoulos, J.A., and Rudensky, A.Y. (2012). Foxp3 exploits a pre-existent enhancer landscape for regulatory T cell lineage specication. Cell 151, 153166. Sanyal, A., Lajoie, B.R., Jain, G., and Dekker, J. (2012). The long-range interaction landscape of gene promoters. Nature 489, 109113. Schatz, D.G. (2004). Antigen receptor genes and the evolution of a recombinase. Semin. Immunol. 16, 245256. Schmidt, D., Schwalie, P.C., Ross-Innes, C.S., Hurtado, A., Brown, G.D., Carroll, J.S., Flicek, P., and Odom, D.T. (2010a). A CTCF-independent role for cohesin in tissue-specic transcription. Genome Res. 20, 578588. Schmidt, D., Wilson, M.D., Ballester, B., Schwalie, P.C., Brown, G.D., Marshall, A., Kutter, C., Watt, S., Martinez-Jimenez, C.P., Mackay, S., et al. (2010b). Five-vertebrate ChIP-seq reveals the evolutionary dynamics of transcription factor binding. Science 328, 10361040. Schmidt, D., Schwalie, P.C., Wilson, M.D., Ballester, B., Gonc alves, A., Kutter, C., Brown, G.D., Marshall, A., Flicek, P., and Odom, D.T. (2012a). Waves of retrotransposon expansion remodel genome organization and CTCF binding in multiple mammalian lineages. Cell 148, 335348. Schmidt, E.F., Warner-Schmidt, J.L., Otopalik, B.G., Pickett, S.B., Greengard, P., and Heintz, N. (2012b). 
Identication of the cortical neurons that mediate antidepressant responses. Cell 149, 11521163. Schoenborn, J.R., Dorschner, M.O., Sekimata, M., Santer, D.M., Shnyreva, M., Fitzpatrick, D.R., Stamatoyannopoulos, J.A., and Wilson, C.B. (2007). Comprehensive epigenetic proling identies multiple distal regulatory elements directing transcription of the gene encoding interferon-gamma. Nat. Immunol. 8, 732742. Schuldiner, O., Berdnik, D., Levy, J.M., Wu, J.S., Luginbuhl, D., Gontang, A.C., and Luo, L. (2008). piggyBac-based mosaic screen identies a postmitotic function for cohesin in regulating developmental axon pruning. Dev. Cell 14, 227238. Segal, E., and Widom, J. (2009). From DNA sequence to transcriptional behaviour: a quantitative approach. Nat. Rev. Genet. 10, 443456. Seitan, V.C., Hao, B., Tachibana-Konwalski, K., Lavagnolli, T., Mira-Bontenbal, H., Brown, K.E., Teng, G., Carroll, T., Terry, A., Horan, K., et al. (2011). A role for cohesin in T cell receptor rearrangement and thymocyte differentiation. Nature 476, 467471. Seitan, V.C., Krangel, M.S., and Merkenschlager, M. (2012). Cohesin, CTCF and lymphocyte antigen receptor locus rearrangement. Trends Immunol. 33, 153159. rez-Melgosa, M., Miller, S.A., Weinmann, A.S., Sabo, P.J., Sekimata, M., Pe Sandstrom, R., Dorschner, M.O., Stamatoyannopoulos, J.A., and Wilson, C.B. (2009). CCCTC-binding factor and the transcription factor T-bet orchestrate T helper 1 cell-specic structure and function at the interferon-gamma locus. Immunity 31, 551564. llner, S., Church, D.M., and Eichler, E.E. (2008). Mouse She, X., Cheng, Z., Zo segmental duplication and copy number variation. Nat. Genet. 40, 909914. Shen, Y., Yue, F., McCleary, D.F., Ye, Z., Edsall, L., Kuan, S., Wagner, U., Dixon, J., Lee, L., Lobanenkov, V.V., and Ren, B. (2012). A map of the cis-regulatory sequences in the mouse genome. Nature 488, 116120. Shrimali, S., Srivastava, S., Varma, G., Grinberg, A., Pfeifer, K., and Srivastava, M. (2012). 
An ectopic CTCF-dependent transcriptional insulator inuences the choice of Vb gene segments for VDJ recombination at TCRb locus. Nucleic Acids Res. 40, 77537765.

Shukla, S., Kavak, E., Gregory, M., Imashimizu, M., Shutinoski, B., Kashlev, M., Oberdoerffer, P., Sandberg, R., and Oberdoerffer, S. (2011). CTCF-promoted RNA polymerase II pausing links DNA methylation to splicing. Nature 479, 7479. Simonis, N., Rual, J.F., Carvunis, A.R., Tasan, M., Lemmens, I., Hirozane-Kishikawa, T., Hao, T., Sahalie, J.M., Venkatesan, K., Gebreab, F., et al. (2009). Empirically controlled mapping of the Caenorhabditis elegans protein-protein interactome network. Nat. Methods 6, 4754. Siomi, H., and Siomi, M.C. (2011). Stress signaling etches heritable marks on chromatin. Cell 145, 10051007. Skibbens, R.V. (2009). Establishment of sister chromatid cohesion. Curr. Biol. 19, R1126R1132. Sopher, B.L., Ladd, P.D., Pineda, V.V., Libby, R.T., Sunkin, S.M., Hurley, J.B., Thienes, C.P., Gaasterland, T., Filippova, G.N., and La Spada, A.R. (2011). CTCF regulates ataxin-7 expression through promotion of a convergently transcribed, antisense noncoding RNA. Neuron 70, 10711084. Sou, A., Donahue, G., and Zaret, K.S. (2012). Facilitators and impediments of the pluripotency reprogramming factors initial engagement with the genome. Cell 151, 9941004. Soutourina, J., Wydau, S., Ambroise, Y., Boschiero, C., and Werner, M. (2011). Direct interaction of RNA polymerase II and mediator required for transcription in vivo. Science 331, 14511454. Spencer, R.J., del Rosario, B.C., Pinter, S.F., Lessing, D., Sadreyev, R.I., and Lee, J.T. (2011). A boundary element between Tsix and Xist binds the chromatin insulator Ctcf and contributes to initiation of X-chromosome inactivation. Genetics 189, 441454. Splinter, E., Heath, H., Kooren, J., Palstra, R.J., Klous, P., Grosveld, F., Galjart, N., and de Laat, W. (2006). CTCF mediates long-range chromatin looping and local histone modication in the beta-globin locus. Genes Dev. 20, 23492354. 
ler, A., Stadler, M.B., Murr, R., Burger, L., Ivanek, R., Lienert, F., Scho van Nimwegen, E., Wirbelauer, C., Oakeley, E.J., Gaidatzis, D., et al. (2011). DNA-binding factors shape the mouse methylome at distal regulatory regions. Nature 480, 490495. Stanhope-Baker, P., Hudson, K.M., Shaffer, A.L., Constantinescu, A., and Schlissel, M.S. (1996). Cell type-specic chromatin structure determines the targeting of V(D)J recombinase activity in vitro. Cell 85, 887897. Stedman, W., Kang, H., Lin, S., Kissil, J.L., Bartolomei, M.S., and Lieberman, P.M. (2008). Cohesins localize with CTCF at the KSHV latency control region and at cellular c-myc and H19/Igf2 insulators. EMBO J. 27, 654666. Stevens, H.C., Cham, K.S., Hughes, D.J., Sun, R., Sample, J.T., Bubb, V.J., Stewart, J.P., and Quinn, J.P. (2012). CTCF and Sp1 interact with the Murine gammaherpesvirus 68 internal repeat elements. Virus Genes 45, 265273. Strachan, T. (2005). Cornelia de Lange Syndrome and the link between chromosomal function, DNA repair and developmental gene regulation. Curr. Opin. Genet. Dev. 15, 258264. m, L., Lindroos, H.B., Shirahige, K., and Sjo gren, C. (2004). Postreplicative Stro recruitment of cohesin to double-strand breaks is required for DNA repair. Mol. Cell 16, 10031015. Suganuma, T., and Workman, J.L. (2011). Signals and combinatorial functions of histone modications. Annu. Rev. Biochem. 80, 473499. Tempera, I., Wiedmer, A., Dheekollu, J., and Lieberman, P.M. (2010). CTCF prevents the epigenetic drift of EBV latency promoter Qp. PLoS Pathog. 6, e1001048. Thomas, M.C., and Chiang, C.M. (2006). The general transcription machinery and general cofactors. Crit. Rev. Biochem. Mol. Biol. 41, 105178. Thomson, J.P., Skene, P.J., Selfridge, J., Clouaire, T., Guy, J., Webb, S., Kerr, A.R., Deaton, A., Andrews, R., James, K.D., et al. (2010). CpG islands inuence chromatin structure via the CpG-binding protein Cfp1. Nature 464, 10821086. 
Tittel-Elmer, M., Lengronne, A., Davidson, M.B., Bacal, J., Franc ois, P., Hohl, M., Petrini, J.H., Pasero, P., and Cobb, J.A. (2012). Cohesin association to replication sites depends on rad50 and promotes fork restart. Mol. Cell 48, 98108.

1296 Cell 152, March 14, 2013 2013 Elsevier Inc.

Unal, E., Arbel-Eden, A., Sattler, U., Shroff, R., Lichten, M., Haber, J.E., and Koshland, D. (2004). DNA damage response pathway uses histone modication to assemble a double-strand break-specic cohesin domain. Mol. Cell 16, 9911002. Vahedi, G., Takahashi, H., Nakayamada, S., Sun, H., Sartorelli, V., Kanno, Y., and OShea, J.J. (2012). STATs shape the active enhancer landscape of T cell populations. Cell 151, 981993. van de Nobelen, S., Rosa-Garrido, M., Leers, J., Heath, H., Soochit, W., Joosen, L., Jonkers, I., Demmers, J., van der Reijden, M., Torrano, V., et al. (2010). CTCF regulates the local epigenetic state of ribosomal DNA repeats. Epigenetics Chromatin 3, 19. van Steensel, B. (2011). Chromatin: constructing the big picture. EMBO J. 30, 18851895. Vega, H., Waissz, Q., Gordillo, M., Sakai, N., Yanagihara, I., Yamada, M., van Gosliga, D., Kayserili, H., Xu, C., Ozono, K., et al. (2005). Roberts syndrome is caused by mutations in ESCO2, a human homolog of yeast ECO1 that is essential for the establishment of sister chromatid cohesion. Nat. Genet. 37, 468470. Vermeulen, M., Mulder, K.W., Denissov, S., Pijnappel, W.W., van Schaik, F.M., Varier, R.A., Baltissen, M.P., Stunnenberg, H.G., Mann, M., and Timmers, H.T. (2007). Selective anchoring of TFIID to nucleosomes by trimethylation of histone H3 lysine 4. Cell 131, 5869. Visel, A., Blow, M.J., Li, Z., Zhang, T., Akiyama, J.A., Holt, A., Plajzer-Frick, I., Shoukry, M., Wright, C., Chen, F., et al. (2009). ChIP-seq accurately predicts tissue-specic activity of enhancers. Nature 457, 854858.

Wallace, J.A., and Felsenfeld, G. (2007). We gather together: insulators and genome organization. Curr. Opin. Genet. Dev. 17, 400407. Walsh, C.P., and Bestor, T.H. (1999). Cytosine methylation and mammalian development. Genes Dev. 13, 2634. Wang, H., Maurano, M.T., Qu, H., Varley, K.E., Gertz, J., Pauli, F., Lee, K., Caneld, T., Weaver, M., Sandstrom, R., et al. (2012). Widespread plasticity in CTCF occupancy linked to DNA methylation. Genome Res. 22, 16801688. Wendt, K.S., Yoshida, K., Itoh, T., Bando, M., Koch, B., Schirghuber, E., Tsutsumi, S., Nagae, G., Ishihara, K., Mishiro, T., et al. (2008). Cohesin mediates transcriptional insulation by CCCTC-binding factor. Nature 451, 796801. Yang, Z., Yik, J.H., Chen, R., He, N., Jang, M.K., Ozato, K., and Zhou, Q. (2005). Recruitment of P-TEFb for stimulation of transcriptional elongation by the bromodomain protein Brd4. Mol. Cell 19, 535545. Zhang, B., Chang, J., Fu, M., Huang, J., Kashyap, R., Salavaggione, E., Jain, S., Kulkarni, S., Deardorff, M.A., Uzielli, M.L., et al. (2009). Dosage effects of cohesin regulatory factor PDS5 on mammalian development: implications for cohesinopathies. PLoS ONE 4, e5232. Zhong, X.P., and Krangel, M.S. (1999). Enhancer-blocking activity within the DNase I hypersensitive site 2 to 6 region between the TCR alpha and Dad1 genes. J. Immunol. 163, 295300.

Cell 152, March 14, 2013 2013 Elsevier Inc. 1297

Review
Long Noncoding RNAs: Cellular Address Codes in Development and Disease
Pedro J. Batista1 and Howard Y. Chang1,*
1Howard Hughes Medical Institute and Program in Epithelial Biology, Stanford University School of Medicine, Stanford, CA 94305, USA
*Correspondence: howchang@stanford.edu
http://dx.doi.org/10.1016/j.cell.2013.02.012


In biology as in real estate, location is a cardinal organizational principle that dictates the accessibility and flow of informational traffic. An essential question in nuclear organization is the nature of the address code: how objects are placed and later searched for and retrieved. Long noncoding RNAs (lncRNAs) have emerged as key components of the address code, allowing protein complexes, genes, and chromosomes to be trafficked to appropriate locations and subject to proper activation and deactivation. lncRNA-based mechanisms control cell fates during development, and their dysregulation underlies some human disorders caused by chromosomal deletions and translocations.
Introduction
From a single cell to an entire organism, spatial positioning is a key problem in biology. It is well appreciated that robust systems sort and distribute macromolecules, a property essential for the function of cells and tissues (Shevtsov and Dundr, 2011; Wolpert, 2011). A historical example illustrates the general utility of spatial organization. As the Roman Empire expanded and the Romans were faced with the need to construct cities in new lands, they developed a city prototype that included a group of answers to the many practical problems related to the creation and maintenance of a city (Figure 1A). This was a universal plan of simple execution. City walls protected the citizens from attack and delimited the city. At the center stood the forum, where the business and political activities of the city were concentrated. Fountains were placed throughout the city to supply water, and other spaces, such as amphitheaters, temples, and baths, were dedicated to organizing daily activities. Thus, a group of structures analogous in function was always present in an organization that followed the original prototype (Grimal and Woloch, 1983).

Just like the Roman city, the nucleus of the eukaryotic cell is a highly organized space (Figure 1B). Evolution gave rise to a nuclear prototype that provides answers to the many challenges the cell must meet to maintain homeostasis and growth, though subject to developmental specialization (Solovei et al., 2009). Chromosomes are not randomly organized in the nucleus, and during interphase, each chromosome occupies a discrete territory (reviewed in Cremer and Cremer, 2010). Furthermore, whereas the densely compacted heterochromatin is localized at the nuclear envelope, euchromatin localizes to the interior regions of the nucleus. Gene expression is also localized and occurs mostly at the nuclear center. In addition, active genes that are coregulated are often found forming clusters. During development, individual loci such as immunoglobulin or Hox genes are known to change position within the nucleus according to their transcriptional status (reviewed in Misteli, 2007). Large portions of the genome are partitioned into topological domains of chromatin interaction ranging from hundreds of kilobases to megabases (the resolution of current methods), within which the genes tend to be more coregulated (Dixon et al., 2012; Nora et al., 2012). The complex task of gene expression (ensuring the proper timing, space, and rate of expression) involves noncoding regions of the genome, chromatin modifications, and the arrangement of chromosomes and nuclear domains. Here, we review the evidence that lncRNAs are a rich source of molecular addresses in the eukaryotic nucleus.

Biogenesis and Characteristics
Efforts over the last decade revealed that a large fraction of the noncoding genome is transcribed. Extensive annotation of lncRNAs has been performed in multiple model organisms (reviewed in Rinn and Chang, 2012), and there is now evidence that, whereas 2% of the genome encodes proteins (IHGSC, 2004), primary transcripts cover 75% of the human genome, with processed transcripts covering 62.1% of the genome (Djebali et al., 2012). In this Review, we focus on a particular class of noncoding transcripts known as long noncoding RNAs (lncRNAs) and the roles that they play in nuclear organization. lncRNAs are currently defined as transcripts of greater than 200 nucleotides without evident protein coding function (Rinn and Chang, 2012). It is important to note that lncRNA is a broad definition that encompasses different classes of RNA transcripts, including enhancer RNAs, small nucleolar RNA (snoRNA) hosts, intergenic transcripts, and transcripts overlapping other transcripts in either sense or antisense orientation. lncRNAs predominantly localize to the nucleus and have, on average, a lower level of expression than protein coding genes, although details vary for different classes (Djebali et al., 2012; Ravasi et al., 2006).
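The working definition of a lncRNA (longer than 200 nucleotides, no evident protein coding function) can be illustrated with a minimal Python sketch. The 200-nucleotide length cutoff is taken from the text. Everything else is an assumption made only for illustration: the function names, the choice to scan the three forward reading frames only, and the 300-nucleotide limit on the longest open reading frame are hypothetical and are not the criteria used by the cited annotation studies.

```python
# Illustrative sketch only. The 200-nt length cutoff comes from the text;
# the ORF cutoff and all names below are assumptions for this example.

STOP_CODONS = {"TAA", "TAG", "TGA"}
MIN_LNCRNA_LENGTH = 200  # nucleotides, from the definition in the text
MAX_ORF_LENGTH = 300     # nucleotides, hypothetical cutoff


def longest_orf(seq: str) -> int:
    """Length in nt (stop codon included) of the longest ATG-to-stop
    open reading frame in the three forward reading frames."""
    seq = seq.upper().replace("U", "T")  # accept RNA or DNA alphabets
    best = 0
    for frame in range(3):
        start = None  # position of the first ATG of the current ORF
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i
            elif start is not None and codon in STOP_CODONS:
                best = max(best, i + 3 - start)
                start = None
    return best


def is_candidate_lncrna(seq: str) -> bool:
    """True if the transcript is long enough and lacks a long forward ORF."""
    return len(seq) > MIN_LNCRNA_LENGTH and longest_orf(seq) <= MAX_ORF_LENGTH
```

A transcript that passes such a filter is only a candidate: a sequence-based screen of this kind cannot by itself rule out translation.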
Figure 1. Comparison between a Roman City and the Cell Nucleus Reveals the Importance of Spatial Organization
(A) Depiction of the basic features of a Roman city. City walls delimit the city, with gates at the two main roads that intersect at the center of the city. The Forum was the business and political center of the city, and many buildings provided specific functions that were essential for city life.
(B) Schematic representation of the typical nuclear organization during interphase. Each chromosome occupies a discrete territory. Euchromatin localizes to the interior regions of the nucleus, and the densely compacted heterochromatin localizes near the nuclear envelope. Many specialized functions are executed in distinct regions in the nucleus, known as nuclear bodies. One example is the nucleolus, where ribosomes are assembled. Adapted from Solovei et al., 2009.

Multiple studies have shown that lncRNA expression is more cell type specific than that of protein-coding genes (Cabili et al., 2011; Djebali et al., 2012; Ravasi et al., 2006). At the DNA and chromatin level, lncRNA loci are similar to mRNA loci, but lncRNAs show a bias for having just one intron and a trend for less-efficient cotranscriptional splicing (Derrien et al., 2012; Tilgner et al., 2012). Although lncRNAs are under lower selective pressure than protein-coding genes, sequence analysis shows that lncRNAs are under higher selective pressure than ancestral repeat sequences, which are considered to be under neutral selection. Interestingly, the promoters of lncRNAs are the region of the lncRNA gene under the highest selective pressure, displaying levels of selection comparable to the promoters of protein-coding genes (Derrien et al., 2012; Guttman et al., 2009; Marques and Ponting, 2009; Ørom et al., 2010; Ponjavic et al., 2007). This analysis has also revealed a high number of correlated positions between lncRNAs in sequence alignments, an observation that fits the hypothesis that lncRNAs are under selective pressure to maintain a functional RNA structure (Derrien et al., 2012). Comparison between mammalian and zebrafish lncRNAs revealed that short stretches of conserved sequence are functionally important and that location and structure of lncRNAs can be conserved, even in the absence of strong sequence conservation. The ability to induce a loss-of-function phenotype by blocking the short conserved motif, together with the ability to rescue loss of function of two lncRNAs with the addition of human and mouse lncRNAs (Ulitsky et al., 2011), demonstrates that these in silico observations are of biological significance.

Sequence analysis of lncRNAs, focusing on presence and size of open reading frames as well as codon conservation frequency, has been used to exclude protein coding potential. Ribosome profiling, a method that enumerates transcripts associated with ribosomes, has detected many lncRNAs, but it was unclear whether these lncRNAs are just being scanned similarly to 5′ untranslated regions or actually are productively engaged in translation (Ingolia et al., 2011). Comparison of RNA sequencing (RNA-seq) data to tandem mass spectrometry data for two cell lines suggests that 92% of the annotated lncRNAs do not yield detectable peptides in these cell lines (Bánfai et al., 2012; Derrien et al., 2012). Although the differences between these two studies may stem from measuring two different endpoints, they suggest that lncRNAs have low translational potential even when ribosomes attempt to decode them.

Current annotations suggest that the actual number of lncRNAs exceeds that of protein coding genes (Derrien et al., 2012). The repertoire of roles performed by lncRNAs is growing, as there is now evidence that lncRNAs participate in multiple networks regulating gene expression and function. Several characteristics of lncRNAs make them ideally suited to provide the nucleus with a system of molecular addresses. lncRNAs, unlike proteins, can function either in cis, at the site of transcription, or in trans. An RNA-based address code may be deployed more rapidly and economically than a system that relies only on proteins. lncRNAs do not need to be translated and do not require transport between the cytoplasm and the nucleus. lncRNAs can also interact with multiple proteins, enabling scaffolding functions and combinatorial control (Wang and Chang, 2011). As such, the act of transcription can rapidly create an anchor that will lead to the formation, or remodeling, of nuclear domains through the recruitment or sequestration of proteins already present in the nuclear compartment. Using lncRNAs allows cells to create addresses that are regional-, locus-, or even allele-specific (Lee, 2009). At the regional level, lncRNAs can influence the formation of nuclear domains and the transcriptional status of an entire chromosome, and they can participate in the interaction of two different chromosomal regions.
At a more ne-grained level, lncRNAs can control the chromatin state and activity of a chromosomal locus or specic gene. We explore each of these concepts below with recently published examples. Locus Control of Gene Regulation Cells can use noncoding RNAs to modulate gene expression by changing the accessibility of gene promoters. These mechanisms can be used to ne-tune gene expression in response to environmental conditions or to silence a gene as part of a developmental program. First, the act of noncoding RNA (ncRNA) transcription itself can be purposed for regulatory function. For example, transcription through a regulatory sequence, such as a promoter, can block its function, a mechanism termed transcriptional interference (Figure 2A) rst identied in yeast (Martens et al., 2004). In such instances, the lncRNA promoter is nely tuned to receive appropriate inputs to exert regulatory function; the lncRNA product is typically a faithful biomarker of transcriptional interference in action but is not required for its success. In conditions that limit vegetative growth, diploid S. cerevisiae cells enter sporulation, a differentiation program that results in the formation of haploid daughter cells. Entry into meiosis has catastrophic consequences in haploid cells and is therefore inhibited via a transcriptional interference mechanism. A transcription factor in haploid cells activates the expression of IRT1(SUT643),
Cell 152, March 14, 2013 ©2013 Elsevier Inc. 1299

Figure 2. Functional Modules of lncRNAs in the Nucleus


(A) The act of transcription at noncoding regions can modulate gene expression through the recruitment of chromatin modifiers to the site of transcription. These complexes can create a local chromatin environment that facilitates or blocks the binding of other regulators. (B) lncRNAs can function in cis, recruiting protein complexes to their site of transcription and thus creating a locus-specific address. Cells can use this mechanism to repress or activate gene expression.

a noncoding RNA that overlaps the promoter of IME1, the master regulator of sporulation. Transcription of IRT1 establishes a repressive chromatin state at the IME1 promoter through the recruitment of histone methyltransferase Set2 and the histone deacetylase Set3 (van Werven et al., 2012). The use of noncoding transcription to control chromatin modification is a widespread strategy. The Set3 histone deacetylase has also been implicated in the modulation of gene induction kinetics during changes of carbon source. Transcription of ncRNAs that overlap the regulated genes leads to the establishment of H3K4me2, which recruits Set3 and leads to the deacetylation of the gene promoter. Deacetylation of the promoter results in delayed or reduced induction of the regulated genes. This mechanism is also involved in the inhibition of cryptic promoters (Kim et al., 2012). Expression of GAL10-ncRNA, driven by Reb1, leads to deacetylation across the GAL1-10 promoter, facilitating glucose repression of GAL1-10 (Houseley et al., 2008). In mammalian imprinting, the noncoding RNA Air (also known as Airn) is expressed from the paternal chromosome and is involved in silencing the paternal alleles of multiple genes. The promoter of one of these genes, Igf2r, overlaps with the Air transcriptional unit and is silenced by transcriptional interference (Latos et al., 2012). Transcriptional interference can also be used to activate gene expression by inhibiting the action of repressor elements, functioning as an antisilencing mechanism. In Drosophila embryogenesis, transcription through Polycomb response elements (PREs) alters the function of these elements, blocking the establishment of repressive chromatin (Schmitt et al., 2005). Second, lncRNAs can silence or activate gene expression in cis, acting on neighboring genes of the lncRNA locus. 
Some of the first studied examples of lncRNA function involve dosage compensation and genomic imprinting, whereby lncRNAs provide allele-specific gene regulation to differentially control two copies of the same gene within one cell (see the Review by Lee and Bartolomei on page 1308 of this issue; Lee and Bartolomei, 2013) (Figure 2B). Several such lncRNAs are now recognized to interact with and recruit histone modification complexes, including Xist (recruits PRC2 for H3K27me3 and RYBP-PRC1 for H2A ubiquitylation) and Kcnq1ot1 (recruits G9a for H3K9me3 and PRC2) (Pandey et al., 2008; Tavares et al., 2012; Zhao et al., 2010). The Air lncRNA (the transcription of which inhibits Igf2r) targets G9a and H3K9me3 to silence more distantly located genes on the paternal chromosome (Nagano et al., 2008); hence, one lncRNA gene can employ multiple mechanisms to regulate nearby and distantly located genes. In genome-wide studies, numerous lncRNAs have now been found to interact with chromatin modification complexes (Guil et al., 2012; Guttman et al., 2011; Khalil et al., 2009; Zhao et al., 2010). In the plant A. thaliana, two cold-inducible lncRNAs, COOLAIR and COLDAIR, are embedded antisense or intronic to the flowering control locus gene FLC, and they help to recruit PRC2 to stably silence FLC in a cold-dependent manner, a key
(C) lncRNAs can function in trans and recruit protein complexes to chromatin loci away from their site of transcription. (D) lncRNAs can bind and sequester transcription factors away from their target chromosomal regions.


mechanism to ensure the proper flowering time after winter termed vernalization (reviewed in Ietswaart et al., 2012). In an analogous fashion, DNA damage induces a lncRNA from the promoter of cyclin D1 gene (CCND1); this lncRNA binds to TLS protein to allosterically inhibit histone acetyltransferase in cis, which suppresses CCND1 transcription (Wang et al., 2008). DNA methylation can occur as a long-term silencing mechanism downstream of repressive histone modifications, and lncRNAs may also guide DNA methylation in addition to histone modification. The ribosomal DNA (rDNA) loci are tandemly repeated in the genome, with some copies being transcriptionally active, whereas others are silenced by DNA methylation and histone modifications. Each ribosomal DNA transcribes rRNA separated by intergenic spacers (IGSs) as a polycistronic unit, and IGSs can be processed to 150–250 nt fragments termed promoter RNAs (pRNAs) (reviewed in Bierhoff et al., 2010). pRNA serves as a platform to recruit the de novo cytosine methylase DNMT3 and the NoRC complex containing poly-ADP ribose polymerase-1 (PARP-1) to promote silencing of rDNA (Guetg et al., 2012; Mayer et al., 2006). Notably, a stretch of 20 nt in pRNA binds the rDNA promoter, forming an RNA:DNA:DNA triplex (Schmitz et al., 2010). This triplex structure is proposed to recruit DNMT3 and also serves as the specific recognition mechanism between lncRNA and genomic DNA, a model that likely applies to other lncRNA-DNA interactions (Martianov et al., 2007). A distinct family of lncRNAs serves to activate gene expression. Many active enhancer elements transcribe lncRNAs, termed eRNAs (De Santa et al., 2010; Kim et al., 2010), and several lncRNAs that are required to activate gene expression are termed enhancer-like RNAs (Ørom et al., 2010). Evf is a cis-acting lncRNA that is required for the activation of Dlx5/6 genes and generation of GABAergic interneurons in vivo (Bond et al., 2009). 
A key mechanism of lncRNA specificity in cis is the higher-order chromosomal configuration (Wang et al., 2011). The noncoding RNA HOTTIP is expressed from the 5′ tip of the HoxA locus and drives histone H3 lysine 4 trimethylation and gene transcription of HoxA distal genes through the recruitment of the WDR5/MLL complex (Wang et al., 2011). Endogenous HOTTIP is brought to its target genes by chromosomal looping, and ectopic HOTTIP only activates transcription when it is artificially tethered to the reporter gene (Wang et al., 2011). The MLL complex is also recruited to the Hox locus by the noncoding RNA Mistral, located between Hoxa6 and Hoxa7. Mistral directly interacts with MLL1, leading to changes at the chromatin level that activate Hoxa6 and Hoxa7 (Bertani et al., 2011). Hence, lncRNA interaction with MLL/Trx complexes and likely additional proteins will define their function in enforcing active chromatin states and gene activation. Third, lncRNAs can control chromatin states at distantly located genes (i.e., in trans) for both gene silencing and activation (Figure 2C). These lncRNAs bind to some of the same effector chromatin modification complexes but target them to loci genome-wide. For instance, human HOTAIR lncRNA binds to PRC2 and LSD1 complexes and couples H3K27 methylation and H3K4 demethylation activity to hundreds of sites genome-wide (Chu et al., 2011; Tsai et al., 2010). HOTAIR is located in the HOXC locus and is regulated in an anatomic position-specific fashion. Linc-p21 is induced by p53 during DNA damage and recruits hnRNPK via physical interaction to mediate p53-mediated gene repression (Huarte et al., 2010). Linc-p21 also has a recently recognized role in translational control (Yoon et al., 2012). In contrast, PANDA, another lncRNA induced by p53, acts as a decoy by binding to the transcription factor NF-YA and preventing NF-YA from activating genes encoding cell death proteins (Hung et al., 2011) (Figure 2D). lncRNA-mediated activation can also occur in trans. Jpx, an X-linked lncRNA that activates Xist expression, is important for X chromosome inactivation in female cells, and Jpx deletion can be rescued by Jpx supplied in trans (Tian et al., 2010).

Nuclear Domains

The concept of lncRNA recruitment of factors to genes may be more properly considered a two-way street, with genes being moved into specific cytotopic locations by lncRNAs. One type of molecular address can be found in the formation of nuclear domains. These are regions of the nucleus where specific functions are performed. Unlike cellular organelles, these domains are not membrane delimited. They are instead characterized by the components that form them. These domains are believed to form through molecular interactions between their components. Once a stable interaction is found, the components remain associated. These domains are often formed around the sites of transcription of RNA components, which function as molecular anchors (reviewed in Dundr and Misteli, 2010). The noncoding RNA NEAT1, an essential component of the paraspeckle, is a well-characterized example of how noncoding RNAs can function as structural components of nuclear bodies. 
Upon transcription of NEAT1, diffusible components of this domain nucleate at the site of NEAT1 accumulation, leading to the formation of the paraspeckle (Figure 3A) (Chen and Carmichael, 2009; Clemson et al., 2009; Mao et al., 2011; Sasaki et al., 2009; Shevtsov and Dundr, 2011; Sunwoo et al., 2009). Nuclear domains can be dynamically regulated in an RNA-dependent fashion. In response to serum stimulation, the demethylase KDM4C is recruited to the promoters of genes controlled by the cell-cycle-specific transcription factor E2F, where it demethylates Polycomb protein Pc2. Whereas methylated Pc2 interacts with the noncoding RNA TUG1, a component of Polycomb bodies, unmethylated Pc2 interacts with the noncoding RNA MALAT1/NEAT2, a component of interchromatin granules. Therefore, changes in the methylation status of Pc2 lead to the relocation of growth control genes from an environment that inhibits gene expression, the Polycomb body, to a domain that is permissive of gene expression, the interchromatin granule (Figure 3B). Interestingly, the reading ability of Pc2 is modulated by the noncoding RNA with which it interacts. When bound to TUG1, Pc2 reads H4R3me2s and H3K27me2, whereas it reads H2AK5ac and H2AK13ac when interacting with MALAT1/NEAT2 (Yang et al., 2011). These interplays control the growth-factor-dependent expression of cell-cycle genes in vitro, but it came as a surprise that mouse knockouts of either NEAT1 or MALAT1/NEAT2 had little overt phenotype (Eissmann et al., 2012; Nakagawa et al., 2012; Nakagawa et al., 2011; Zhang et al., 2012). Clearly, the question of redundancy or compensation in vivo needs to be addressed in the future.

structure at the 3′ end of MALAT1/NEAT2 lncRNA, which lacks a polyA tail, stabilizes the lncRNA and presumably limits its export to the cytoplasm (Brown et al., 2012; Wilusz et al., 2012). Viral nuclear lncRNAs have also adapted this strategy and hide their 3′ polyA tails in a triplex RNA structure to prevent decay (Mitton-Fry et al., 2010; Tycowski et al., 2012).

Gene Control through Sequestration

In contrast to the model of nuclear domains that concentrate and thereby facilitate molecular interactions, spatial control can also separate reactants until the moment is right. For example, certain environmental stresses trigger the retention of select proteins in the nucleolus away from their normal site of action. The retention at the nucleolus requires a signal sequence and the expression of specific noncoding RNAs expressed from the large intergenic spacer (IGS) of the rDNA repeats. IGS ncRNAs turn out to gate the responses to cellular stress. Unique IGS ncRNAs are transcriptionally induced by specific stressors, functioning as baits for proteins with specific signal sequences. Interfering with a specific IGSRNA does not affect the function of other IGSRNAs (Audas et al., 2012) (Figure 3D). In S. pombe, mRNAs and lncRNAs function together to form heterochromatin and sequester genes in the control of meiosis. During vegetative growth, the expression of meiotic genes is repressed through selective elimination of meiotic mRNAs. Meiotic genes contain within their transcripts a region known as determinant of selective removal (DSR) that determines their degradation. This sequence is recognized by Mmi1, which promotes both mRNA degradation (Harigaya et al., 2006) and formation of facultative heterochromatic islands (Zofall et al., 2012). Hence, aberrant nascent mRNAs can function in an lncRNA-like fashion to tether the formation of heterochromatin. 
Furthermore, during vegetative growth, Mei2p, an RNA-binding protein that is crucial for entry into meiosis, is kept in an inactive form. When cells commit to the meiosis expression program, Mei2p accumulates in its active form and sequesters Mmi1 to a structure known as Mei2 dot, where Mmi1 function is inhibited. The Mei2 dot forms at the sme2 locus at the site of transcription of two noncoding RNAs, meiRNA-S and meiRNA-L, which are necessary for the formation of the Mei2 dot structure and, therefore, entry into meiosis (Yamamoto, 2010).

Higher-Order Chromosomal Interactions

An intriguing possibility is that lncRNAs can regulate the three-dimensional structure of the chromosomes by facilitating the interaction of specific chromosomal loci. The act of transcription itself can influence gene expression and genome organization by promoting chromatin modifications, by recruiting gene active regions to common transcription factories, or by exposing the DNA strands to enzymatic activity. Hence, the presence of multiple lncRNA genes in a region may help chromosomal loci adopt a distinct conformation with transcriptional activation. For example, in the Hox loci, collinear expression of Hox mRNA genes and Hox lncRNAs along the chromosome is associated with the progressive recruitment of those chromosomal segments into a tightly interacting domain that is distinct from the transcriptionally silent portion of the loci (Noordermeer

Figure 3. Schematic Representation of the Cell Nucleus, Showing the Nucleolus and Chromosomal Territories
(A) Protein components of the Paraspeckle diffused throughout the nucleoplasm aggregate upon the transcription of NEAT1, forming the Paraspeckle nuclear domain. (B) Pc2 differentially binds MALAT1/NEAT2 or TUG1 depending on methylation status. Methylated Pc2 interacts with TUG1, bringing associated growth control genes to a repressive environment, the polycomb body (PcG). Unmethylated Pc2 interacts with MALAT1/NEAT2 at the interchromatin granule (ICG), where gene expression is permitted. (C) Expression of lncRNAs with snoRNA ends from the Prader-Willi syndrome locus functions as a sink for the FOX2 protein, leading to redistribution of this splicing factor in this nuclear region. (D) In response to cellular stress, transcription of specific IGSRNAs leads to the retention of targeted proteins at the nucleolus. Different types of stress lead to the retention of different proteins through the expression of specific noncoding RNAs.

Unusual processing mechanisms may explain the localization activity of certain lncRNAs. An imprinted region in chromosome 15 (15q11-q13) that had been implicated in Prader-Willi syndrome (PWS) hosts multiple intron-derived lncRNAs with small nucleolar RNAs at their ends, so-called sno-lncRNAs. It is probable that the presence of structured snoRNAs at the ends of lncRNAs stabilizes these molecules, which have no 5′ cap or polyA tail. These RNAs are retained in the nucleus and localize to, or remain near, their sites of transcription. Knockdown of sno-lncRNAs has little effect on the expression of nearby genes, suggesting that it does not affect gene expression in cis. Instead, these sno-lncRNAs seem to create a domain where the splicing factor Fox2 is enriched. These sno-lncRNAs contain multiple binding sites for Fox2, and altering the level of sno-lncRNAs led to a redistribution of Fox2 in the nucleus and changes in mRNA splicing patterns. Hence, the sno-lncRNAs appear to function as Fox2 sinks, participating in the regulation of splicing in specific subnuclear domains (Yin et al., 2012) (Figure 3C). Similarly, formation of a blunt-ended triplex RNA

prevent aneuploidy, homologous chromosomes must interact and generate stable associations. The sme2 locus plays a key role in the mutual identification of homologous chromosomes during meiosis, in addition to its role in the mitosis/meiosis switch discussed above. The meiRNA-L transcript accumulates at the sme2 locus and is necessary for robust chromosomal pairing (Ding et al., 2012). These studies suggest that noncoding RNAs can be components of a cis-acting pairing factor that allows homologous chromosomes to identify each other.

Cytoplasmic Functions

The ultimate function of mRNAs is to be translated, and like other steps of gene expression, multiple layers of posttranscriptional regulation exist in the cytoplasm (Figure 4). lncRNAs can also identify mRNAs in the cytoplasm and modulate their life cycle. Recent work has demonstrated that lncRNAs impact both the half-life and the translation of mRNAs. The lncRNA TINCR (terminal differentiation-induced ncRNA) is induced during epidermal differentiation and is required for normal induction of key mediators of epidermal differentiation. TINCR localizes to the cytoplasm, where it interacts with Staufen 1 protein (STAU1) to promote the stability of mRNAs containing the TINCR box motif (Kretz et al., 2013) (Figure 4A). Hence, the TINCR mechanism is the diametric opposite of posttranscriptional silencing by small regulatory RNAs like siRNAs or miRNAs. STAU1 can also be programmed by other lncRNAs to facilitate mRNA degradation. The half-STAU1-binding site RNAs (1/2-sbsRNAs) contain Alu elements that bind to Alu elements in the 3′ UTR of actively transcribed target genes, generating a STAU1-binding site. These mRNAs are therefore identified as STAU1-mediated messenger RNA decay (SMD) targets (Gong and Maquat, 2011) (Figure 4B). In addition, a recently identified class of lncRNA impacts gene expression by promoting translation of target mRNAs. 
Expression of antisense Uchl1 RNA leads to an increase in Uchl1 protein level without any change at the mRNA level. Antisense Uchl1 lncRNA is composed of a region that overlaps with the first 73 nucleotides of Uchl1 and two embedded repetitive sequences, one of which (SINEB2) is required for the ability of the lncRNA to induce protein translation. Under stress conditions in which cap-dependent translation is inhibited, antisense Uchl1 lncRNA, previously enriched in the nucleus, moves into the cytoplasm and hybridizes with Uchl1 mRNA to enable cap-independent translation of Uchl1. In other words, the lncRNA acts like a mobile internal ribosomal entry element to promote selective translation. Other SINEB2-containing antisense lncRNAs may function in a similar way (Carrieri et al., 2012) (Figure 4C). Conversely, lincRNA-p21 can inhibit the translation of target mRNAs. In the absence of HuR, lincRNA-p21 is stable and interacts with the mRNAs CTNNB1 and JUNB and translational repressor Rck, repressing the translation of the targeted mRNAs (Yoon et al., 2012) (Figure 4D). These emerging examples illustrate that lncRNAs can provide a rich palette of regulatory capacities in the cytoplasm.

Human Diseases

Considering the wide range of roles that lncRNAs play in cellular networks, it is not surprising that noncoding RNAs have been implicated in disease. Genome-wide association studies have

Figure 4. lncRNAs Regulate Gene Expression in the Cytoplasm


(A) The lncRNA TINCR interacts with STAU1 and target mRNAs containing the TINCR box motif, promoting their stability. (B) lncRNAs of the 1/2-sbsRNA class hybridize with Alu elements in the 3′ UTRs of target mRNAs and promote the degradation of these mRNAs. (C) Under stress conditions, the lncRNA antisense to Uchl1 moves from the nucleus to the cytoplasm and binds the 5′ end of the Uchl1 mRNA to promote its translation. (D) lincRNA-p21 interacts with and targets Rck to mRNAs, resulting in translation inhibition.

et al., 2011; Wang et al., 2011). A similar phenomenon was first appreciated in the β-globin locus and the intergenic transcripts from its locus control regions (Ashe et al., 1997). Transcription-coupled looping is likely to be related to the fact that the Mediator complex that links transcription factors to basal transcription machinery promotes long-range enhancer-promoter interactions (Kagey et al., 2010). A similar transcription-directed mechanism has also been proposed to guide DNA recombination of lymphocyte receptor genes over megabases (Verma-Gaur et al., 2012). The lncRNA transcripts are useful readouts of the chromosomal configuration but are not necessarily required for the chromosomal interactions. lncRNAs can also regulate chromosome structure through direct mechanisms. High-throughput chromosomal conformation assays revealed that the active and inactive X chromosomes adopt quite distinct conformations. The inactive X (Xi) is coated by the Xist lncRNA, which is required for choosing the inactive X chromosome. Importantly, conditional knockout of Xist has demonstrated that the folding of inactive X requires the Xist RNA. After Xist deletion, the Xi chromosome adopts a conformation that is more similar to that of the active X chromosome (Xa) without reactivation of Xi gene expression. Hence, Xist appears to regulate X chromosome structure through mechanisms other than the relocation of active genes to transcriptional factories (Splinter et al., 2011). One intriguing clue is that conditional Xist deletion also led to loss of PRC2 and H3K27me3 marks. The conformations of the two X chromosomes appear to be regulated by distinct mechanisms because PRC2 is dispensable for the topological domains of Xa (Nora et al., 2012). Whether one or several Xa-expressed lncRNAs control Xa conformation remains to be seen. lncRNAs can also regulate the interaction between chromosomes, a concept that is exemplified by S. pombe meiosis. 
In order for chromosomes to properly segregate in meiosis and

revealed that only 7% of disease- or trait-associated single-nucleotide polymorphisms (SNPs) reside in protein-coding exons, whereas 43% of trait-/disease-associated SNPs are found outside of protein-coding genes (Hindorff et al., 2009). In addition to the example of sno-lncRNAs in Prader-Willi syndrome discussed above, several recent discoveries of lncRNAs in Mendelian disorders illustrate the emerging recognition of lncRNAs in human diseases. Facioscapulohumeral muscular dystrophy (FSHD) is the third most common myopathy and is predominantly caused by a contraction in copy number of the D4Z4 repeats mapping to 4q35. The D4Z4 repeat is the target of several chromatin modifications, including H3K9me3 and H3K27me3, which are reduced in FSHD patients. Cabianca et al. found that a long array of D4Z4 repeats recruits Polycomb complexes to promote the formation of a repressive chromatin state that inhibits the expression of genes at 4q35. Loss of D4Z4 repeats results in derepression of DBE-T, a novel lncRNA that functions in cis and localizes to the FSHD locus. DBE-T recruits ASH1L (a component of MLL/TrX complex), leading to improper establishment of active chromatin and expression of genes from 4q35 (Cabianca et al., 2012). Hence, DBE-T is a lncRNA that functions as a locus control element by promoting an active chromatin domain, and FSHD results from lncRNA promoter mutations that perturb DBE-T regulation. HELLP syndrome (hemolysis, elevated liver enzymes, low platelets) is a recessively inherited life-threatening pregnancy complication. Linkage analysis narrowed the HELLP locus to a gene desert between C12orf48 and IGF1 on 12q23.2, where a single 205 kb capped and polyadenylated lncRNA is transcribed (van Dijk et al., 2012). Knockdown of this lncRNA revealed a role in the transition from G2 to mitosis and trophoblast cell invasion, although the precise mechanism is still unclear. 
Notably, morpholino oligonucleotides complementary to the mutation site in HELLP lncRNA boosted lncRNA level and reversed the gene expression and cell invasion defects. Similarly, deletions in a coding-gene desert at 16q24.1 lead to alveolar capillary dysplasia with misalignment of pulmonary veins (ACD/MPV) (Szafranski et al., 2013). This region contains a distant enhancer of FOXF1, a key regulator of lung development. This enhancer element interacts with FOXF1 in human pulmonary microvascular endothelial cells, but not in lymphoblasts, suggesting that FOXF1 expression in the lung endothelium is regulated at the level of chromatin structure. In addition to transcription-factor-binding sites, the focal deletion includes two lncRNAs expressed specifically in the lung. An intriguing possibility is that the expression of these lncRNAs, which happens specifically in the lung, contributes to the establishment of a chromatin loop that brings the enhancer in close proximity to FOXF1. Chromosomal translocations lead to inheritable structural and genetic changes and, as such, are relevant causes of genetic disease. One way that chromosomal translocations can lead to disease is through disruption of the higher-order chromatin organization and the cis-regulatory landscape. Recently, two different translocations have been identified in brachydactyly type E (BDE) that implicate lncRNA dysregulation (Maass et al., 2012). These translocations affect a regulatory region that interacts in cis with PTHLH and in trans with SOX9. Interestingly, this region is home to a lncRNA whose expression is important for the proper expression of PTHLH and SOX9. Depletion of this lncRNA (DA125942) resulted in downregulation of PTHLH and SOX9. The lncRNA interacts with both loci, and the occupancy is reduced in chromatin from BDE patients. This study demonstrates how lncRNAs and chromatin higher-order organization collaborate in the regulation of gene expression. Recognition of the roles of lncRNAs in human disease has unveiled new diagnostic and therapeutic opportunities. lncRNAs are expressed in a more tissue-specific fashion than mRNA genes, a pattern that has been found to hold true in pathologic states such as cancer (Brunner et al., 2012). lncRNA measurements could hence trace cancer metastases or circulating cancer cells to their origins. In addition, a strong connection between lncRNAs and cancer has been clearly established, as many lncRNAs are dysregulated in human cancers. The lncRNA HOTAIR is overexpressed in breast, colon, pancreas, and liver cancers, and overexpression of HOTAIR has been shown to drive breast cancer metastasis in vivo (Gupta et al., 2010; Gutschner and Diederichs, 2012). lncRNAs appear to be more structured and stable than mRNA transcripts, which facilitates their detection as free nucleic acids in body fluids such as urine and blood, knowledge already put to good use in clinically approved tests for prostate cancer (Fradet et al., 2004; Shappell, 2008; Tinzl et al., 2004). Aberrant lncRNAs can be knocked down in vivo using oligonucleotide drugs (Modarresi et al., 2012; Wheeler et al., 2012), which should spur advances in lncRNA genetics and therapeutics.

Conclusions

lncRNAs are well poised to be molecular address codes, particularly in the nucleus. On the one hand, transcription of lncRNAs is often exquisitely regulated, reflecting the particular developmental stage and external environment that the cell has experienced. 
On the other, the capacity of lncRNAs to function as guides, scaffolds, and decoys endows them with enormous regulatory potential in gene expression and for spatial control within the cell. These outstanding properties of long RNAs have already been leveraged to make designer RNA scaffolds for synthetic cell circuits (Delebecque et al., 2011). Many questions remain to be addressed in this rapidly expanding field. First, the in vivo function of most lncRNAs has not been determined. Extensive catalogs of lncRNAs have recently been described for several model organisms (Nam and Bartel, 2012; Pauli et al., 2012; Ulitsky et al., 2011), opening the door to a wide array of powerful techniques for the in vivo study of lncRNAs that will complement the study of human lncRNAs. In addition, detailed knowledge of structure-function relationships in lncRNAs is still lacking, which prohibits the de novo prediction of lncRNA domains and functions that we take for granted in protein-coding transcripts. New technologies to deconvolute RNA structure and function (Martin et al., 2012; Wan et al., 2012), probe RNA-chromatin interactions (Chu et al., 2011; Simon et al., 2011), and track RNA movement in real time (Paige et al., 2011) will be crucial for understanding lncRNAs and realizing their therapeutic potential.

ACKNOWLEDGMENTS

We thank members of the Chang lab for discussion and apologize to colleagues whose works are not discussed due to space limitation. We acknowledge support from NIH and California Institute for Regenerative Medicine (H.Y.C.). P.J.B. is the Kenneth G. and Elaine A. Langone Fellow of the Damon Runyon Cancer Research Foundation. H.Y.C. is an Early Career Scientist of the Howard Hughes Medical Institute. H.Y.C. is on the Scientific Advisory Board of RaNA Therapeutics, which works on long noncoding RNAs.

REFERENCES

Ashe, H.L., Monks, J., Wijgerde, M., Fraser, P., and Proudfoot, N.J. (1997). Intergenic transcription and transinduction of the human beta-globin locus. Genes Dev. 11, 2494–2509.
Audas, T.E., Jacob, M.D., and Lee, S. (2012). Immobilization of proteins in the nucleolus by ribosomal intergenic spacer noncoding RNA. Mol. Cell 45, 147–157.
Bánfai, B., Jia, H., Khatun, J., Wood, E., Risk, B., Gundling, W.E., Jr., Kundaje, A., Gunawardena, H.P., Yu, Y., Xie, L., et al. (2012). Long noncoding RNAs are rarely translated in two human cell lines. Genome Res. 22, 1646–1657.
Bertani, S., Sauer, S., Bolotin, E., and Sauer, F. (2011). The noncoding RNA Mistral activates Hoxa6 and Hoxa7 expression and stem cell differentiation by recruiting MLL1 to chromatin. Mol. Cell 43, 1040–1046.
Bierhoff, H., Schmitz, K., Maass, F., Ye, J., and Grummt, I. (2010). Noncoding transcripts in sense and antisense orientation regulate the epigenetic state of ribosomal RNA genes. Cold Spring Harb. Symp. Quant. Biol. 75, 357–364.
Bond, A.M., Vangompel, M.J., Sametsky, E.A., Clark, M.F., Savage, J.C., Disterhoft, J.F., and Kohtz, J.D. (2009). Balanced gene regulation by an embryonic brain ncRNA is critical for adult hippocampal GABA circuitry. Nat. Neurosci. 12, 1020–1027.
Brown, J.A., Valenstein, M.L., Yario, T.A., Tycowski, K.T., and Steitz, J.A. (2012). Formation of triple-helical structures by the 3′-end sequences of MALAT1 and MENβ noncoding RNAs. Proc. Natl. Acad. Sci. USA 109, 19202–19207.
Brunner, A.L., Beck, A.H., Edris, B., Sweeney, R.T., Zhu, S.X., Li, R., Montgomery, K., Varma, S., Gilks, T., Guo, X., et al. (2012). Transcriptional profiling of lncRNAs and novel transcribed regions across a diverse panel of archived human cancers. Genome Biol. 13, R75.
Cabianca, D.S., Casa, V., Bodega, B., Xynos, A., Ginelli, E., Tanaka, Y., and Gabellini, D. (2012). A long ncRNA links copy number variation to a polycomb/trithorax epigenetic switch in FSHD muscular dystrophy. Cell 149, 819–831.
Cabili, M.N., Trapnell, C., Goff, L., Koziol, M., Tazon-Vega, B., Regev, A., and Rinn, J.L. (2011). Integrative annotation of human large intergenic noncoding RNAs reveals global properties and specific subclasses. Genes Dev. 25, 1915–1927.
Carrieri, C., Cimatti, L., Biagioli, M., Beugnet, A., Zucchelli, S., Fedele, S., Pesce, E., Ferrer, I., Collavin, L., Santoro, C., et al. (2012). Long non-coding antisense RNA controls Uchl1 translation through an embedded SINEB2 repeat. Nature 491, 454–457.
Chen, L.L., and Carmichael, G.G. (2009). Altered nuclear retention of mRNAs containing inverted repeats in human embryonic stem cells: functional role of a nuclear noncoding RNA. Mol. Cell 35, 467–478.
Chu, C., Qu, K., Zhong, F.L., Artandi, S.E., and Chang, H.Y. (2011). Genomic maps of long noncoding RNA occupancy reveal principles of RNA-chromatin interactions. Mol. Cell 44, 667–678.
Clemson, C.M., Hutchinson, J.N., Sara, S.A., Ensminger, A.W., Fox, A.H., Chess, A., and Lawrence, J.B. (2009). An architectural role for a nuclear noncoding RNA: NEAT1 RNA is essential for the structure of paraspeckles. Mol. Cell 33, 717–726.
Cremer, T., and Cremer, M. (2010). Chromosome territories. Cold Spring Harb. Perspect. Biol. 2, a003889.

De Santa, F., Barozzi, I., Mietton, F., Ghisletti, S., Polletti, S., Tusi, B.K., Muller, H., Ragoussis, J., Wei, C.L., and Natoli, G. (2010). A large fraction of extragenic RNA pol II transcription sites overlap enhancers. PLoS Biol. 8, e1000384. Delebecque, C.J., Lindner, A.B., Silver, P.A., and Aldaye, F.A. (2011). Organization of intracellular reactions with rationally designed RNA assemblies. Science 333, 470474. Derrien, T., Johnson, R., Bussotti, G., Tanzer, A., Djebali, S., Tilgner, H., Guernec, G., Martin, D., Merkel, A., Knowles, D.G., et al. (2012). The GENCODE v7 catalog of human long noncoding RNAs: analysis of their gene structure, evolution, and expression. Genome Res. 22, 17751789. Ding, D.Q., Okamasa, K., Yamane, M., Tsutsumi, C., Haraguchi, T., Yamamoto, M., and Hiraoka, Y. (2012). Meiosis-specic noncoding RNA mediates robust pairing of homologous chromosomes in meiosis. Science 336, 732736. Dixon, J.R., Selvaraj, S., Yue, F., Kim, A., Li, Y., Shen, Y., Hu, M., Liu, J.S., and Ren, B. (2012). Topological domains in mammalian genomes identied by analysis of chromatin interactions. Nature 485, 376380. Djebali, S., Davis, C.A., Merkel, A., Dobin, A., Lassmann, T., Mortazavi, A., Tanzer, A., Lagarde, J., Lin, W., Schlesinger, F., et al. (2012). Landscape of transcription in human cells. Nature 489, 101108. Dundr, M., and Misteli, T. (2010). Biogenesis of nuclear bodies. Cold Spring Harb. Perspect. Biol. 2, a000711. mmerle, M., Gu nther, S., Caudron-Herger, M., Eissmann, M., Gutschner, T., Ha rnig, M., and Diederichs, S. Gro, M., Schirmacher, P., Rippe, K., Braun, T., Zo (2012). Loss of the abundant nuclear non-coding RNA MALAT1 is compatible with life and development. RNA Biol. 9, 10761087. sse, Fradet, Y., Saad, F., Aprikian, A., Dessureault, J., Elhilali, M., Trudel, C., Ma , L., and Chypre, C. (2004). uPM3, a new molecular urine test for the B., Piche detection of prostate cancer. Urology 64, 311315, discussion 315316. Gong, C., and Maquat, L.E. 
(2011). lncRNAs transactivate STAU1-mediated mRNA decay by duplexing with 30 UTRs via Alu elements. Nature 470, 284288. Grimal, P., and Woloch, M. (1983). Roman cities = Les villes romaines (Madison, Wis.: University of Wisconsin Press). Guetg, C., Scheifele, F., Rosenthal, F., Hottiger, M.O., and Santoro, R. (2012). Inheritance of silent rDNA chromatin is mediated by PARP1 via noncoding RNA. Mol. Cell 45, 790800. ` re, J., Fonalleras, E., Go mez, A., VillaGuil, S., Soler, M., Portela, A., Carre nueva, A., and Esteller, M. (2012). Intronic RNAs mediate EZH2 regulation of epigenetic targets. Nat. Struct. Mol. Biol. 19, 664670. Gupta, R.A., Shah, N., Wang, K.C., Kim, J., Horlings, H.M., Wong, D.J., Tsai, M.C., Hung, T., Argani, P., Rinn, J.L., et al. (2010). Long non-coding RNA HOTAIR reprograms chromatin state to promote cancer metastasis. Nature 464, 10711076. Gutschner, T., and Diederichs, S. (2012). The hallmarks of cancer: a long noncoding RNA point of view. RNA Biol. 9, 703719. Guttman, M., Amit, I., Garber, M., French, C., Lin, M.F., Feldser, D., Huarte, M., Zuk, O., Carey, B.W., Cassady, J.P., et al. (2009). Chromatin signature reveals over a thousand highly conserved large non-coding RNAs in mammals. Nature 458, 223227. Guttman, M., Donaghey, J., Carey, B.W., Garber, M., Grenier, J.K., Munson, G., Young, G., Lucas, A.B., Ach, R., Bruhn, L., et al. (2011). lincRNAs act in the circuitry controlling pluripotency and differentiation. Nature 477, 295300. Harigaya, Y., Tanaka, H., Yamanaka, S., Tanaka, K., Watanabe, Y., Tsutsumi, C., Chikashige, Y., Hiraoka, Y., Yamashita, A., and Yamamoto, M. (2006). Selective elimination of messenger RNA prevents an incidence of untimely meiosis. Nature 442, 4550. Hindorff, L.A., Sethupathy, P., Junkins, H.A., Ramos, E.M., Mehta, J.P., Collins, F.S., and Manolio, T.A. (2009). Potential etiologic and functional implications of genome-wide association loci for human diseases and traits. Proc. Natl. Acad. Sci. USA 106, 93629367.

Cell 152, March 14, 2013 2013 Elsevier Inc. 1305

Houseley, J., Rubbi, L., Grunstein, M., Tollervey, D., and Vogelauer, M. (2008). A ncRNA modulates histone modication and mRNA induction in the yeast GAL gene cluster. Mol. Cell 32, 685695. Huarte, M., Guttman, M., Feldser, D., Garber, M., Koziol, M.J., KenzelmannBroz, D., Khalil, A.M., Zuk, O., Amit, I., Rabani, M., et al. (2010). A large intergenic noncoding RNA induced by p53 mediates global gene repression in the p53 response. Cell 142, 409419. Hung, T., Wang, Y., Lin, M.F., Koegel, A.K., Kotake, Y., Grant, G.D., Horlings, H.M., Shah, N., Umbricht, C., Wang, P., et al. (2011). Extensive and coordinated transcription of noncoding RNAs within cell-cycle promoters. Nat. Genet. 43, 621629. Ietswaart, R., Wu, Z., and Dean, C. (2012). Flowering time control: another window to the connection between antisense RNA and chromatin. Trends Genet. 28, 445453. Ingolia, N.T., Lareau, L.F., and Weissman, J.S. (2011). Ribosome proling of mouse embryonic stem cells reveals the complexity and dynamics of mammalian proteomes. Cell 147, 789802. IHGSC (International Human Genome Sequencing Consortium). (2004). Finishing the euchromatic sequence of the human genome. Nature 431, 931945. Kagey, M.H., Newman, J.J., Bilodeau, S., Zhan, Y., Orlando, D.A., van Berkum, N.L., Ebmeier, C.C., Goossens, J., Rahl, P.B., Levine, S.S., et al. (2010). Mediator and cohesin connect gene expression and chromatin architecture. Nature 467, 430435. Khalil, A.M., Guttman, M., Huarte, M., Garber, M., Raj, A., Rivea Morales, D., Thomas, K., Presser, A., Bernstein, B.E., van Oudenaarden, A., et al. (2009). Many human large intergenic noncoding RNAs associate with chromatinmodifying complexes and affect gene expression. Proc. Natl. Acad. Sci. USA 106, 1166711672. Kim, T.K., Hemberg, M., Gray, J.M., Costa, A.M., Bear, D.M., Wu, J., Harmin, D.A., Laptewicz, M., Barbara-Haley, K., Kuersten, S., et al. (2010). Widespread transcription at neuronal activity-regulated enhancers. Nature 465, 182187. 
nster, S., Steinmetz, L.M., and Buratowski, S. Kim, T., Xu, Z., Clauder-Mu (2012). Set3 HDAC mediates effects of overlapping noncoding transcription on gene induction kinetics. Cell 150, 11581169. Kretz, M., Siprashvili, Z., Chu, C., Webster, D.E., Zehnder, A., Qu, K., Lee, C.S., Flockhart, R.J., Groff, A.F., Chow, J., et al. (2013). Control of somatic tissue differentiation by the long non-coding RNA TINCR. Nature 493, 231235. Published online December 2, 2012. http://dx.doi.org/10.1038/nature11661. Latos, P.A., Pauler, F.M., Koerner, M.V., S xenergin, H.B., Hudson, Q.J., Stocsits, R.R., Allhoff, W., Stricker, S.H., Klement, R.M., Warczok, K.E., et al. (2012). Airn transcriptional overlap, but not its lncRNA products, induces imprinted Igf2r silencing. Science 338, 14691472. Lee, J.T. (2009). Lessons from X-chromosome inactivation: long ncRNA as guides and tethers to the epigenome. Genes Dev. 23, 18311842. Lee, J.T., and Bartolomei, M.S. (2013). X-inactivation, imprinting, and long noncoding RNAs in health and disease. Cell 152, this issue, 13081323. Maass, P.G., Rump, A., Schulz, H., Stricker, S., Schulze, L., Platzer, K., Aydin, hring, S. (2012). A misplaced A., Tinschert, S., Goldring, M.B., Luft, F.C., and Ba lncRNA causes brachydactyly in humans. J. Clin. Invest. 122, 39904002. Mao, Y.S., Sunwoo, H., Zhang, B., and Spector, D.L. (2011). Direct visualization of the co-transcriptional assembly of a nuclear body by noncoding RNAs. Nat. Cell Biol. 13, 95101. Marques, A.C., and Ponting, C.P. (2009). Catalogues of mammalian long noncoding RNAs: modest conservation and incompleteness. Genome Biol. 10, R124. Martens, J.A., Laprade, L., and Winston, F. (2004). Intergenic transcription is required to repress the Saccharomyces cerevisiae SER3 gene. Nature 429, 571574. Martianov, I., Ramadass, A., Serra Barros, A., Chow, N., and Akoulitchev, A. (2007). Repression of the human dihydrofolate reductase gene by a noncoding interfering transcript. Nature 445, 666670.

Martin, L., Meier, M., Lyons, S.M., Sit, R.V., Marzluff, W.F., Quake, S.R., and Chang, H.Y. (2012). Systematic reconstruction of RNA functional motifs with high-throughput microuidics. Nat. Methods 9, 11921194. Mayer, C., Schmitz, K.M., Li, J., Grummt, I., and Santoro, R. (2006). Intergenic transcripts regulate the epigenetic state of rRNA genes. Mol. Cell 22, 351361. Misteli, T. (2007). Beyond the sequence: cellular organization of genome function. Cell 128, 787800. Mitton-Fry, R.M., DeGregorio, S.J., Wang, J., Steitz, T.A., and Steitz, J.A. (2010). Poly(A) tail recognition by a viral RNA element through assembly of a triple helix. Science 330, 12441247. Modarresi, F., Faghihi, M.A., Lopez-Toledano, M.A., Fatemi, R.P., Magistri, M., Brothers, S.P., van der Brug, M.P., and Wahlestedt, C. (2012). Inhibition of natural antisense transcripts in vivo results in gene-specic transcriptional upregulation. Nat. Biotechnol. 30, 453459. Nagano, T., Mitchell, J.A., Sanz, L.A., Pauler, F.M., Ferguson-Smith, A.C., Feil, R., and Fraser, P. (2008). The Air noncoding RNA epigenetically silences transcription by targeting G9a to chromatin. Science 322, 17171720. Nakagawa, S., Naganuma, T., Shioi, G., and Hirose, T. (2011). Paraspeckles are subpopulation-specic nuclear bodies that are not essential in mice. J. Cell Biol. 193, 3139. Nakagawa, S., Ip, J.Y., Shioi, G., Tripathi, V., Zong, X., Hirose, T., and Prasanth, K.V. (2012). Malat1 is not an essential component of nuclear speckles in mice. RNA 18, 14871499. Nam, J.W., and Bartel, D.P. (2012). Long noncoding RNAs in C. elegans. Genome Res. 22, 25292540. Noordermeer, D., Leleu, M., Splinter, E., Rougemont, J., De Laat, W., and Duboule, D. (2011). The dynamic architecture of Hox gene clusters. Science 334, 222225. Nora, E.P., Lajoie, B.R., Schulz, E.G., Giorgetti, L., Okamoto, I., Servant, N., Piolot, T., van Berkum, N.L., Meisig, J., Sedat, J., et al. (2012). 
Spatial partitioning of the regulatory landscape of the X-inactivation centre. Nature 485, 381385. rom, U.A., Derrien, T., Beringer, M., Gumireddy, K., Gardini, A., Bussotti, G., Lai, F., Zytnicki, M., Notredame, C., Huang, Q., et al. (2010). Long noncoding RNAs with enhancer-like function in human cells. Cell 143, 4658. Paige, J.S., Wu, K.Y., and Jaffrey, S.R. (2011). RNA mimics of green uorescent protein. Science 333, 642646. Pandey, R.R., Mondal, T., Mohammad, F., Enroth, S., Redrup, L., Komorowski, J., Nagano, T., Mancini-Dinardo, D., and Kanduri, C. (2008). Kcnq1ot1 antisense noncoding RNA mediates lineage-specic transcriptional silencing through chromatin-level regulation. Mol. Cell 32, 232246. Pauli, A., Valen, E., Lin, M.F., Garber, M., Vastenhouw, N.L., Levin, J.Z., Fan, L., Sandelin, A., Rinn, J.L., Regev, A., and Schier, A.F. (2012). Systematic identication of long noncoding RNAs expressed during zebrash embryogenesis. Genome Res. 22, 577591. Ponjavic, J., Ponting, C.P., and Lunter, G. (2007). Functionality or transcriptional noise? Evidence for selection within long noncoding RNAs. Genome Res. 17, 556565. Ravasi, T., Suzuki, H., Pang, K.C., Katayama, S., Furuno, M., Okunishi, R., Fukuda, S., Ru, K., Frith, M.C., Gongora, M.M., et al. (2006). Experimental validation of the regulated expression of large numbers of non-coding RNAs from the mouse genome. Genome Res. 16, 1119. Rinn, J.L., and Chang, H.Y. (2012). Genome regulation by long noncoding RNAs. Annu. Rev. Biochem. 81, 145166. Sasaki, Y.T., Ideue, T., Sano, M., Mituyama, T., and Hirose, T. (2009). MENepsilon/beta noncoding RNAs are essential for structural integrity of nuclear paraspeckles. Proc. Natl. Acad. Sci. USA 106, 25252530. Schmitt, S., Prestel, M., and Paro, R. (2005). Intergenic transcription through a polycomb group response element counteracts silencing. Genes Dev. 19, 697708.

1306 Cell 152, March 14, 2013 2013 Elsevier Inc.

Schmitz, K.M., Mayer, C., Postepska, A., and Grummt, I. (2010). Interaction of noncoding RNA with the rDNA promoter mediates recruitment of DNMT3b and silencing of rRNA genes. Genes Dev. 24, 22642269. Shappell, S.B. (2008). Clinical utility of prostate carcinoma molecular diagnostic tests. Rev. Urol. 10, 4469. Shevtsov, S.P., and Dundr, M. (2011). Nucleation of nuclear bodies by RNA. Nat. Cell Biol. 13, 167173. Simon, M.D., Wang, C.I., Kharchenko, P.V., West, J.A., Chapman, B.A., Alekseyenko, A.A., Borowsky, M.L., Kuroda, M.I., and Kingston, R.E. (2011). The genomic binding sites of a noncoding RNA. Proc. Natl. Acad. Sci. USA 108, 2049720502. t, C., Ko sem, S., Peichl, L., Cremer, T., Guck, Solovei, I., Kreysing, M., Lancto J., and Joffe, B. (2009). Nuclear architecture of rod photoreceptor cells adapts to vision in mammalian evolution. Cell 137, 356368. Splinter, E., de Wit, E., Nora, E.P., Klous, P., van de Werken, H.J., Zhu, Y., Kaaij, L.J., van Ijcken, W., Gribnau, J., Heard, E., and de Laat, W. (2011). The inactive X chromosome adopts a unique three-dimensional conformation that is dependent on Xist RNA. Genes Dev. 25, 13711383. Sunwoo, H., Dinger, M.E., Wilusz, J.E., Amaral, P.P., Mattick, J.S., and Spector, D.L. (2009). MEN epsilon/beta nuclear-retained non-coding RNAs are up-regulated upon muscle differentiation and are essential components of paraspeckles. Genome Res. 19, 347359. Szafranski, P., Dharmadhikari, A.V., Brosens, E., Gurha, P., Kolodziejska, K.E., Ou, Z., Dittwald, P., Majewski, T., Mohan, K.N., Chen, B., et al. (2013). Small non-coding differentially-methylated copy-number variants, including lncRNA genes, cause a lethal lung developmental disorder. Genome Res. 23, 2333. Published online October 3, 2012. http://dx.doi.org/10.1101/gr.141887.112. Tavares, L., Dimitrova, E., Oxley, D., Webster, J., Poot, R., Demmers, J., Bezstarosti, K., Taylor, S., Ura, H., Koide, H., et al. (2012). 
RYBP-PRC1 complexes mediate H2A ubiquitylation at polycomb target sites independently of PRC2 and H3K27me3. Cell 148, 664678. Tian, D., Sun, S., and Lee, J.T. (2010). The long noncoding RNA, Jpx, is a molecular switch for X chromosome inactivation. Cell 143, 390403. Tilgner, H., Knowles, D.G., Johnson, R., Davis, C.A., Chakrabortty, S., Djebali, , R. (2012). Deep S., Curado, J., Snyder, M., Gingeras, T.R., and Guigo sequencing of subcellular RNA fractions shows splicing to be predominantly co-transcriptional in the human genome but inefcient for lncRNAs. Genome Res. 22, 16161625. Tinzl, M., Marberger, M., Horvath, S., and Chypre, C. (2004). DD3PCA3 RNA analysis in urinea new perspective for detecting prostate cancer. Eur. Urol. 46, 182186, discussion 187. Tsai, M.C., Manor, O., Wan, Y., Mosammaparast, N., Wang, J.K., Lan, F., Shi, Y., Segal, E., and Chang, H.Y. (2010). Long noncoding RNA as modular scaffold of histone modication complexes. Science 329, 689693. Tycowski, K.T., Shu, M.D., Borah, S., Shi, M., and Steitz, J.A. (2012). Conservation of a triple-helix-forming RNA stability element in noncoding and genomic RNAs of diverse viruses. Cell Rep. 2, 2632. Ulitsky, I., Shkumatava, A., Jan, C.H., Sive, H., and Bartel, D.P. (2011). Conserved function of lincRNAs in vertebrate embryonic development despite rapid sequence evolution. Cell 147, 15371550. van Dijk, M., Thulluru, H.K., Mulders, J., Michel, O.J., Poutsma, A., Windhorst, S., Kleiverda, G., Sie, D., Lachmeijer, A.M., and Oudejans, C.B. (2012). HELLP babies link a novel lincRNA to the trophoblast cell cycle. J. Clin. Invest. 122, 40034011.

van Werven, F.J., Neuert, G., Hendrick, N., Lardenois, A., Buratowski, S., van Oudenaarden, A., Primig, M., and Amon, A. (2012). Transcription of two long noncoding RNAs mediates mating-type control of gametogenesis in budding yeast. Cell 150, 11701181. Verma-Gaur, J., Torkamani, A., Schaffer, L., Head, S.R., Schork, N.J., and Feeney, A.J. (2012). Noncoding transcription within the Igh distal V(H) region at PAIR elements affects the 3D structure of the Igh locus in pro-B cells. Proc. Natl. Acad. Sci. USA 109, 1700417009. Wan, Y., Qu, K., Ouyang, Z., Kertesz, M., Li, J., Tibshirani, R., Makino, D.L., Nutter, R.C., Segal, E., and Chang, H.Y. (2012). Genome-wide measurement of RNA folding energies. Mol. Cell 48, 169181. Wang, K.C., and Chang, H.Y. (2011). Molecular mechanisms of long noncoding RNAs. Mol. Cell 43, 904914. Wang, X., Arai, S., Song, X., Reichart, D., Du, K., Pascual, G., Tempst, P., Rosenfeld, M.G., Glass, C.K., and Kurokawa, R. (2008). Induced ncRNAs allosterically modify RNA-binding proteins in cis to inhibit transcription. Nature 454, 126130. Wang, K.C., Yang, Y.W., Liu, B., Sanyal, A., Corces-Zimmerman, R., Chen, Y., Lajoie, B.R., Protacio, A., Flynn, R.A., Gupta, R.A., et al. (2011). A long noncoding RNA maintains active chromatin to coordinate homeotic gene expression. Nature 472, 120124. Wheeler, T.M., Leger, A.J., Pandey, S.K., MacLeod, A.R., Nakamori, M., Cheng, S.H., Wentworth, B.M., Bennett, C.F., and Thornton, C.A. (2012). Targeting nuclear RNA for in vivo correction of myotonic dystrophy. Nature 488, 111115. Wilusz, J.E., Jnbaptiste, C.K., Lu, L.Y., Kuhn, C.D., Joshua-Tor, L., and Sharp, P.A. (2012). A triple helix stabilizes the 30 ends of long noncoding RNAs that lack poly(A) tails. Genes Dev. 26, 23922407. Wolpert, L. (2011). Positional information and patterning revisited. J. Theor. Biol. 269, 359365. Yamamoto, M. (2010). The selective elimination of messenger RNA underlies the mitosis-meiosis switch in ssion yeast. Proc. Jpn. 
Acad., Ser. B, Phys. Biol. Sci. 86, 788797. Yang, L., Lin, C., Liu, W., Zhang, J., Ohgi, K.A., Grinstein, J.D., Dorrestein, P.C., and Rosenfeld, M.G. (2011). ncRNA- and Pc2 methylation-dependent gene relocation between nuclear structures mediates gene activation programs. Cell 147, 773788. Yin, Q.F., Yang, L., Zhang, Y., Xiang, J.F., Wu, Y.W., Carmichael, G.G., and Chen, L.L. (2012). Long noncoding RNAs with snoRNA ends. Mol. Cell 48, 219230. Yoon, J.H., Abdelmohsen, K., Srikantan, S., Yang, X., Martindale, J.L., De, S., Huarte, M., Zhan, M., Becker, K.G., and Gorospe, M. (2012). LincRNA-p21 suppresses target mRNA translation. Mol. Cell 47, 648655. Zhang, B., Arun, G., Mao, Y.S., Lazar, Z., Hung, G., Bhattacharjee, G., Xiao, X., Booth, C.J., Wu, J., Zhang, C., and Spector, D.L. (2012). The lncRNA Malat1 is dispensable for mouse development but its transcription plays a cis-regulatory role in the adult. Cell Rep. 2, 111123. Zhao, J., Ohsumi, T.K., Kung, J.T., Ogawa, Y., Grau, D.J., Sarma, K., Song, J.J., Kingston, R.E., Borowsky, M., and Lee, J.T. (2010). Genome-wide identication of polycomb-associated RNAs by RIP-seq. Mol. Cell 40, 939953. Zofall, M., Yamanaka, S., Reyes-Turcu, F.E., Zhang, K., Rubin, C., and Grewal, S.I. (2012). RNA elimination machinery targeting meiotic mRNAs promotes facultative heterochromatin formation. Science 335, 96100.

Cell 152, March 14, 2013 2013 Elsevier Inc. 1307

Review
X-Inactivation, Imprinting, and Long Noncoding RNAs in Health and Disease
Jeannie T. Lee1,2,* and Marisa S. Bartolomei3,*
1Howard Hughes Medical Institute, Department of Molecular Biology, Massachusetts General Hospital, Boston, MA 02114, USA
2Department of Genetics, Harvard Medical School, Boston, MA 02114, USA
3Department of Cell and Developmental Biology, University of Pennsylvania Perelman School of Medicine, Philadelphia, PA 19104, USA
*Correspondence: lee@molbio.mgh.harvard.edu (J.T.L.), bartolom@mail.med.upenn.edu (M.S.B.)
http://dx.doi.org/10.1016/j.cell.2013.02.016

X chromosome inactivation and genomic imprinting are classic epigenetic processes that cause disease when not appropriately regulated in mammals. Whereas X chromosome inactivation evolved to solve the problem of gene dosage, the purpose of genomic imprinting remains controversial. Nevertheless, the two phenomena are united by allelic control of large gene clusters, such that only one copy of a gene is expressed in every cell. Allelic regulation poses significant challenges because it requires coordinated long-range control in cis and stable propagation over time. Long noncoding RNAs have emerged as a common theme, and their contributions to diseases of imprinting and the X chromosome have become apparent. Here, we review recent advances in basic biology and the connections to disease, and we preview potential therapeutic strategies for future development.
Introduction
Every organism faces the challenge of regulating gene dosage. In diploids, genes are generally assumed to be expressed from both alleles but, in mammals, several classes of genes are expressed from only one allele per cell. Two of the most prominent examples of allelic phenomena are X chromosome inactivation (XCI) and genomic imprinting. Because of XCI, only one copy of each X-linked gene is active in female cells (XX). Because male cells carry only one X chromosome (XY), they are inherently hemizygous for all X-linked genes. In genomic imprinting, genes within a discrete domain are coordinately regulated and expressed according to parent of origin.

Research over the past 50 years has identified many similarities between XCI and genomic imprinting. Apart from monoallelic expression, genes subject to the two processes tend to cluster, are influenced at long range by a master control region, and are associated with multiple long noncoding RNAs (lncRNAs). Some of the most fascinating stories to emerge in recent years have been related to lncRNAs as master regulators. Some of the first epigenetic lncRNAs in mammals were, in fact, identified from genomic imprinting and XCI studies. Such lncRNAs have been proposed to serve as recruiting tools for chromatin-modifying complexes that would in turn silence or activate genes residing within the allelically regulated clusters.

Together, XCI and imprinting affect expression of 5%–10% of genes in the mammalian genome. From a functional standpoint, mutations within these regions could be easily unmasked, as they are often unbuffered by contributions from the silenced wild-type copy and could thereby cause severe developmental defects. This explains why X-linked and imprinted diseases are among the most common congenital human disorders, accounting for easily recognizable childhood syndromes such as Rett, fragile X, Prader-Willi/Angelman, and Beckwith-Wiedemann syndromes, as well as conditions such as hemophilia, testicular feminization, and red-green color blindness. More recently, imprinting and X-linked anomalies have also been of interest for stem cell maturation and reprogramming, cancer, assisted reproductive technology (ART), and cognition. This article will review the state of the art in genomic imprinting and XCI, focusing on recent advances in studying mechanism, the emerging roles of lncRNAs, and their relevance for understanding and treating various human conditions.

X Chromosome Inactivation and Genomic Imprinting
Genomic Imprinting
Mammals require both maternal and paternal genomic contributions to develop into healthy, viable organisms (Solter, 1988). This is, in large part, due to the inheritance of autosomally imprinted genes, which are expressed only from a single allele in accordance with its parent of origin (Bartolomei, 2009). That is, imprinted genes are expressed either from the maternally or paternally inherited allele, so that, when summed across the whole genome, contributions from both parents are necessary for expression of the full complement of imprinted genes and for proper development.

The elegant nuclear transplantation experiments by Solter and Surani in the 1980s were the first to suggest that the mammalian genome harbored imprinted genes (McGrath and Solter, 1984; Surani et al., 1984). They showed that diploid androgenetic embryos derived from two male pronuclei or diploid gynogenetic embryos derived from two female pronuclei failed to develop and reasoned that this must be due to genes that are expressed exclusively from one of the parental genomes. Later genetic experiments extended these findings by demonstrating that the proposed imprinted

genes mapped to specific mouse chromosomes (Searle and Beechey, 1978; Cattanach, 1986).

The current number of imprinted genes in the mouse is approximately 150 (http://www.mousebook.org/catalog.php?catalog=imprinting), with a smaller number identified in humans, in part because many genes have not been tested in humans (Weksberg, 2010). The imprinted genes are typically located in clusters of 3–12 (or more) genes that are spread over 20–3,700 kb of DNA (Barlow, 2011), but, interestingly, genes within one cluster are not necessarily expressed from the same parental chromosome (Figure 1). Most imprinted clusters contain protein coding genes and noncoding RNAs (ncRNAs). The ncRNAs are of different varieties (microRNAs, snoRNAs, and lncRNAs), some of which are essential to the mechanism that imprints these genes in cis. Each well-studied cluster has a discrete imprinting control region (ICR) that exhibits parent-of-origin-specific epigenetic modifications (DNA methylation and posttranslational histone modifications) and governs the imprinting of the locus. Although the mechanism(s) that confer the allele-specific epigenetic modifications are poorly understood, DNA methylation has been shown to be imposed at a precise time in germ cells by a mechanism that is postulated to involve transcription (Chotalia et al., 2009; Henckel et al., 2012) and is maintained after fertilization despite extensive reprogramming of the genome (Bartolomei and Ferguson-Smith, 2011) (Figure 1). Moreover, germline deletion of the ICR results in the loss of imprinting of multiple genes in the cluster, thus demonstrating that the clustering of imprinted genes is required for their appropriate expression.

Many imprinted genes undergo tissue-specific imprinting. Of the approximately 150 imprinted genes identified in mouse, a few are imprinted exclusively in the placenta, an extraembryonic organ that plays a crucial role in regulating fetal growth by controlling the supply of nutrients (Frost and Moore, 2010). Imprinting is hypothesized to be a mechanism to balance growth, with many imprinted genes having a demonstrated role in growth control. Thus, the placenta is particularly significant in the physiology and study of imprinting. Interestingly, as described below, mechanisms that control imprinting in the placenta, a short-lived organ, may differ from those mechanisms that regulate imprinting in the much longer-lived somatic lineages. This holds true not only for autosomal but also X-linked imprinting.

Figure 1. Mechanisms of Imprinting
(A) The insulator model is exemplified by the H19/Igf2 domain. Here, the intergenic ICR is paternally methylated. On the unmethylated maternal allele, CTCF binding prevents enhancers from interacting with the Igf2 promoter. Instead, the enhancers activate H19 expression. On the paternal allele, methylation of the ICR spreads to the H19 promoter, silencing its expression, and prevents CTCF from binding the ICR, allowing the enhancers to activate Igf2 expression. (B–D) The ncRNA model is illustrated by the Kcnq1 (B), Igf2r (C), and Snrpn (D) domains. (B) For Kcnq1, the ICR contains the promoter of the Kcnq1ot1 lncRNA. On the paternal allele, the ICR is unmethylated, allowing Kcnq1ot1 expression. Kcnq1ot1 expression silences the paternal allele of the linked genes in cis. On the maternal allele, Kcnq1ot1 is not expressed due to methylation of the ICR, and the adjacent imprinted genes are expressed. (C) For the Igf2r domain, transcription of the Airn lncRNA is governed by a promoter within the ICR and is expressed from the unmethylated paternal allele. In somatic cells, transcription of Airn over the Igf2r promoter precludes Igf2r expression, in part by kicking RNA polymerase II (POL-II) off of the promoter. In extraembryonic lineages, Airn lncRNA is postulated to recruit enzymes that confer repressive histone modifications to silence genes in cis. (D) The Snrpn locus uses the ncRNA model. Ube3a is expressed from the maternal allele exclusively in brain (in other tissues, it is biallelically expressed). The lncRNA on the paternal allele occurs in multiple, variably processed forms, some of which are brain-specific variants that contain upstream promoters/exons and sequences overlapping with Ube3a. Expression of these lncRNAs occurs when the ICR is unmethylated, with the result that expression of Ube3a is repressed. On the maternal allele, transcription of the upstream (U) exons is proposed to direct the maternal methylation imprint at the ICR. Topoisomerase I inhibitors identified by a mouse screen activate Ube3a on the paternal allele. As a result, Snrpn and Ube3a-ATS were no longer expressed and the ICR exhibited increased methylation, relative to the wild-type paternal allele. All imprinted domains, which are not drawn to scale, are depicted for the mouse, although the human regions are largely conserved. T refers to the telomeric end of the cluster and C the proximal end of the chromosome.

X Chromosome Inactivation
In 1949, Murray Barr showed that the sex of cat cells could be deduced by a subnuclear structure now called the Barr body in honor of his seminal work (Barr and Bertram, 1949). Susumu

Ohno later demonstrated that the Barr body is a condensed X chromosome (Ohno et al., 1959), and Mary Lyon followed with the understanding that the condensed X is the result of whole-chromosome silencing (Lyon, 1961). We now know that XCI compensates for dosage differences between males and females by rendering all cells functionally monosomic for the X chromosome (reviewed in Payer and Lee, 2008; Starmer and Magnuson, 2009; Wutz, 2011). XCI is coordinated by an X-inactivation center (Xic), which controls most, if not all, of the steps of XCI, including X chromosome counting, random X chromosome choice, and the initiation of silencing along ~1,000 genes of the X (Brown et al., 1991b) (Figure 2). These steps are completed in the peri-implantation embryo within the 10–20 cell epiblast lineage (which gives rise to all somatic cells) (Puck et al., 1992).

Figure 2. The X-Inactivation and X-Reactivation Cycle during Mouse Development
Mammalian dosage compensation occurs within a continual cycle of XCI and X reactivation. The XCI cycle begins in the male germline. During the first meiotic prophase of spermatogenesis, the X and Y undergo MSCI. After meiosis, 85% of X-linked genes remain suppressed through spermiogenesis, forming postmeiotic sex chromatin (PMSC). This germline-inactivated X has been proposed to be passed onto the next generation in a partially silent state, accounting for the preferential XP inactivation of the early female mouse embryo. In the two-cell mouse embryo, repetitive elements on XP are already suppressed. XP-linked coding genes are initially active but become progressively inactivated during preimplantation development. The maternal germline also plays a crucial role in imprinted XCI by marking the future XM during the oocyte growth phase, ensuring that XM is protected in both XX and XY embryos. Beyond the blastocyst, these marks persist only in the placenta of the mouse. Whereas extraembryonic tissues, including the primitive endoderm (PE) and the trophectoderm (TE), maintain imprinted XP inactivation, the epiblast lineage undergoes XCR and initiates zygotically driven random XCI. XCR also occurs in primordial germ cells (PGCs) in preparation for equal segregation during meiosis. XP, paternal X; XM, maternal X; Xa, active X; MSCI, meiotic sex chromosome inactivation. Adapted from Payer and Lee, 2008.

Once established, the pattern of XCI is stably propagated in the soma, with the same X chromosome maintained as Xi in subsequent mitotic divisions. The mammalian female is therefore a mosaic (Figure 2). Whereas the choice of XCI in somatic cells of eutherian mammals occurs randomly, the choice in marsupial mammals is xed. In marsupials (Sharman, 1971), and also in the extraembryonic tissues of some eutherian mammals (Takagi and Sasaki, 1975), the paternal X (XP) is imprinted to undergo silencing, providing a rst example of mammalian imprinting. The phenomenon is conceptually similar to autosomal imprinting in that monoallelic expression is determined by parent-of-origin and has mechanistic underpinnings in the parental germline. Imprinted XCI in the placenta adds to the number of imprinted mammalian genes and further supports the idea that imprinting balances fetal growth by controlling the nutrient supply (Frost and Moore, 2010). Mammalian dosage compensation occurs within a continual cycle of XCI and reactivation (XCR) (Figure 2). Although random XCI is female-specic, X silencing also occurs in the male germline (Lifschytz and Lindsley, 1972). For some mammals, the male germline is where the XCI cycle begins. During the rst meiotic prophase of spermatogenesis, the X and Y undergo meiotic sex chromosome inactivation (MSCI) and form the XY body. The X and Y do not wholly reactivate after completion of meiosis; in mice, 85% of genes on the X chromosome remain transcriptionally suppressed in postmeiotic spermiogenesis (Namekawa et al., 2006). 
This postmeiotic sex chromatin (PMSC) is decorated by distinct heterochromatic signatures (Greaves et al., 2006; Namekawa et al., 2006; Turner et al., 2006) and is consistent with the hypothesis that the germline-inactivated X may be passed on to the next generation in an at least partially preinactivated state, accounting for the preferential XP inactivation that occurs in the early female embryo (Cooper, 1971; Lyon, 1999; Huynh and Lee, 2003). At zygotic gene activation in the two-cell mouse embryo, transcription of repetitive elements on XP is already suppressed, reflecting their suppression in the male germline (Namekawa et al., 2006; Namekawa et al., 2010) (Figure 2). Although X-linked coding genes on the XP are initially active, they are progressively inactivated during preimplantation development (Okamoto and Heard, 2006; Kalantry et al., 2009; Namekawa et al., 2010). The X-linked repetitive elements may facilitate formation of the silent compartment for inactivation of XP genes (Namekawa et al., 2010). Thus, imprinted XCI may be a process that begins in the male germline, continues into the zygote as repeat silencing, and progresses through the blastocyst stage with genic silencing. However, the maternal germline also plays a crucial role in imprinted XCI by marking the future XM to resist silencing (Takagi and Abe, 1990; Goto and Takagi, 2000). This occurs during the oocyte growth phase (Tada et al., 2000), ensuring that XM (passed on to both XX and XY embryos) is protected (Figure 2). Thus, it is likely that both XP and XM are parentally marked, with XP subject to imprinted XCI and XM protected from it. Beyond the blastocyst, these marks persist only in the placenta of the mouse (Figure 2). The blastocyst consists of the trophectodermal lineage, which gives rise to placental tissue,

and the inner cell mass, which gives rise to the epiblast lineage that develops into the embryo proper. During peri-implantation development, their epigenetic fates diverge with respect to XCI. Whereas extraembryonic tissues, including the primitive endoderm (PE) and the trophectoderm (TE), maintain imprinted XP inactivation, the epiblast lineage undergoes XCR and initiates a new round of inactivation, this time randomly and without a parent-of-origin bias (Mak et al., 2004).

Mechanisms

Cis-Acting Control Regions

Both XCI and genomic imprinting are regulated by cis-acting master control regions. For XCI, a single Xic has been mapped to a 100–500 kb region (Brown et al., 1991b; Lee et al., 1999b; Chureau et al., 2002) (Figure 3A). Genetic analyses based on knockouts, gain-of-function mutations, and transgenic overexpression have shown that the Xic is necessary and sufficient to regulate XCI. Deleting the noncoding locus Xist results in loss of silencing capability in cis (Penny et al., 1996; Marahrens et al., 1997), and placing the Xic at an ectopic location results in counting, choosing, and silencing of the autosome (Lee et al., 1996; Migeon et al., 1999; Wutz et al., 2002). The Xic therefore drives XCI without a requirement for additional X-specific elements, such as those that might be responsible for the spread of silencing. Similarly, genomic imprinting is regulated by cis-acting ICRs that influence allelic expression across long distances. Whereas the Xic controls ~150 Mb of a chromosome, ICRs control gene clusters of 0.1–3.0 Mb. Within the clusters, the direction of transcription and distribution of maternally versus paternally imprinted genes can vary (Figure 1). However, nearly all imprinted clusters studied to date contain at least one each of maternally expressed and paternally expressed genes. ICRs are usually just a few kilobases in length and carry allele-specific DNA methylation and chromatin modifications, but their location relative to the genes can also vary.
Most ICRs are methylated in the female germline during oocyte growth (Bartolomei and Ferguson-Smith, 2011). A few, including the ICRs for the H19/Igf2 and Gtl2/Dlk1 clusters, are methylated on the paternal allele prior to birth in the male germline (Bartolomei and Ferguson-Smith, 2011) (Figure 1A). Maternally methylated ICRs typically harbor the promoter for lncRNAs, examples of which include the ICRs for Kcnq1ot1, Snrpn, and Airn (Figures 1B–1D). In contrast, paternally methylated ICRs are intergenic (Barlow, 2011). In the case of the H19/Igf2 locus, the ICR serves as a methylation-sensitive insulator (Figure 1A). In all cases, ICR deletions result in the loss of imprinting of multiple genes within the cluster.

Long Noncoding RNAs

The X-Inactivation Center. Noted early in the study of both phenomena, a prominent feature of the Xic and ICRs is their association with lncRNAs, the prototypes of which were discovered within these regions (Brannan et al., 1990; Borsani et al., 1991; Brown et al., 1991a; Lee et al., 1999a; Koerner et al., 2009). With respect to function and mechanism, the Xic harbors some of the best-characterized lncRNAs. The X-inactive-specific transcript (XIST/Xist) (Brockdorff et al., 1992; Brown et al., 1992) produces a 17–20 kb RNA that decorates
Cell 152, March 14, 2013 2013 Elsevier Inc. 1311

Figure 3. The Control Center and Steps of Initiation during X Chromosome Inactivation
(A) The X-inactivation center consists of multiple genes encoding lncRNA, including Xist, RepA, Tsix, Xite, Jpx/Enox, Ftx, and Tsx. Regions involved in various steps of XCI (counting, choice, pairing, and silencing) are delineated. (B) Converging pathways of RNA-protein interactions during XCI. Yellow ovals represent chromatin complexes (PRC2, YY1, DNMT3a, RNF12, and REX1) that interact with indicated lncRNA or associated loci. Positive regulation shown by pointed arrows; negative regulation shown by blunted arrows. Various steps of XCI are shown in blue lettering. (C) Initiation of XCI by lncRNA. (1) Biallelic Tsix prevents loading of RepA-PRC2 and initiation of XCI; (2) Two events enable Xist expression during cell differentiation: induction of the Jpx activator and monoallelic loss of Tsix on Xi, which allows RepA-PRC2 to load; (3) Xist cotranscriptionally recruits PRC2. YY1 binds the Xi nucleation center but is blocked from binding Xa; (4) Xist-PRC2 complex cotranscriptionally loads onto the YY1-based nucleation center; (5) From the nucleation center, Xist-PRC2 spreads in a cis-limited fashion to ~150 strong Polycomb stations, which in turn spread H3K27me3 via 3,000–4,000 moderate Polycomb sites.

the X chromosome during the initiation of XCI (Clemson et al., 1996). Xist is expressed only from the Xi and is required for whole-chromosome silencing (Penny et al., 1996; Marahrens et al., 1997). Xist RNA directs chromatin and transcriptional change by binding Polycomb repressive complex 2 (PRC2), the epigenetic complex responsible for trimethylation of histone H3 at lysine 27 (H3K27me3), and targeting PRC2 to the Xi (Zhao et al., 2008) (Figure 3B). This discovery suggests RNA as a crucial guiding factor in Polycomb targeting. However, PRC2 targeting and binding to the chromatin are biologically separable, as indeed chromatin loading is precluded when Xist's antisense partner, Tsix (Lee et al., 1999a), is expressed in cis (Zhao et al., 2008). Only when Tsix expression is downregulated during development does the Xist-PRC2 complex load onto the Xi nucleation center within Xist's exon 1 (Jeon and Lee, 2011). The nucleation center consists of three binding sites for the transcription factor YY1, a protein bound only to the Xi allele. By cotranscriptionally tethering Xist RNA to the Xic, YY1 bridges PRC2, Xist RNA, and Xi chromatin (Figure 3B). From the nucleation center, PRC2 spreads initially to ~150 strong binding sites along the future Xi, concentrating predominantly within bivalent domains coinciding with CpG islands (Pinter et al., 2012) (Figure 3C). As XCI proceeds, the coating of the future Xi by Xist RNA correlates with recruitment of 3,000–4,000 moderate Polycomb sites, most of which are intergenic, nonbivalent, and lack CpG islands. The moderate sites cluster around strong sites and facilitate the spreading of H3K27me3 in a graded concentration relative to strong sites. Interestingly, Polycomb stations are not enriched for the LINE1 repeats previously hypothesized to influence spreading (Lyon, 2003; Chow et al., 2010). Thus, the spreading of XCI is also controlled by Xist RNA and is governed by a hierarchy of defined Polycomb stations along the Xi. Xist is itself controlled by other lncRNAs, some acting negatively (e.g., Tsix), others positively (e.g., Jpx). Tsix RNA, the antisense partner of Xist RNA (Lee et al., 1999a), represses Xist in several ways. First, Tsix coordinates X chromosome pairing to generate the epigenetic asymmetry required for selection of future Xa and Xi (Bacher et al., 2006; Xu et al., 2006; Xu et al., 2007). Second, Tsix also recruits DNA methyltransferase (Dnmt3a) to silence Xist (Sado et al., 2005; Sun et al., 2006). Third, it blocks recruitment of PRC2 to Xist

(and RepA, see below), potentially by binding PRC2 and titrating it from Xist/RepA RNAs (Zhao et al., 2008). Tsix also duplexes with Xist/RepA RNA (Ogawa et al., 2008) and possibly serves as a decoy for PRC2 recruitment (by titrating Xist-RepA RNA or PRC2). In these ways, Tsix determines allelic choice by repressing Xist transcription on one allele (Figure 3B). Xist is regulated positively by Jpx RNA (Tian et al., 2010) (Figure 3C). Deleting Jpx abolishes Xist activation, indicating that Jpx is a positive regulator. Because knocking down the RNA recapitulates the deletion, Jpx must function as an RNA and not merely through its act of being transcribed. Moreover, because Jpx expression from an autosomal transgene can rescue the X-linked deletion, Jpx RNA is trans acting, unlike other elements of the Xic. The 1.6 kb RepA RNA (intragenic to Xist) has also been implicated as a potential activator of Xist expression, as its expression appears to be necessary for Xist upregulation (Zhao et al., 2008) and deleting the Repeat A motif (Hoki et al., 2009) abolishes Xist induction. The linked noncoding Ftx locus has also been suggested to regulate Xist, as deleting Ftx in male cells has mild effects on the chromatin profile of Xist (Chureau et al., 2011), but its effects in female cells are currently unknown. These Xist regulators work in parallel with the E3 ubiquitin ligase RNF12, encoded by an X-linked gene near the Xic (Figures 3A and 3B): its overexpression causes partial derepression of Xist (Jonkers et al., 2009), and knockouts of Rnf12 block imprinted XCI and delay random XCI (Shin et al., 2010; Barakat et al., 2011). The pluripotency factor REX1 has been identified as a target of RNF12 (Gontan et al., 2012). It is thought that elimination of REX1 binding to the Xist promoter facilitates activation of Xist.
These studies collectively point to central functions for lncRNA-protein interactions, with the lncRNAs targeting epigenetic complexes, serving as antisense inhibitors, and activating sense transcription.

Imprinting Clusters. Every imprinted cluster harbors lncRNAs (Figure 1), many of which originate within or near ICRs. These lncRNAs are themselves imprinted. The most common mechanism used for imprinting relies on expression of a lncRNA in cis and exploits much of what has been identified for silencing of the X chromosome during XCI (Figure 1). There are currently six well-characterized clusters of imprinted genes (along with at least nine additional less well-studied clusters): Igf2r/Airn, Kcnq1, Snrpn/Ube3a, Gnas, Igf2/H19, and Dlk1/Gtl2 (Barlow, 2011). All of these clusters contain lncRNAs. Three imprinted lncRNAs are long mature RNAs: Airn is 108 kb (Lyle et al., 2000), Kcnq1ot1 is approximately 100 kb (Pauler et al., 2012), and Ube3a-ATS may be in excess of 1,000 kb (Pauler et al., 2012). The Gtl2 lncRNA contains multiple alternatively spliced transcripts; however, downstream intergenic transcription has also been noted, suggesting longer RNAs are likely (Tierling et al., 2006). Nespas lncRNA exceeds 27 kb (Robson et al., 2012). Experiments that directly test the role of the lncRNA itself have now been performed for a number of loci (Airn, Nespas, Kcnq1ot1, and Ube3a-ATS). Thus far, all have been analyzed by genetic manipulation of the endogenous locus to truncate the lncRNA by insertion of a polyadenylation signal. The 108 kb Airn lncRNA has been examined in the most detail. Initially, Barlow and colleagues reported that truncation of Airn to 3 kb

in a mouse model suggested that the lncRNA itself is necessary to silence all 3 mRNA genes in the Igf2r cluster, indicating a clear regulatory role for this lncRNA (Sleutels and Barlow, 2002) (Figure 1A). Similarly, truncation of the 100 kb Kcnq1ot1 lncRNA to 1.5 kb showed that this lncRNA was directly needed to silence all 10 mRNA genes in the larger Kcnq1 cluster (Mancini-Dinardo et al., 2006) (Figure 1B), and truncation of the 27 kb Nespas lncRNA showed it was necessary to silence the overlapped Nesp gene in the Gnas imprinted cluster (Williamson et al., 2011). Lastly, truncated Ube3a-ATS in an embryonic stem (ES) cell model resulted in activation of paternal Ube3a (Figure 1D), consistent with the role for the Ube3a-ATS lncRNA in repressing paternal Ube3a in neurons (Meng et al., 2012) (Figure 1). At this point, it is not clear how the lncRNAs silence imprinted genes in cis. One possibility is that they overlap adjacent imprinted genes and the sense-antisense overlap causes a form of transcriptional interference of a promoter or an enhancer, which in turn affects transcription from the mRNA promoter (Pauler et al., 2012). In this case, the first event could be silencing of the overlapped promoter or enhancer followed by accumulation of repressive chromatin that can spread and induce transcriptional gene silencing throughout the cluster. Evidence for this model was recently obtained by Latos and colleagues by generating a series of recombinant endogenous chromosomes at the Igf2r/Airn locus in ES cells (Latos et al., 2012) (Figure 1C). Analogous to XCI, the onset of allele-specific expression at this locus in the embryo can be recapitulated by ES cell differentiation, where Igf2r is initially biallelically expressed but the initiation of Airn expression results in Igf2r imprinting (Latos et al., 2009).
To test whether Airn transcription or the lncRNA itself was required for Igf2r silencing, Airn was shortened to different lengths, and it was shown that silencing required only that Airn transcription overlap the Igf2r promoter, which interferes with RNA polymerase II recruitment (Latos et al., 2012). This model suggests that Airn acts predominantly through its transcription, rather than as a lncRNA like Xist. It is, however, also possible that imprinted lncRNAs act by coating the local chromosomal region and directly recruiting repressive chromatin proteins to the imprinted cluster, in a manner similar to Xist lncRNA. Many imprinted lncRNAs, such as Gtl2 and Nespas, appear to form a complex with Polycomb proteins (Pandey et al., 2008; Zhao et al., 2010). Evidence for a function of the lncRNA in recruitment of histone posttranslational modification machinery comes from experiments in placental tissues. RNA fluorescence in situ hybridization experiments showed that Airn and Kcnq1ot1 form RNA clouds at their site of transcription and are associated with a repressive histone compartment and Polycomb proteins (Nagano et al., 2008; Pandey et al., 2008; Terranova et al., 2008; Redrup et al., 2009). This nuclear compartment is also devoid of RNA polymerase II and exists in a three-dimensionally contracted state. Other studies on the Airn lncRNA go further in suggesting that the lncRNAs actively recruit repressive histone modifications (Nagano et al., 2008) but only in the placenta. In this latter case, Airn was shown to actively recruit the histone H3 lysine 9 methyltransferase, G9a, and paternal-specific silencing of the Slc22a3 gene, but not the Igf2r gene, was dependent on

Figure 4. LncRNAs Tether Epigenetic Complexes to Chromatin, Enabling Allelic and Locus-Specific Regulation
(A–C) LncRNA transcribed by RNA polymerase II (POL-II) (A) cotranscriptionally binds to an epigenetic complex (such as PRC2) (B), which loads onto chromatin through DNA-bound factors such as YY1 (for Xist RNA) (C). (D) Epigenetic modifications silence the linked gene. Rapid lncRNA turnover prevents diffusion and action at ectopic loci. Adapted from Lee, 2012.

G9a, in a mechanism that contrasts with the promoter-transcription model hypothesized on the basis of transcript truncation experiments in somatic lineages (Latos et al., 2012). These experiments indicate that lncRNA-mediated silencing of imprinted genes may depend on different downstream mechanisms. A new class of lncRNAs, the sno-lncRNAs, was recently discovered; they arise from the imprinted Prader-Willi syndrome (PWS) critical region of human chromosome 15 (Yin et al., 2012). Intriguingly, these lncRNAs, which have a snoRNA sequence at each end as well as intervening sequence, accumulate near their sites of synthesis, strongly associate with Fox family splicing regulators, and alter splicing. The investigators hypothesize that the sno-lncRNAs in the PWS locus function as molecular sinks for Fox2, acting locally to alter splicing patterns in specific subnuclear neighborhoods. Thus, the mechanisms by which lncRNAs operate at imprinted loci are diverse.

Why Are lncRNAs Central to Imprinting and XCI? It has been argued that lncRNAs make ideal factors for allelic regulation (Lee, 2012). Indeed, the tethering capabilities of lncRNAs and their potential for fast turnover render them excellent allelic markers. These transcripts are tethered to the site of synthesis through the RNA polymerase II transcription complex and can therefore function as allele-specific tags (Figure 4). As shown by Xist and RepA RNA, such long transcripts can cotranscriptionally capture chromatin complexes while tethered to the site of transcription (Zhao et al., 2008). Tethering could be aided by bridge proteins, such as YY1 in the case of Xist RNA (Jeon and Lee, 2011). Rapid RNA turnover after transcription would prevent diffusion to ectopic sites.
At the Xic, the decoying effect of Tsix for Polycomb proteins would be limited to the Xic by Tsix's very short half-life (30–60 min, the time required to synthesize the 40 kb RNA; Sun et al., 2006) so that effective concentrations would only be reached at the site of synthesis. Whereas lncRNAs can mark alleles, proteins cannot retain such allelic memory because their transcriptional origin is lost when mRNA is shuttled to the cytoplasm. Another property of lncRNAs is their ability to specify a unique address (Lee, 2012). Although transcription factors can also recruit epigenetic complexes, lncRNAs offer the possibility of targeting to a single location. Transcription factors typically target complexes to multiple genomic locations because they recognize short DNA motifs that occur hundreds to thousands of times in the genome. In contrast, lncRNAs such as Tsix and RepA/Xist occur only once in the genome. Because of this

uniqueness, lncRNAs can deliver epigenetic complexes to a single address, offering a regulatory specificity not possible with proteins or small RNAs. These properties may explain why the protein-coding region syntenic to the present-day Xic was rapidly transformed into a noncoding landscape 150 million years ago when random XCI first appeared in eutherian mammals (Duret et al., 2006). Prior to this time, Xist was a ubiquitin ligase, Lnx3, and Jpx was a peptidase, UspL1. It is likely that lncRNAs evolved within imprinted domains and other locations in the mammalian genome for similar reasons. For a discussion of genome-wide lncRNAs with epigenetic functions, we refer readers to the accompanying Review by Batista and Chang on page 1298 of this issue (Batista and Chang, 2013).

Insulators

Despite the common occurrence of lncRNAs at imprinted loci, insulators may play an equally important role in imprinted regions. The insulator model, which has been described at the Igf2/H19 locus (Figure 1A), is an evolutionarily older mechanism, components of which are conserved in marsupials (Smits et al., 2008). Key to this mechanism are CTCF-binding sites in the ICR, which exhibit insulator or enhancer-blocking properties (Bell and Felsenfeld, 2000; Hark et al., 2000). On the maternal allele, CTCF binds to the ICR and blocks the access of Igf2 to enhancers shared with H19, which are located downstream, thereby allowing H19 exclusive enhancer access. On the paternal allele, the ICR acquires DNA methylation in the male germline, which prevents CTCF binding and allows Igf2 to interact with the enhancers, resulting in paternal-specific expression (Figure 1A). The presence of DNA methylation on the paternal ICR leads to secondary methylation of the H19 promoter and paternal-specific H19 silencing (Thorvaldsen et al., 1998).
The involvement of CTCF in the insulator model has prompted the identification of CTCF-binding sites at other imprinted genes such as Rasgrf1, Grb10, and Kcnq1ot1, indicating that the insulator model may operate in other imprinted clusters. CTCF sites have also been identified within the Xic in regions important for imprinted XCI (Chao et al., 2002); however, it is currently unknown if CTCF is central to imprinting the X. Insulator-based and lncRNA-based models are not mutually exclusive.

The Enigma of Imprinted XCI

Imprinted XCI in the Mouse

A further consideration of imprinted XCI is worthwhile for its mechanistic differences and implications for human development. The mechanism of X-imprinting not only differs from

Figure 5. Imprinted XCI in the Mouse


(A) Pictorial representation of genic localization into the preformed silent compartment during imprinted XCI. XP repeats form a silent compartment next to the nucleolus by the two-cell stage and, although Xist RNA localizes within it, formation of this silent compartment does not require Xist. The repeats could potentially contribute to imprinted XCI by setting up a silencing compartment next to the nucleolus. The silent compartment is present by the two-cell stage and enlarges as genic loci are translocated into it and become silenced. Genic silencing depends on Xist. XP silencing is completed by the blastocyst. (B) Pictorial representation of XP and XM in the early mouse embryo. Repeat elements of XP create the silent perinucleolar compartment, whereas XM and active genic loci of XP reside in repeat-expressing regions. (C) Hypothesis: developmental history of the X chromosome from gamete to embryo. Hypothesized events in imprinted XCI of the mouse are shown. In the male germline, during the first meiotic prophase, the X and Y are inactivated by MSCI and remain suppressed through spermiogenesis as PMSC. This germline-inactivated X may be passed on to the next generation with its repeats preinactivated. In the two-cell mouse embryo, repetitive elements on XP are already suppressed in an Xist-independent manner. XP genic silencing occurs progressively during preimplantation development, strictly depends on Xist, and is completed in the blastocyst stage. Thus, imprinted XCI in the mouse embryo is a two-step process, with repeat silencing (Xist-independent) occurring prior to genic silencing (Xist-dependent). Repeat silencing could account partly for the transgenerational information (the imprint) involved in XP silencing. The maternal germline also plays a crucial role in imprinted XCI by marking the future XM. Adapted from Namekawa et al. (2010).

random XCI but also differs between the imprinted marsupial and eutherian forms. In mouse imprinted XCI, XP-repeat silencing precedes genic inactivation (Figure 5A) (Namekawa et al., 2010). The repeats form a silent compartment next to the nucleolus by the two-cell stage and, although Xist RNA localizes within it, formation of this silent compartment does not require Xist. Repeats could potentially contribute to imprinted XCI by setting up a silencing compartment next to the nucleolus (Figure 5B). If their silencing were indeed carried over from the male germline, repeats could account partly for the transgenerational information (the imprint) for XP silencing.

XP genic silencing follows repeat silencing (Namekawa et al., 2010) and occurs predominantly in the morula-blastocyst stages (Okamoto and Heard, 2006; Namekawa et al., 2010). Although one study suggests an Xist-independent process (Kalantry et al., 2009), the general consensus is that genic silencing depends on Xist (Marahrens et al., 1997; Namekawa et al., 2010). Xist must be marked by a second (presently unknown) imprint that would promote imprinted genic XCI (Figure 5C). In the mouse, Xist and Tsix are opposing regulatory factors for imprinted genic silencing, as they are for random XCI. Deleting Xist from XP precludes placental XCI (Marahrens et al., 1997),

whereas deleting Tsix from XM compromises maternal-specific protection from imprinted silencing in the placenta (Lee, 2000; Sado et al., 2001). Thus, the Xic plays at least a partial role in imprinted XCI in eutherian mammals.

Imprinted XCI in Marsupials

The eutherian Xic is not recognizable in the marsupial (Duret et al., 2006). The idea of an Xist-independent mechanism based on repeat silencing raises the possibility of a similar mechanism in marsupials. Notably, the opossum male germline demonstrates postmeiotic silencing of X-linked repeat elements (Namekawa et al., 2007), but whether silencing is carried over into the embryo is unknown. The recent identification of RSX indicates that a lncRNA like XIST may be present (Grant et al., 2012). The 27 kb RSX transcript also coats the marsupial Xi and is specifically expressed in female cells. Introduction of RSX transgenes into mouse ES cells results in partial silencing of three autosomal genes near the site of integration. These findings suggest that RSX may be the XIST equivalent in opossum, though an RSX knockout has not been performed and the two lncRNAs do not possess obvious homology. As in the mouse, an XIC mechanism may occur alongside a repeat-silencing process to implement imprinted XCI.

Imprinted XCI in Humans?

The question of whether imprinted XCI occurs in the human placenta has not been resolved, but implications for human development are evident. In several studies, examination of single X-linked genes from a small number of placentae suggested preferential maternal expression (e.g., Harrison and Warburton, 1986). Using transdifferentiation of a female human ES line into trophoblast cells, another study found that FMR1 was expressed only from one X, consistent with imprinting (Dhara and Benvenisty, 2004).
However, other studies have detected expression from both XM and XP (Moreira de Mello et al., 2010; Okamoto et al., 2011; Peñaherrera et al., 2012); and, in a nonhuman primate model, XIST was detected from either XM or XP of the trophectoderm (Tachibana et al., 2012). The fact that the X chromosome contributing to Turner (XO) and Klinefelter (XXY) syndrome could be of either XM or XP origin (Skuse et al., 1997; Skuse, 2000, 2005) further argues against imprinting. Although the preponderance of evidence may be against imprinted XCI in human placentae, there is the intriguing possibility of X-imprinting in the brain as a basis for male-female differences in behavior and prevalence of autism (Skuse, 2000) (more below). The question of imprinting therefore bears significance for human development and disease, particularly where X-linked mutations may contribute to early fetal loss and congenital or cognitive defects.

Human Diseases and Conditions

Congenital Diseases of Imprinting

Because of parental-origin effects, human disease syndromes can result from genetic or epigenetic abnormalities on only a single parental allele. In fact, most well-defined imprinted gene clusters are associated with human diseases (Thorvaldsen and Bartolomei, 2007). Interestingly, aberrant expression of ICR-associated lncRNAs may be implicated in various imprinting disorders. Two of the best-studied imprinting syndromes, PWS and Angelman syndrome (AS), map to human chromosome 15 (Buiting, 2010). PWS involves loss of function of a number of genes on 15q11-13, including SNORD116. People with PWS are obese and have reduced muscle tone and mental ability. AS is a complex disorder of the nervous system that arises from loss of function of the UBE3A gene (Figure 1D). AS symptoms include delayed development, intellectual disability, and severe speech impairment. Most PWS and AS cases involve large deletions containing the imprinted genes from the chromosome on which they are expressed. In PWS, there is biallelic repression of the ICR-associated lncRNA; in AS, the lncRNA is biallelically expressed. A smaller number of cases arise from either deletion or aberrant allelic DNA methylation of the ICR, leading to expression changes. With the recent identification of the new class of lncRNAs, the sno-lncRNAs, it is likely that absence of the sno-lncRNA in the PWS critical region impairs brain-specific splicing, possibly due to mislocalization of Fox splicing factors. Beckwith-Wiedemann syndrome (BWS), an overgrowth disorder, and Silver-Russell syndrome (SRS), an undergrowth and asymmetry disorder, are two other well-studied imprinting disorders that map to human chromosome 11p15.5, where IGF2 and H19 reside (Figure 1A). Unlike PWS and AS, the majority of individuals with BWS or SRS have epigenetic errors. For example, over half of BWS cases exhibit loss of methylation at the KCNQ1 ICR, which results in biallelic expression of the KCNQ1OT1 lncRNA (Weksberg et al., 2005) (Figure 1B). Inappropriate expression of the lncRNA may lead to aberrant repression of associated disease genes in cis; in this case, CDKN1C is silenced. Additionally, some BWS patients exhibit overexpression of IGF2. Most of these cases have small deletions in the ICR on the maternal allele, which disrupt the CTCF-dependent insulator, leading to biallelic IGF2 and loss of H19 expression (Riccio et al., 2009).
Curiously, the remaining ICR sequences in these individuals are hypermethylated. Many individuals with SRS have an opposite epigenetic phenotype where the ICR is unmethylated, resulting in biallelic H19 expression and loss of IGF2 expression. In many of these cases, it is unclear what event leads to DNA hypomethylation, but in some cases, multiple imprinted loci exhibit loss of ICR methylation (Azzi et al., 2010). Significantly, some examples of multilocus loss of imprinting involve mutations in ZFP57, a zinc finger protein involved in the postfertilization maintenance of genomic imprints, which was first reported in individuals presenting with transient neonatal diabetes (Mackay et al., 2008). It is possible that yet-to-be-identified proteins are mutated in other cases involving loss of methylation. Alternatively, early environmental insults can affect DNA methylation patterns (see "Imprinting and Assisted Reproductive Technology" as an example).

X-Linked Influences on Disease, Cognition, and Behavior

The X chromosome is home to nearly 1,000 genes, many of which result in discernible human phenotypes when mutated. X-linked diseases result from single-gene mutations, which can be classified as dominant or recessive, with the former manifest in both XX and XY individuals and the latter manifest primarily in XY individuals because they lack a wild-type allele. X-linked mutations can cause serious disease, such as hemophilia A (FVIII), Duchenne muscular dystrophy (DMD), Rett syndrome

(MECP2), and fragile X syndrome (FMR1), or less serious conditions, such as red-green color blindness and male-pattern baldness. Because of differential inheritance of sex chromosomes and the hemizygous state of the X chromosome in the male population, more diseases have been described for the X chromosome than for any other (Puck and Willard, 1998). As X-linked genes have existed in the hemizygous state for much of the history of sex chromosomes, the X chromosome has been engaged in selection of sexually dimorphic traits for more than 300 million years since the X and Y began to diverge (Arnold et al., 2004; Skuse, 2005). Genes for sexual dimorphism, reproduction, and cognition are enriched on the X chromosome, with their genetic patency making them easy substrates for evolutionary selection. In mice, deleting the Xic-encoded lncRNA, Tsx, has been shown to reduce fear and enhance hippocampal short-term memory in male mice (Anguera et al., 2011). The fact that many X-linked genes are expressed in the brain, some in a sex-specific manner, may explain why mental retardation and autism are up to ten times more common in males, though the underlying mutations are not known for many such disorders (Skuse et al., 1997). Genetic patency of X-linked haplotypes has been hypothesized to increase the likelihood of manifesting extreme behavioral and cognitive phenotypes in males, and the likelihood would also be increased in females when the XCI pattern is skewed to favor XM expression. XCI profiles and mosaicism vary extensively between human females, perhaps accounting for greater phenotypic variation among females (Carrel and Willard, 2005). Genes that variably escape XCI also contribute to this effect (Berletch et al., 2011). In the mouse, X-linked modifiers such as the Xce can skew XCI ratios (Cattanach and Isaacson, 1967; Percec et al., 2002; Thorvaldsen et al., 2012), providing a mechanism by which nonrandom XCI patterns could be generated.
Nonrandom XCI is also not uncommon in human females (Puck and Willard, 1998). In the area of cognitive and behavioral development, the study of X chromosome monosomy (XO, Turner syndrome) has played a major role in elucidating X-linked contributions. Girls with Turner syndrome usually have normal verbal intelligence but are less developed in spatial and mathematical skills. By comparing girls with Turner syndrome who inherited their X chromosome from the mother (XMO) versus the father (XPO), one study concluded that the XP was associated with enhanced social cognitive function (Skuse et al., 1997). Despite their genotypic similarity, the epigenetically different XPO and XMO girls demonstrated measurable phenotypic differences in social adjustment. The fact that the XP chromosome is normally inherited only by daughters has led some to suggest that it accounts for better social skills in girls on average. XPO and XMO girls also exhibit differences in visual memory and brain structure (Bishop et al., 2000; Kesler et al., 2004). Candidate genes include USP9X, MAOA, and MAOB (monoamine oxidases) on the short arm of the human X chromosome (Good et al., 2003; Oreland et al., 2004). Genes on the X chromosome may be imprinted in a tissue-specific manner, particularly in the brain, where many X-linked genes are expressed. A transcriptome analysis of the mouse brain suggested that hundreds of alleles on XM may be preferentially expressed in glutamatergic neurons of the female cortex (Gregg et al., 2010). Although XP alleles are not silenced, they are expressed at lower levels. This type of partial imprinting could contribute to cognitive and behavioral differences. However, follow-up analyses have argued that the allelic skewing called by whole-transcriptome analyses may have been an aberration caused by unappreciated statistical limitations of a novel technology (DeVeale et al., 2012; reviewed in Kelsey and Bartolomei, 2012). Thus, the question of how many imprinted X-linked genes exist in eutherian mammals, and in which tissues, remains open. This clinically important area has been underexplored.

Xist, the X Chromosome, and Cancer

An association between the X chromosome and cancer has been noted since the discovery of the Barr body (Moore and Barr, 1955; Liao et al., 2003; Pageau et al., 2007). Breast and ovarian cancer cells, for example, frequently duplicate their Xa. The correlation also holds for men, as XXY men have a 20- to 50-fold increased risk of breast cancer in a BRCA1 background (Fentiman et al., 2006), and testicular germ cell tumors often acquire supernumerary Xs (Kawakami et al., 2003). One recent study directly tested the connection of the X to cancer by conditionally deleting Xist RNA in the blood lineages of mice (Yildirim et al., 2013). This deletion resulted in overexpression of the X chromosome and a fulminant hematologic cancer known as mixed MPN/MDS (myeloproliferative neoplasm/myelodysplastic syndrome), a cancer that includes chronic myelomonocytic leukemia, erythroleukemia, histiocytic sarcoma, and bone marrow fibrosis. The cancer is female specific and 100% penetrant. Intriguingly, in humans, MDS is more common in women, with noted XIST deletions and X chromosome duplications occurring in MPN, MDS, and myeloid cancers (see references within Yildirim et al., 2013). The association is not restricted to women, as extra X chromosomes are seen in a range of leukemias in both sexes.
The mouse study showed that loss of Xist perturbed maturation as well as longevity of hematopoietic stem cells. Thus, Xist plays a role not only in dosage compensation but also in suppressing cancer and preserving the function of adult stem cell populations. This study illustrates the importance of studying lncRNA function not only in cells ex vivo but also within the context of the organism in vivo.

Epigenetic Reprogramming in Human Stem Cells

Xist RNA also influences the pluripotent stem cell population, as shown by recent studies of induced pluripotent stem cells (iPSC) in regenerative medicine. In mice, XCI is tightly linked to cell differentiation in the epiblast, and the possession of two Xa is a hallmark of pluripotent cells, both mouse ESC and iPSC (reviewed in Minkovsky et al., 2012). The tight linkage is explained by the physical convergence of many pluripotency factors, such as OCT4, SOX2, NANOG, and REX1, at the Xic, specifically within control regions of Xite, Tsix, and Xist (Navarro et al., 2008; Donohoe et al., 2009; Navarro et al., 2010) (Figure 3B). Binding of pluripotency factors to these regions blocks initiation of XCI, and the loss of binding during cell differentiation creates a permissive state for the initiation of XCI. XIST currently provides one of a few tangible readouts for stem cell quality. In human ESC (hESC) and iPSC (hiPSC), XIST expression and XCI do not necessarily occur in the expected manner. Female hESC and hiPSC lines fall into three different epigenetic groups based on XIST expression
Cell 152, March 14, 2013 ©2013 Elsevier Inc. 1317

(Silva et al., 2008; Minkovsky et al., 2012). Class I cells are most similar to mESC in that they have two Xa in the undifferentiated state. When placed in differentiation conditions, Class I cells express XIST and initiate XCI. In contrast, Class II cells already express XIST and carry one Xi, even before growth under differentiation conditions. Finally, Class III cells once expressed XIST but have irreversibly lost its expression, with evidence of partial X reactivation (Shen et al., 2008; Silva et al., 2008; Anguera et al., 2012; Tomoda et al., 2012). Epigenetic fluidity is evident in the irreversible progression from Class I to II to III states (Tchieu et al., 2010; Anguera et al., 2012; Mekhoubad et al., 2012). Class I is transient, whereas Class III is dominant and stable. Although failure of XIST expression is lethal in vivo (Penny et al., 1996; Marahrens et al., 1997), loss of XIST does not have the same dire consequences ex vivo, though these cells lack full developmental potential (Silva et al., 2008; Anguera et al., 2012). Class III hiPSCs have limited differentiation capability (Anguera et al., 2012; Mekhoubad et al., 2012). In a xenograft model, Class III hiPSCs produce cystic teratomas composed of simple cystic epithelia and undifferentiated mesenchyme, whereas Class II cells produce well-differentiated structures of three germ layers. Given the tumorigenic phenotype of the murine Xist deletion (Yildirim et al., 2013), most concerning would be the potential of XIST-negative hiPSC lines to cause cancer when introduced in vivo in the clinical setting. Indeed, Class III hiPSCs also showed partial X reactivation, faster doubling times, and a distinct gene expression signature of cancer cells (Anguera et al., 2012), urging further careful consideration before using hiPSCs in regenerative medicine. Genomic imprinting also contributes to the quality of human and mouse iPSC (Pick et al., 2009; Sun et al., 2012).
The imprinted state of the Dlk1-Dio3 locus, in particular the expression of the Gtl2 (also known as Meg3) lncRNA, has been at the center of attention. One study found that mouse iPSC clones with aberrant Dlk1-Dio3 imprinting and low Gtl2 expression contributed poorly to chimeras (Stadtfeld et al., 2010), whereas another did not observe a difference (Carey et al., 2011). There is, however, general agreement that loss of imprinting at this locus resulted in lower efficiency of generating entirely iPSC-derived mice (Stadtfeld et al., 2010; Carey et al., 2011; Stadtfeld et al., 2012). With further investigation, it is likely that other imprinted loci will be found to affect stem cell quality. Nonetheless, despite a number of claims that imprinting is aberrant, iPSCs can be an important tool for studying imprinting perturbations in inaccessible cell types such as neurons in AS (Chamberlain et al., 2010).

Imprinting and Assisted Reproductive Technology

Related to the issue of epigenetic change within imprinted and X-linked loci in stem cells ex vivo is the question of whether in vitro culture of early human embryos during the use of ART might have similar effects on imprinting and XCI. The ex vivo manipulations utilized during ART coincide with the developmental stages in which genome-wide epigenetic reprogramming occurs (i.e., oocyte growth and preimplantation development). The use of ART procedures to help couples with fertility issues conceive children of their own has doubled in the last decade. In 2009, ART contributed to 1.4% of all U.S. births (Sunderam et al.,

2012). Nevertheless, there is growing concern about the safety of these procedures (Manipalviratn et al., 2009). Of particular concern, children conceived by ART have an increased incidence of rare epigenetic disorders, with most of these patients exhibiting loss of DNA methylation at ICRs (Manipalviratn et al., 2009). Specifically, cases of AS and BWS in children conceived by ART are associated with loss of methylation of the SNRPN and KCNQ1 ICRs, respectively, which results in biallelic expression of the lncRNAs and loss of expression of UBE3A and CDKN1C, respectively (Figure 1). Consistently, animal studies have demonstrated that embryo culture and embryo transfer, as well as hormonal treatments, all of which are integral components of ART, disrupt normal epigenetic programming in embryonic and extraembryonic lineages (Mann et al., 2004; Fortier et al., 2008; Rivera et al., 2008), although the mechanism for this disruption remains poorly understood. Thus, a greater understanding of in vitro effects on epigenetic regulation during ART is a growing public health need in industrialized countries.

Conclusions and Therapeutic Prospects

XCI, genomic imprinting, and lncRNA clearly have major implications for public health. Yet, in the arena of preventive, diagnostic, and therapeutic medicine, few strategies have targeted regulatory factors for imprinted genes and the Xic to control X-linked diseases and conditions. This holds true also for regenerative medicine and stem cell biology, where ex vivo cellular manipulations have not universally considered the impact of imprinting and XCI, in spite of converging indications that these processes affect the production, maintenance, and overall quality of stem cells. On a hopeful note, proof-of-concept was reported in one recent study.
One of the most intriguing aspects of disorders that involve monoallelically expressed genes is the prospect for therapy that involves derepressing the silenced allele in situations where the expressed allele of an imprinted gene is deleted or contains a loss-of-function mutation. A recent success was reported for AS, where a screen revealed that small-molecule topoisomerase inhibitors reactivated the silenced UBE3A gene and repressed the ICR-associated antisense RNA (Huang et al., 2012). Ironically, however, because of the clustering of imprinted genes, the biallelic activation of UBE3A was accompanied by the loss of expression of the paternally expressed genes in the locus (Figure 1D). Although the mechanism for reactivation is unclear, these strategies offer hope and suggest that other loci could be subject to similar screens. The successful reactivation of the silent copy of UBE3A raises hopes that a treatment for various X-linked diseases might be similarly achieved. Of particular interest has been Rett syndrome, a neurologic disorder caused by mutations in MECP2. The syndrome affects girls and is manifested by a reversal of developmental milestones after the first year of life (the disease is fatal in newborn males). Because Rett syndrome is not accompanied by neurodegeneration, efforts have been devoted to restoring expression of MECP2 after birth in hopes of reversing the symptoms. Intriguingly, mouse models have shown that restoration of MECP2 expression after disease first becomes symptomatic can reverse the neurologic defects (Giacometti et al., 2007; Guy et al., 2007). Because of the possibility that restoration of MECP2 expression might similarly cure Rett syndrome in humans, ongoing studies are now aimed at reactivating the wild-type copy of MECP2 in affected girls through small molecules that target chromatin modifiers and other regulators of XCI and XCR. Furthermore, with knowledge that loss of XIST expression and/or overdosage of the X chromosome could result in blood cancer (Yildirim et al., 2013), cancer therapeutics might similarly be directed at genes on the X chromosome. In the future, in addition to trans-acting factors such as topoisomerases, therapeutic strategies could be targeted at control regions (ICRs, Xic) or lncRNAs that regulate crucial genes in cis. For example, it may be productive to determine ways to control expression of XIST RNA, GTL2, and other imprinted genes. We may be some way from realizing commercial products, but the technologies to develop them are evolving rapidly and may soon enable us to produce drugs to influence cellular reprogramming ex vivo and to treat human diseases and conditions in vivo.
ACKNOWLEDGMENTS

We thank the Lee and Bartolomei labs for many inspirational discussions, and the National Institutes of Health and the Howard Hughes Medical Institute for supporting their research.

REFERENCES

Anguera, M.C., Ma, W., Clift, D., Namekawa, S.H., Kelleher, R.J., 3rd, and Lee, J.T. (2011). Tsx produces a long noncoding RNA and has general functions in the germline, stem cells, and brain. PLoS Genet. 7, e1002248.
Anguera, M.C., Sadreyev, R.I., Zhang, Z., Szanto, A., Payer, B., Sheridan, S.D., Kwok, S., Haggarty, S.J., Sur, M., Alvarez, J., et al. (2012). Molecular signatures of human induced pluripotent stem cells highlight sex differences and cancer genes. Cell Stem Cell 11, 75–90.
Arnold, A.P., Xu, J., Grisham, W., Chen, X., Kim, Y.H., and Itoh, Y. (2004). Minireview: Sex chromosomes and brain sexual differentiation. Endocrinology 145, 1057–1062.
Azzi, S., Rossignol, S., Le Bouc, Y., and Netchine, I. (2010). Lessons from imprinted multilocus loss of methylation in human syndromes: A step toward understanding the mechanisms underlying these complex diseases. Epigenetics 5, 373–377.
Bacher, C.P., Guggiari, M., Brors, B., Augui, S., Clerc, P., Avner, P., Eils, R., and Heard, E. (2006). Transient colocalization of X-inactivation centres accompanies the initiation of X inactivation. Nat. Cell Biol. 8, 293–299.
Barakat, T.S., Gunhanlar, N., Pardo, C.G., Achame, E.M., Ghazvini, M., Boers, R., Kenter, A., Rentmeester, E., Grootegoed, J.A., and Gribnau, J. (2011). RNF12 activates Xist and is essential for X chromosome inactivation. PLoS Genet. 7, e1002001.
Barlow, D.P. (2011). Genomic imprinting: a mammalian epigenetic discovery model. Annu. Rev. Genet. 45, 379–403.
Barr, M.L., and Bertram, E.G. (1949). A morphological distinction between neurones of the male and female, and the behaviour of the nucleolar satellite during accelerated nucleoprotein synthesis. Nature 163, 676.
Bartolomei, M.S. (2009). Genomic imprinting: employing and avoiding epigenetic processes. Genes Dev. 23, 2124–2133.
Bartolomei, M.S., and Ferguson-Smith, A.C. (2011). Mammalian genomic imprinting. Cold Spring Harb. Perspect. Biol. 3. http://dx.doi.org/10.1101/cshperspect.a002592.
Batista, P.J., and Chang, H.Y. (2013). Long noncoding RNAs: Cellular address codes in development and disease. Cell 152, this issue, 1298–1307.

Bell, A.C., and Felsenfeld, G. (2000). Methylation of a CTCF-dependent boundary controls imprinted expression of the Igf2 gene. Nature 405, 482–485.
Berletch, J.B., Yang, F., Xu, J., Carrel, L., and Disteche, C.M. (2011). Genes that escape from X inactivation. Hum. Genet. 130, 237–245.
Bishop, D.V., Canning, E., Elgar, K., Morris, E., Jacobs, P.A., and Skuse, D.H. (2000). Distinctive patterns of memory function in subgroups of females with Turner syndrome: evidence for imprinted loci on the X-chromosome affecting neurodevelopment. Neuropsychologia 38, 712–721.
Borsani, G., Tonlorenzi, R., Simmler, M.C., Dandolo, L., Arnaud, D., Capra, V., Grompe, M., Pizzuti, A., Muzny, D., Lawrence, C., et al. (1991). Characterization of a murine gene expressed from the inactive X chromosome. Nature 351, 325–329.
Brannan, C.I., Dees, E.C., Ingram, R.S., and Tilghman, S.M. (1990). The product of the H19 gene may function as an RNA. Mol. Cell. Biol. 10, 28–36.
Brockdorff, N., Ashworth, A., Kay, G.F., McCabe, V.M., Norris, D.P., Cooper, P.J., Swift, S., and Rastan, S. (1992). The product of the mouse Xist gene is a 15 kb inactive X-specific transcript containing no conserved ORF and located in the nucleus. Cell 71, 515–526.
Brown, C.J., Ballabio, A., Rupert, J.L., Lafreniere, R.G., Grompe, M., Tonlorenzi, R., and Willard, H.F. (1991a). A gene from the region of the human X inactivation centre is expressed exclusively from the inactive X chromosome. Nature 349, 38–44.
Brown, C.J., Lafreniere, R.G., Powers, V.E., Sebastio, G., Ballabio, A., Pettigrew, A.L., Ledbetter, D.H., Levy, E., Craig, I.W., and Willard, H.F. (1991b). Localization of the X inactivation centre on the human X chromosome in Xq13. Nature 349, 82–84.
Brown, C.J., Hendrich, B.D., Rupert, J.L., Lafrenière, R.G., Xing, Y., Lawrence, J., and Willard, H.F. (1992). The human XIST gene: analysis of a 17 kb inactive X-specific RNA that contains conserved repeats and is highly localized within the nucleus. Cell 71, 527–542.
Buiting, K. (2010). Prader-Willi syndrome and Angelman syndrome. Am. J. Med. Genet. C. Semin. Med. Genet. 154C, 365–376.
Carey, B.W., Markoulaki, S., Hanna, J.H., Faddah, D.A., Buganim, Y., Kim, J., Ganz, K., Steine, E.J., Cassady, J.P., Creyghton, M.P., et al. (2011). Reprogramming factor stoichiometry influences the epigenetic state and biological properties of induced pluripotent stem cells. Cell Stem Cell 9, 588–598.
Carrel, L., and Willard, H.F. (2005). X-inactivation profile reveals extensive variability in X-linked gene expression in females. Nature 434, 400–404.
Cattanach, B.M. (1986). Parental origin effects in mice. J. Embryol. Exp. Morphol. 97 (Suppl), 137–150.
Cattanach, B.M., and Isaacson, J.H. (1967). Controlling elements in the mouse X chromosome. Genetics 57, 331–346.
Chamberlain, S.J., Chen, P.F., Ng, K.Y., Bourgois-Rocha, F., Lemtiri-Chlieh, F., Levine, E.S., and Lalande, M. (2010). Induced pluripotent stem cell models of the genomic imprinting disorders Angelman and Prader-Willi syndromes. Proc. Natl. Acad. Sci. USA 107, 17668–17673.
Chao, W., Huynh, K.D., Spencer, R.J., Davidow, L.S., and Lee, J.T. (2002). CTCF, a candidate trans-acting factor for X-inactivation choice. Science 295, 345–347.
Chotalia, M., Smallwood, S.A., Ruf, N., Dawson, C., Lucifero, D., Frontera, M., James, K., Dean, W., and Kelsey, G. (2009). Transcription is required for establishment of germline methylation marks at imprinted genes. Genes Dev. 23, 105–117.
Chow, J.C., Ciaudo, C., Fazzari, M.J., Mise, N., Servant, N., Glass, J.L., Attreed, M., Avner, P., Wutz, A., Barillot, E., et al. (2010). LINE-1 activity in facultative heterochromatin formation during X chromosome inactivation. Cell 141, 956–969.
Chureau, C., Prissette, M., Bourdet, A., Barbe, V., Cattolico, L., Jones, L., Eggen, A., Avner, P., and Duret, L. (2002). Comparative sequence analysis of the X-inactivation center region in mouse, human, and bovine. Genome Res. 12, 894–908.
Chureau, C., Chantalat, S., Romito, A., Galvani, A., Duret, L., Avner, P., and Rougeulle, C. (2011). Ftx is a non-coding RNA which affects Xist expression


and chromatin structure within the X-inactivation center region. Hum. Mol. Genet. 20, 705–718.
Clemson, C.M., McNeil, J.A., Willard, H.F., and Lawrence, J.B. (1996). XIST RNA paints the inactive X chromosome at interphase: evidence for a novel RNA involved in nuclear/chromosome structure. J. Cell Biol. 132, 259–275.
Cooper, D.W. (1971). Directed genetic change model for X chromosome inactivation in eutherian mammals. Nature 230, 292–294.
DeVeale, B., van der Kooy, D., and Babak, T. (2012). Critical evaluation of imprinted gene expression by RNA-Seq: a new perspective. PLoS Genet. 8, e1002600.
Dhara, S.K., and Benvenisty, N. (2004). Gene trap as a tool for genome annotation and analysis of X chromosome inactivation in human embryonic stem cells. Nucleic Acids Res. 32, 3995–4002.
Donohoe, M.E., Silva, S.S., Pinter, S.F., Xu, N., and Lee, J.T. (2009). The pluripotency factor Oct4 interacts with Ctcf and also controls X-chromosome pairing and counting. Nature 460, 128–132.
Duret, L., Chureau, C., Samain, S., Weissenbach, J., and Avner, P. (2006). The Xist RNA gene evolved in eutherians by pseudogenization of a protein-coding gene. Science 312, 1653–1655.
Fentiman, I.S., Fourquet, A., and Hortobagyi, G.N. (2006). Male breast cancer. Lancet 367, 595–604.
Fortier, A.L., Lopes, F.L., Darricarrère, N., Martel, J., and Trasler, J.M. (2008). Superovulation alters the expression of imprinted genes in the midgestation mouse placenta. Hum. Mol. Genet. 17, 1653–1665.
Frost, J.M., and Moore, G.E. (2010). The importance of imprinting in the human placenta. PLoS Genet. 6, e1001015.
Giacometti, E., Luikenhuis, S., Beard, C., and Jaenisch, R. (2007). Partial rescue of MeCP2 deficiency by postnatal activation of MeCP2. Proc. Natl. Acad. Sci. USA 104, 1931–1936.
Gontan, C., Achame, E.M., Demmers, J., Barakat, T.S., Rentmeester, E., van IJcken, W., Grootegoed, J.A., and Gribnau, J. (2012). RNF12 initiates X-chromosome inactivation by targeting REX1 for degradation. Nature 485, 386–390.
Good, C.D., Lawrence, K., Thomas, N.S., Price, C.J., Ashburner, J., Friston, K.J., Frackowiak, R.S., Oreland, L., and Skuse, D.H. (2003). Dosage-sensitive X-linked locus influences the development of amygdala and orbitofrontal cortex, and fear recognition in humans. Brain 126, 2431–2446.
Goto, Y., and Takagi, N. (2000). Maternally inherited X chromosome is not inactivated in mouse blastocysts due to parental imprinting. Chromosome Res. 8, 101–109.
Grant, J., Mahadevaiah, S.K., Khil, P., Sangrithi, M.N., Royo, H., Duckworth, J., McCarrey, J.R., VandeBerg, J.L., Renfree, M.B., Taylor, W., et al. (2012). Rsx is a metatherian RNA with Xist-like properties in X-chromosome inactivation. Nature 487, 254–258.
Greaves, I.K., Rangasamy, D., Devoy, M., Marshall Graves, J.A., and Tremethick, D.J. (2006). The X and Y chromosomes assemble into H2A.Z-containing facultative heterochromatin following meiosis. Mol. Cell. Biol. 26, 5394–5405.
Gregg, C., Zhang, J., Butler, J.E., Haig, D., and Dulac, C. (2010). Sex-specific parent-of-origin allelic expression in the mouse brain. Science 329, 682–685.
Guy, J., Gan, J., Selfridge, J., Cobb, S., and Bird, A. (2007). Reversal of neurological defects in a mouse model of Rett syndrome. Science 315, 1143–1147.
Hark, A.T., Schoenherr, C.J., Katz, D.J., Ingram, R.S., Levorse, J.M., and Tilghman, S.M. (2000). CTCF mediates methylation-sensitive enhancer-blocking activity at the H19/Igf2 locus. Nature 405, 486–489.
Harrison, K.B., and Warburton, D. (1986). Preferential X-chromosome activity in human female placental tissues. Cytogenet. Cell Genet. 41, 163–168.
Henckel, A., Chebli, K., Kota, S.K., Arnaud, P., and Feil, R. (2012). Transcription and histone methylation changes correlate with imprint acquisition in male germ cells. EMBO J. 31, 606–615.

Hoki, Y., Kimura, N., Kanbayashi, M., Amakawa, Y., Ohhata, T., Sasaki, H., and Sado, T. (2009). A proximal conserved repeat in the Xist gene is essential as a genomic element for X-inactivation in mouse. Development 136, 139–146.
Huang, H.S., Allen, J.A., Mabb, A.M., King, I.F., Miriyala, J., Taylor-Blake, B., Sciaky, N., Dutton, J.W., Jr., Lee, H.M., Chen, X., et al. (2012). Topoisomerase inhibitors unsilence the dormant allele of Ube3a in neurons. Nature 481, 185–189.
Huynh, K.D., and Lee, J.T. (2003). Inheritance of a pre-inactivated paternal X chromosome in early mouse embryos. Nature 426, 857–862.
Jeon, Y., and Lee, J.T. (2011). YY1 tethers Xist RNA to the inactive X nucleation center. Cell 146, 119–133.
Jonkers, I., Barakat, T.S., Achame, E.M., Monkhorst, K., Kenter, A., Rentmeester, E., Grosveld, F., Grootegoed, J.A., and Gribnau, J. (2009). RNF12 is an X-Encoded dose-dependent activator of X chromosome inactivation. Cell 139, 999–1011.
Kalantry, S., Purushothaman, S., Bowen, R.B., Starmer, J., and Magnuson, T. (2009). Evidence of Xist RNA-independent initiation of mouse imprinted X-chromosome inactivation. Nature 460, 647–651.
Kawakami, T., Okamoto, K., Sugihara, H., Hattori, T., Reeve, A.E., Ogawa, O., and Okada, Y. (2003). The roles of supernumerical X chromosomes and XIST expression in testicular germ cell tumors. J. Urol. 169, 1546–1552.
Kelsey, G., and Bartolomei, M.S. (2012). Imprinted genes ... and the number is? PLoS Genet. 8, e1002601.
Kesler, S.R., Garrett, A., Bender, B., Yankowitz, J., Zeng, S.M., and Reiss, A.L. (2004). Amygdala and hippocampal volumes in Turner syndrome: a high-resolution MRI study of X-monosomy. Neuropsychologia 42, 1971–1978.
Koerner, M.V., Pauler, F.M., Huang, R., and Barlow, D.P. (2009). The function of non-coding RNAs in genomic imprinting. Development 136, 1771–1783.
Latos, P.A., Stricker, S.H., Steenpass, L., Pauler, F.M., Huang, R., Senergin, B.H., Regha, K., Koerner, M.V., Warczok, K.E., Unger, C., and Barlow, D.P. (2009). An in vitro ES cell imprinting model shows that imprinted expression of the Igf2r gene arises from an allele-specific expression bias. Development 136, 437–448.
Latos, P.A., Pauler, F.M., Koerner, M.V., Şenergin, H.B., Hudson, Q.J., Stocsits, R.R., Allhoff, W., Stricker, S.H., Klement, R.M., Warczok, K.E., et al. (2012). Airn transcriptional overlap, but not its lncRNA products, induces imprinted Igf2r silencing. Science 338, 1469–1472.
Lee, J.T. (2000). Disruption of imprinted X inactivation by parent-of-origin effects at Tsix. Cell 103, 17–27.
Lee, J.T. (2012). Epigenetic regulation by long noncoding RNAs. Science 338, 1435–1439.
Lee, J.T., Strauss, W.M., Dausman, J.A., and Jaenisch, R. (1996). A 450 kb transgene displays properties of the mammalian X-inactivation center. Cell 86, 83–94.
Lee, J.T., Davidow, L.S., and Warshawsky, D. (1999a). Tsix, a gene antisense to Xist at the X-inactivation centre. Nat. Genet. 21, 400–404.
Lee, J.T., Lu, N., and Han, Y. (1999b). Genetic analysis of the mouse X inactivation center defines an 80-kb multifunction domain. Proc. Natl. Acad. Sci. USA 96, 3836–3841.
Liao, D.J., Du, Q.Q., Yu, B.W., Grignon, D., and Sarkar, F.H. (2003). Novel perspective: focusing on the X chromosome in reproductive cancers. Cancer Invest. 21, 641–658.
Lifschytz, E., and Lindsley, D.L. (1972). The role of X-chromosome inactivation during spermatogenesis (Drosophila-allocycly-chromosome evolution-male sterility-dosage compensation). Proc. Natl. Acad. Sci. USA 69, 182–186.
Lyle, R., Watanabe, D., te Vruchte, D., Lerchner, W., Smrzka, O.W., Wutz, A., Schageman, J., Hahner, L., Davies, C., and Barlow, D.P. (2000). The imprinted antisense RNA at the Igf2r locus overlaps but does not imprint Mas1. Nat. Genet. 25, 19–21.


Lyon, M.F. (1961). Gene action in the X-chromosome of the mouse (Mus musculus L.). Nature 190, 372–373.
Lyon, M.F. (1999). Imprinting and X chromosome inactivation. In Results and problems in cell differentiation, R. Ohlsson, ed. (Heidelberg: Springer-Verlag), pp. 73–90.
Lyon, M.F. (2003). The Lyon and the LINE hypothesis. Semin. Cell Dev. Biol. 14, 313–318.
Mackay, D.J., Callaway, J.L., Marks, S.M., White, H.E., Acerini, C.L., Boonen, S.E., Dayanikli, P., Firth, H.V., Goodship, J.A., Haemers, A.P., et al. (2008). Hypomethylation of multiple imprinted loci in individuals with transient neonatal diabetes is associated with mutations in ZFP57. Nat. Genet. 40, 949–951.
Mak, W., Nesterova, T.B., de Napoles, M., Appanah, R., Yamanaka, S., Otte, A.P., and Brockdorff, N. (2004). Reactivation of the paternal X chromosome in early mouse embryos. Science 303, 666–669.
Mancini-Dinardo, D., Steele, S.J., Levorse, J.M., Ingram, R.S., and Tilghman, S.M. (2006). Elongation of the Kcnq1ot1 transcript is required for genomic imprinting of neighboring genes. Genes Dev. 20, 1268–1282.
Manipalviratn, S., DeCherney, A., and Segars, J. (2009). Imprinting disorders and assisted reproductive technology. Fertil. Steril. 91, 305–315.
Mann, M.R., Lee, S.S., Doherty, A.S., Verona, R.I., Nolen, L.D., Schultz, R.M., and Bartolomei, M.S. (2004). Selective loss of imprinting in the placenta following preimplantation development in culture. Development 131, 3727–3735.
Marahrens, Y., Panning, B., Dausman, J., Strauss, W., and Jaenisch, R. (1997). Xist-deficient mice are defective in dosage compensation but not spermatogenesis. Genes Dev. 11, 156–166.
McGrath, J., and Solter, D. (1984). Completion of mouse embryogenesis requires both the maternal and paternal genomes. Cell 37, 179–183.
Mekhoubad, S., Bock, C., de Boer, A.S., Kiskinis, E., Meissner, A., and Eggan, K. (2012). Erosion of dosage compensation impacts human iPSC disease modeling. Cell Stem Cell 10, 595–609.
Meng, L., Person, R.E., and Beaudet, A.L. (2012). Ube3a-ATS is an atypical RNA polymerase II transcript that represses the paternal expression of Ube3a. Hum. Mol. Genet. 21, 3001–3012.
Migeon, B.R., Kazi, E., Haisley-Royster, C., Hu, J., Reeves, R., Call, L., Lawler, A., Moore, C.S., Morrison, H., and Jeppesen, P. (1999). Human X inactivation center induces random X chromosome inactivation in male transgenic mice. Genomics 59, 113–121.
Minkovsky, A., Patel, S., and Plath, K. (2012). Concise review: Pluripotency and the transcriptional inactivation of the female Mammalian X chromosome. Stem Cells 30, 48–54.
Moore, K.L., and Barr, M.L. (1955). The sex chromatin in benign tumours and related conditions in man. Br. J. Cancer 9, 246–252.
Moreira de Mello, J.C., de Araújo, E.S.S., Stabellini, R., Fraga, A.M., de Souza, J.E.S., Sumita, D.R., Camargo, A.A., and Pereira, L.V. (2010). Random X inactivation and extensive mosaicism in human placenta revealed by analysis of allele-specific gene expression along the X chromosome. PLoS ONE 5, e10947.
Nagano, T., Mitchell, J.A., Sanz, L.A., Pauler, F.M., Ferguson-Smith, A.C., Feil, R., and Fraser, P. (2008). The Air noncoding RNA epigenetically silences transcription by targeting G9a to chromatin. Science 322, 1717–1720.
Namekawa, S.H., Park, P.J., Zhang, L.F., Shima, J.E., McCarrey, J.R., Griswold, M.D., and Lee, J.T. (2006). Postmeiotic sex chromatin in the male germline of mice. Curr. Biol. 16, 660–667.
Namekawa, S.H., VandeBerg, J.L., McCarrey, J.R., and Lee, J.T. (2007). Sex chromosome silencing in the marsupial male germ line. Proc. Natl. Acad. Sci. USA 104, 9730–9735.
Namekawa, S.H., Payer, B., Huynh, K.D., Jaenisch, R., and Lee, J.T. (2010). Two-step imprinted X inactivation: repeat versus genic silencing in the mouse. Mol. Cell. Biol. 30, 3187–3205.

Navarro, P., Chambers, I., Karwacki-Neisius, V., Chureau, C., Morey, C., Rougeulle, C., and Avner, P. (2008). Molecular coupling of Xist regulation and pluripotency. Science 321, 1693–1695.
Navarro, P., Oldfield, A., Legoupi, J., Festuccia, N., Dubois, A., Attia, M., Schoorlemmer, J., Rougeulle, C., Chambers, I., and Avner, P. (2010). Molecular coupling of Tsix regulation and pluripotency. Nature 468, 457–460.
Ogawa, Y., Sun, B.K., and Lee, J.T. (2008). Intersection of the RNA interference and X-inactivation pathways. Science 320, 1336–1341.
Ohno, S., Kaplan, W.D., and Kinosita, R. (1959). Formation of the sex chromatin by a single X-chromosome in liver cells of Rattus norvegicus. Exp. Cell Res. 18, 415–418.
Okamoto, I., and Heard, E. (2006). The dynamics of imprinted X inactivation during preimplantation development in mice. Cytogenet. Genome Res. 113, 318–324.
Okamoto, I., Patrat, C., Thépot, D., Peynot, N., Fauque, P., Daniel, N., Diabangouaya, P., Wolf, J.P., Renard, J.P., Duranthon, V., and Heard, E. (2011). Eutherian mammals use diverse strategies to initiate X-chromosome inactivation during development. Nature 472, 370–374.
Oreland, L., Hallman, J., and Damberg, M. (2004). Platelet MAO and personality – function and dysfunction. Curr. Med. Chem. 11, 2007–2016.
Pageau, G.J., Hall, L.L., Ganesan, S., Livingston, D.M., and Lawrence, J.B. (2007). The disappearing Barr body in breast and ovarian cancers. Nat. Rev. Cancer 7, 628–633.
Pandey, R.R., Mondal, T., Mohammad, F., Enroth, S., Redrup, L., Komorowski, J., Nagano, T., Mancini-Dinardo, D., and Kanduri, C. (2008). Kcnq1ot1 antisense noncoding RNA mediates lineage-specific transcriptional silencing through chromatin-level regulation. Mol. Cell 32, 232–246.
Pauler, F.M., Barlow, D.P., and Hudson, Q.J. (2012). Mechanisms of long range silencing by imprinted macro non-coding RNAs. Curr. Opin. Genet. Dev. 22, 283–289.
Payer, B., and Lee, J.T. (2008). X chromosome dosage compensation: how mammals keep the balance. Annu. Rev. Genet. 42, 733–772.
Peñaherrera, M.S., Jiang, R., Avila, L., Yuen, R.K.C., Brown, C.J., and Robinson, W.P. (2012). Patterns of placental development evaluated by X chromosome inactivation profiling provide a basis to evaluate the origin of epigenetic variation. Hum. Reprod. 27, 1745–1753.
Penny, G.D., Kay, G.F., Sheardown, S.A., Rastan, S., and Brockdorff, N. (1996). Requirement for Xist in X chromosome inactivation. Nature 379, 131–137.
Percec, I., Plenge, R.M., Nadeau, J.H., Bartolomei, M.S., and Willard, H.F. (2002). Autosomal dominant mutations affecting X inactivation choice in the mouse. Science 296, 1136–1139.
Pick, M., Stelzer, Y., Bar-Nur, O., Mayshar, Y., Eden, A., and Benvenisty, N. (2009). Clone- and gene-specific aberrations of parental imprinting in human induced pluripotent stem cells. Stem Cells 27, 2686–2690.
Pinter, S.F., Sadreyev, R.I., Yildirim, E., Jeon, Y., Ohsumi, T.K., Borowsky, M., and Lee, J.T. (2012). Spreading of X chromosome inactivation via a hierarchy of defined Polycomb stations. Genome Res. 22, 1864–1876.
Puck, J.M., and Willard, H.F. (1998). X inactivation in females with X-linked disease. N. Engl. J. Med. 338, 325–328.
Puck, J.M., Stewart, C.C., and Nussbaum, R.L. (1992). Maximum-likelihood analysis of human T-cell X chromosome inactivation patterns: normal women versus carriers of X-linked severe combined immunodeficiency. Am. J. Hum. Genet. 50, 742–748.
Redrup, L., Branco, M.R., Perdeaux, E.R., Krueger, C., Lewis, A., Santos, F., Nagano, T., Cobb, B.S., Fraser, P., and Reik, W. (2009). The long noncoding RNA Kcnq1ot1 organises a lineage-specific nuclear domain for epigenetic gene silencing. Development 136, 525–530.
Riccio, A., Sparago, A., Verde, G., De Crescenzo, A., Citro, V., Cubellis, M.V., Ferrero, G.B., Silengo, M.C., Russo, S., Larizza, L., and Cerrato, F. (2009). Inherited and Sporadic Epimutations at the IGF2-H19 locus in Beckwith-Wiedemann syndrome and Wilms' tumor. Endocr. Dev. 14, 1–9.

Cell 152, March 14, 2013 2013 Elsevier Inc. 1321

Rivera, R.M., Stein, P., Weaver, J.R., Mager, J., Schultz, R.M., and Bartolomei, M.S. (2008). Manipulations of mouse embryos prior to implantation result in aberrant expression of imprinted genes on day 9.5 of development. Hum. Mol. Genet. 17, 1–14.
Robson, J.E., Eaton, S.A., Underhill, P., Williams, D., and Peters, J. (2012). MicroRNAs 296 and 298 are imprinted and part of the GNAS/Gnas cluster and miR-296 targets IKBKE and Tmed9. RNA 18, 135–144.
Sado, T., Wang, Z., Sasaki, H., and Li, E. (2001). Regulation of imprinted X-chromosome inactivation in mice by Tsix. Development 128, 1275–1286.
Sado, T., Hoki, Y., and Sasaki, H. (2005). Tsix silences Xist through modification of chromatin structure. Dev. Cell 9, 159–165.
Searle, A.G., and Beechey, C.V. (1978). Complementation studies with mouse translocations. Cytogenet. Cell Genet. 20, 282–303.
Sharman, G.B. (1971). Late DNA replication in the paternally derived X chromosome of female kangaroos. Nature 230, 231–232.
Shen, Y., Matsuno, Y., Fouse, S.D., Rao, N., Root, S., Xu, R., Pellegrini, M., Riggs, A.D., and Fan, G. (2008). X-inactivation in female human embryonic stem cells is in a nonrandom pattern and prone to epigenetic alterations. Proc. Natl. Acad. Sci. USA 105, 4709–4714.
Shin, J., Bossenz, M., Chung, Y., Ma, H., Byron, M., Taniguchi-Ishigaki, N., Zhu, X., Jiao, B., Hall, L.L., Green, M.R., et al. (2010). Maternal Rnf12/RLIM is required for imprinted X-chromosome inactivation in mice. Nature 467, 977–981.
Silva, S.S., Rowntree, R.K., Mekhoubad, S., and Lee, J.T. (2008). X-chromosome inactivation and epigenetic fluidity in human embryonic stem cells. Proc. Natl. Acad. Sci. USA 105, 4820–4825.
Skuse, D.H. (2000). Imprinting, the X-chromosome, and the male brain: explaining sex differences in the liability to autism. Pediatr. Res. 47, 9–16.
Skuse, D.H. (2005). X-linked genes and mental functioning. Hum. Mol. Genet. 14(Spec No 1), R27–R32.
Skuse, D.H., James, R.S., Bishop, D.V., Coppin, B., Dalton, P., Aamodt-Leeper, G., Bacarese-Hamilton, M., Creswell, C., McGurk, R., and Jacobs, P.A. (1997). Evidence from Turner's syndrome of an imprinted X-linked locus affecting cognitive function. Nature 387, 705–708.
Sleutels, F., and Barlow, D.P. (2002). The origins of genomic imprinting in mammals. In Homology Effects, J.C. Dunlap and C.-t. Wu, eds. (San Diego: Academic Press), pp. 119–154.
Smits, G., Mungall, A.J., Griffiths-Jones, S., Smith, P., Beury, D., Matthews, L., Rogers, J., Pask, A.J., Shaw, G., VandeBerg, J.L., et al.; SAVOIR Consortium. (2008). Conservation of the H19 noncoding RNA and H19-IGF2 imprinting mechanism in therians. Nat. Genet. 40, 971–976.
Solter, D. (1988). Differential imprinting and expression of maternal and paternal genomes. Annu. Rev. Genet. 22, 127–146.
Stadtfeld, M., Apostolou, E., Akutsu, H., Fukuda, A., Follett, P., Natesan, S., Kono, T., Shioda, T., and Hochedlinger, K. (2010). Aberrant silencing of imprinted genes on chromosome 12qF1 in mouse induced pluripotent stem cells. Nature 465, 175–181.
Stadtfeld, M., Apostolou, E., Ferrari, F., Choi, J., Walsh, R.M., Chen, T., Ooi, S.S., Kim, S.Y., Bestor, T.H., Shioda, T., et al. (2012). Ascorbic acid prevents loss of Dlk1-Dio3 imprinting and facilitates generation of all-iPS cell mice from terminally differentiated B cells. Nat. Genet. 44, 398–405, S1–S2.
Starmer, J., and Magnuson, T. (2009). A new model for random X chromosome inactivation. Development 136, 1–10.
Sun, B.K., Deaton, A.M., and Lee, J.T. (2006). A transient heterochromatic state in Xist preempts X inactivation choice without RNA stabilization. Mol. Cell 21, 617–628.
Sun, B., Ito, M., Mendjan, S., Ito, Y., Brons, I.G., Murrell, A., Vallier, L., Ferguson-Smith, A.C., and Pedersen, R.A. (2012). Status of genomic imprinting in epigenetically distinct pluripotent stem cells. Stem Cells 30, 161–168.
Sunderam, S., Kissin, D.M., Flowers, L., Anderson, J.E., Folger, S.G., Jamieson, D.J., and Barfield, W.D.; Centers for Disease Control and Prevention (CDC). (2012). Assisted reproductive technology surveillance – United States, 2009. MMWR Surveill. Summ. 61, 1–23.
Surani, M.A., Barton, S.C., and Norris, M.L. (1984). Development of reconstituted mouse eggs suggests imprinting of the genome during gametogenesis. Nature 308, 548–550.
Tachibana, M., Ma, H., Sparman, M.L., Lee, H.-S., Ramsey, C.M., Woodward, J.S., Sritanaudomchai, H., Masterson, K.R., Wolff, E.E., Jia, Y., and Mitalipov, S.M. (2012). X-chromosome inactivation in monkey embryos and pluripotent stem cells. Dev. Biol. 371, 146–155.
Tada, T., Obata, Y., Tada, M., Goto, Y., Nakatsuji, N., Tan, S., Kono, T., and Takagi, N. (2000). Imprint switching for non-random X-chromosome inactivation during mouse oocyte growth. Development 127, 3101–3105.
Takagi, N., and Abe, K. (1990). Detrimental effects of two active X chromosomes on early mouse development. Development 109, 189–201.
Takagi, N., and Sasaki, M. (1975). Preferential inactivation of the paternally derived X chromosome in the extraembryonic membranes of the mouse. Nature 256, 640–642.
Tchieu, J., Kuoy, E., Chin, M.H., Trinh, H., Patterson, M., Sherman, S.P., Aimiuwu, O., Lindgren, A., Hakimian, S., Zack, J.A., et al. (2010). Female human iPSCs retain an inactive X chromosome. Cell Stem Cell 7, 329–342.
Terranova, R., Yokobayashi, S., Stadler, M.B., Otte, A.P., van Lohuizen, M., Orkin, S.H., and Peters, A.H. (2008). Polycomb group proteins Ezh2 and Rnf2 direct genomic contraction and imprinted repression in early mouse embryos. Dev. Cell 15, 668–679.
Thorvaldsen, J.L., and Bartolomei, M.S. (2007). SnapShot: imprinted gene clusters. Cell 130, 958.
Thorvaldsen, J.L., Duran, K.L., and Bartolomei, M.S. (1998). Deletion of the H19 differentially methylated domain results in loss of imprinted expression of H19 and Igf2. Genes Dev. 12, 3693–3702.
Thorvaldsen, J.L., Krapp, C., Willard, H.F., and Bartolomei, M.S. (2012). Nonrandom X chromosome inactivation is influenced by multiple regions on the murine X chromosome. Genetics 192, 1095–1107.
Tian, D., Sun, S., and Lee, J.T. (2010). The long noncoding RNA, Jpx, is a molecular switch for X chromosome inactivation. Cell 143, 390–403.
Tierling, S., Dalbert, S., Schoppenhorst, S., Tsai, C.E., Oliger, S., Ferguson-Smith, A.C., Paulsen, M., and Walter, J. (2006). High-resolution map and imprinting analysis of the Gtl2-Dnchc1 domain on mouse chromosome 12. Genomics 87, 225–235.
Tomoda, K., Takahashi, K., Leung, K., Okada, A., Narita, M., Yamada, N.A., Eilertson, K.E., Tsang, P., Baba, S., White, M.P., et al. (2012). Derivation conditions impact X-inactivation status in female human induced pluripotent stem cells. Cell Stem Cell 11, 91–99.
Turner, J.M., Mahadevaiah, S.K., Ellis, P.J., Mitchell, M.J., and Burgoyne, P.S. (2006). Pachytene asynapsis drives meiotic sex chromosome inactivation and leads to substantial postmeiotic repression in spermatids. Dev. Cell 10, 521–529.
Weksberg, R. (2010). Imprinted genes and human disease. Am. J. Med. Genet. C. Semin. Med. Genet. 154C, 317–320.
Weksberg, R., Shuman, C., and Smith, A.C. (2005). Beckwith-Wiedemann syndrome. Am. J. Med. Genet. C. Semin. Med. Genet. 137C, 12–23.
Williamson, C.M., Ball, S.T., Dawson, C., Mehta, S., Beechey, C.V., Fray, M., Teboul, L., Dear, T.N., Kelsey, G., and Peters, J. (2011). Uncoupling antisense-mediated silencing and DNA methylation in the imprinted Gnas cluster. PLoS Genet. 7, e1001347.
Wutz, A. (2011). Gene silencing in X-chromosome inactivation: advances in understanding facultative heterochromatin formation. Nat. Rev. Genet. 12, 542–553.
Wutz, A., Rasmussen, T.P., and Jaenisch, R. (2002). Chromosomal silencing and localization are mediated by different domains of Xist RNA. Nat. Genet. 30, 167–174.
Xu, N., Tsai, C.L., and Lee, J.T. (2006). Transient homologous chromosome pairing marks the onset of X inactivation. Science 311, 1149–1152.


Xu, N., Donohoe, M.E., Silva, S.S., and Lee, J.T. (2007). Evidence that homologous X-chromosome pairing requires transcription and Ctcf protein. Nat. Genet. 39, 1390–1396.
Yildirim, E., Kirby, J.E., Brown, D.E., Mercier, F.E., Sadreyev, R., Scadden, D.T., and Lee, J.T. (2013). Xist RNA is a potent suppressor of hematologic cancer in mice. Cell 152, 727–742.
Yin, Q.F., Yang, L., Zhang, Y., Xiang, J.F., Wu, Y.W., Carmichael, G.G., and Chen, L.L. (2012). Long noncoding RNAs with snoRNA ends. Mol. Cell 48, 219–230.
Zhao, J., Sun, B.K., Erwin, J.A., Song, J.J., and Lee, J.T. (2008). Polycomb proteins targeted by a short repeat RNA to the mouse X chromosome. Science 322, 750–756.
Zhao, J., Ohsumi, T.K., Kung, J.T., Ogawa, Y., Grau, D.J., Sarma, K., Song, J.J., Kingston, R.E., Borowsky, M., and Lee, J.T. (2010). Genome-wide identification of polycomb-associated RNAs by RIP-seq. Mol. Cell 40, 939–953.


Review
Epigenetics of Reprogramming to Induced Pluripotency
Bernadett Papp1,2,3,4 and Kathrin Plath1,2,3,4,*
1Department of Biological Chemistry, David Geffen School of Medicine
2Jonsson Comprehensive Cancer Center
3Bioinformatics Interdepartmental Degree Program, Molecular Biology Institute
4Eli and Edythe Broad Center of Regenerative Medicine and Stem Cell Research
University of California, Los Angeles, Los Angeles, CA 90095, USA
*Correspondence: kplath@mednet.ucla.edu
http://dx.doi.org/10.1016/j.cell.2013.02.043


Reprogramming to induced pluripotent stem cells (iPSCs) proceeds in a stepwise manner with reprogramming factor binding, transcription, and chromatin states changing during transitions. Evidence is emerging that epigenetic priming events early in the process may be critical for pluripotency induction later. Chromatin and its regulators are important controllers of reprogramming, and reprogramming factor levels, stoichiometry, and extracellular conditions influence the outcome. The rapid progress in characterizing reprogramming is benefiting applications of iPSCs and is already enabling the rational design of novel reprogramming factor cocktails. However, recent studies have also uncovered an epigenetic instability of the X chromosome in human iPSCs that warrants careful consideration.
Decades of research were dedicated to studies of cell fate changes during development and led to the view that, in vivo, differentiated cells are irreversibly committed to their fate. However, reprogramming of somatic cells by transfer into enucleated oocytes pioneered by John Gurdon and colleagues in the 1950s (Gurdon et al., 1958), fusion with other cell partners (Blau et al., 1983), and ectopic transcription factor expression (Davis et al., 1987; Takahashi and Yamanaka, 2006) revealed a remarkable plasticity of the differentiated state. Particularly the exposure to ectopic transcription factors offers a powerful and unexpectedly flexible technique to shift a somatic cell toward alternative somatic identities or pluripotency. The reprogramming field exploded after Takahashi and Yamanaka established a major landmark with the generation of induced pluripotent stem cells (iPSCs) from fibroblasts by simple ectopic expression of Oct4 (O), Sox2 (S), cMyc (M), and Klf4 (K) (Takahashi and Yamanaka, 2006). Aptly, the Nobel Prize awarded to John Gurdon and Shinya Yamanaka in 2012 symbolizes the extraordinary contribution that reprogramming experiments have made (and will make) to our understanding of cellular identity and the apparently unlimited practical applications of iPSCs and other reprogrammed cells. This Review focuses on reprogramming to iPSCs.

The beauty of transcription-factor-induced reprogramming to iPSCs lies in its simplicity and robustness, as many different cell types from a wide range of species can be reprogrammed to pluripotency by ectopic expression of OSKM (for a recent summary, see Stadtfeld and Hochedlinger [2010]). A fundamental feature of the resulting iPSCs is that they are, in their ideal state, functionally indistinguishable from embryonic stem cells (ESCs), which are pluripotent cells derived from preimplantation embryos, and are capable of differentiation into cells of all three germ layers (Bock et al., 2011; Carey et al., 2011). Consequently, reprogramming changes the transcriptome and chromatin state of the somatic cell to that of a pluripotent cell (Chin et al., 2009; Hawkins et al., 2010; Lister et al., 2011; Maherali et al., 2007; Mikkelsen et al., 2008; Okita et al., 2007; Takahashi and Yamanaka, 2006; Wernig et al., 2007). Therefore, iPSCs offer an invaluable source of patient-specific pluripotent stem cells for disease modeling, drug screening, toxicology tests, and regenerative medicine (recently reviewed in Onder and Daley [2012]; Trounson et al., 2012), and already have been employed to unmask novel insights into human diseases (Koch et al., 2011).

Despite the extraordinary fidelity of the iPSC technology, the induction of pluripotency upon OSKM expression typically requires an extended latency period of around 1–2 weeks and occurs in less than 1% of the starting cells, even when they are genetically identical and the expression levels of the four transcription factors are similar across all cells in the culture dish (for a review, see Stadtfeld and Hochedlinger [2010]). Although heterogeneity of the starting cell population and differentiation state may affect reprogramming efficiency to a certain degree (Stadtfeld and Hochedlinger, 2010), a key question has been why only a few of a pool of seemingly equivalent OSKM-expressing cells induce pluripotency. Genomic approaches, RNAi screens, and simpler genetic methods, as well as emerging single-cell analyses, are beginning to provide answers by defining critical reprogramming events as well as regulators and epigenetic properties that promote or hinder reprogramming transitions, which we will focus on in the first part of this Review. Particularly the activation of pluripotency genes appears to present a formidable task for the reprogramming factors.

Generally, transcriptional activation begins with the binding of transcription factors to distal enhancer and promoter elements, which initiates the recruitment of coactivators and facilitates the binding of the general transcription machinery and the assembly of the RNA polymerase-II-containing preinitiation complex (PIC) at the core promoter (Green, 2005). Transcription factors can also promote steps in the transcription process subsequent to PIC assembly (which is of interest for the reprogramming factor cMyc) (Green, 2005). Importantly, the packaging of DNA into nucleosomes affects all aspects of transcription, from transcription factor binding to PIC formation and transcriptional elongation (Beato and Eisfeld, 1997; Li et al., 2007). The ability of transcription factors to bind their recognition elements is further modulated by changes in chromatin structure, including DNA methylation, histone modifications, histone variants, or ATP-dependent chromatin remodeling. Chromatin therefore plays a critical role in the establishment of cell-type-specific expression patterns and is responsible for the extreme stability of a given cellular identity under physiological conditions, ensuring the stable silencing of lineage-inappropriate genes and restricting transcription factor action to only a subset of their target motifs in the genome (Filion et al., 2010; Gaetz et al., 2012). In differentiated cells, pluripotency loci therefore appear to be in an unfavorable chromatin landscape for binding by most transcription factors. However, we will discuss the remarkable capability of reprogramming factors to engage closed chromatin and induce extensive chromatin changes early in reprogramming before any major transcriptional changes take place, unmasking interesting parallels between reprogramming and developmental processes and highlighting the power of the OSKM reprogramming cocktail.
Together, these recent findings have transformed the iPSC system into a powerful model for the dissection of mechanisms underlying cell fate transitions. The reprogramming process is most scrutinized in the mouse system, but studies of the induced pluripotent state have been extensively performed for both mouse and human iPSCs. Most likely due to the fact that conventional mouse and human iPSCs represent different states of pluripotency, these cells differ epigenetically, as highlighted by their X chromosome inactivation state. In the second part of this Review, we will discuss a selection of recent studies that revealed an epigenetic instability of the inactive X chromosome in female human iPSCs, reminiscent of processes in human ESCs, and we will focus on the implications of these findings for the utility of iPSCs.

Steps Leading to the iPSC State

The development of improved reprogramming techniques that include homogeneous and inducible reprogramming factor expression systems (summarized in Stadtfeld and Hochedlinger [2010]) has enabled a more detailed view of the mechanism underlying reprogramming despite the fact that only few starting cells become iPSCs. Mouse embryonic fibroblasts are most commonly used as a starting cell type for the dissection of the reprogramming process due to the ease of culture and the possibility of derivation from different genetic backgrounds and mouse models. Current evidence argues that reprogramming of these cells to iPSCs requires cell division (Hanna et al., 2009) and is a multistep process in which the successful induction of the pluripotent state entails the transition through sequential gene expression states (or intermediates) (Figure 1). Failure to transition through any of these steps would lead to a block in reprogramming and would account for the low overall reprogramming efficiency. Consistent with this model, it was shown early on by the Jaenisch and Hochedlinger groups that reprogramming cultures represent heterogeneous cell populations that can be resolved based on the expression of cell surface markers (Brambrink et al., 2008; Stadtfeld et al., 2008). Utilizing specific surface marker combinations, cells poised to become iPSCs can be enriched at different times of reprogramming. This knowledge allowed the inference of a reprogramming path in which successfully reprogramming cells first downregulate the fibroblast-associated marker Thy1 and then transition to a state that is positive for the embryonic marker SSEA1 and, finally, induce the full pluripotency network (Brambrink et al., 2008; Polo et al., 2012; Stadtfeld et al., 2008) (Figure 1). The downregulation of Thy1 occurs in a large fraction of starting cells, the subsequent gain of SSEA1 only in a subset of Thy1-negative cells, and the induction of the pluripotency network in a small subset of SSEA1-positive cells, indicating that transitions between each of these steps occur with low probability (Figure 1). Cells that are unable to silence Thy1 relatively quickly upon OSKM expression become refractory to the action of the reprogramming factors and can yield iPSCs though with dramatic delay and at much lower efficiency (Polo et al., 2012). Accordingly, a single-cell cloning experiment demonstrated that virtually all starting cells have the potential to induce pluripotency in a small subset of their daughter cells when reprogramming is followed over a 6 month period (Hanna et al., 2009).
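The arithmetic behind the multistep model can be made concrete with a small sketch. The code below treats reprogramming as a chain of sequential transitions (Thy1 loss, SSEA1 gain, pluripotency network induction) and multiplies the per-step probabilities to obtain an overall efficiency. The three probabilities are hypothetical placeholders chosen only so that the product falls below 1%; they are not measured values, and the function names are illustrative.

```python
# Toy model: reprogramming as a chain of sequential, low-probability transitions.
# All probabilities are hypothetical placeholders, not measured values.

def overall_efficiency(step_probabilities):
    """Probability that a single cell completes every step in order."""
    efficiency = 1.0
    for p in step_probabilities:
        efficiency *= p
    return efficiency


def expected_cells(n_start, step_probabilities):
    """Expected number of cells that have passed each successive step."""
    counts = []
    remaining = float(n_start)
    for p in step_probabilities:
        remaining *= p
        counts.append(remaining)
    return counts


if __name__ == "__main__":
    # Assumed values for: Thy1 loss, SSEA1 gain, pluripotency network induction
    steps = [0.5, 0.1, 0.1]
    print("overall efficiency:", overall_efficiency(steps))
    print("expected cells after each step:", expected_cells(10000, steps))
```

Under these assumed numbers, even a first step that half of the cells complete leaves well under 1% of the starting population at the end, which is the qualitative point of the model: a block at any single transition is enough to keep the overall yield low.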
The intermediate states defined by cell sorting experiments likely represent the most favored possibilities on the path of reprogramming. Further purification of reprogramming intermediates should be feasible and provide insight into whether all reprogramming cells have to pass through the same stages to induce pluripotency. Of interest, SSEA1-positive intermediate cells are still plastic early in reprogramming in that some of these cells can regress to the Thy1-positive (i.e., an earlier) reprogramming state in the presence of reprogramming factor expression. By contrast, later in reprogramming, these cells appear to have matured and become much more committed to progressing to the pluripotent state (Polo et al., 2012), indicating that cellular identity is only stabilized and locked in toward the end of the reprogramming process.

Genome-wide transcriptional profiling was used to further delineate the sequence of events that drive reprogramming. Initially, cells appear to respond relatively homogeneously to the expression of the reprogramming factors (Polo et al., 2012) and robustly silence typical mesenchymal genes expressed in fibroblasts (such as Snai1, Snai2, Zeb1, and Zeb2) (Li et al., 2010; Mikkelsen et al., 2008; Polo et al., 2012; Samavarchi-Tehrani et al., 2010). These events lead to the activation of epithelial markers (such as Cdh1, Epcam, and Ocln) in a process called mesenchymal-to-epithelial transition (MET), which seems critical for the early reprogramming phase and is accompanied by morphological changes, increased proliferation, and the formation of cell clusters (Li et al., 2010; Mikkelsen et al., 2008; Samavarchi-Tehrani et al., 2010; Smith et al., 2010). Notably, the aforementioned transition to the SSEA1-positive state appears to correlate with the occurrence of MET (Polo et al., 2012; Samavarchi-Tehrani et al., 2010) (Figure 1).

The key characteristic of the subsequent reprogramming phase is the gradual activation of pluripotency-associated genes (Brambrink et al., 2008; Buganim et al., 2012; Golipour et al., 2012; Mikkelsen et al., 2008; Polo et al., 2012; Samavarchi-Tehrani et al., 2010; Sridharan et al., 2009; Stadtfeld et al., 2008). For example, the pluripotency loci Nanog and Sall4 are transcriptionally upregulated at a late intermediate stage, whereas others, such as Utf1 or endogenous Sox2, are induced even later, closely mirroring the acquisition of the full pluripotency expression program (Figure 1). Although detailed time course studies describing these transitions in reprogramming cells still need to be performed at the single-cell level, a recent single-cell expression study that compared the expression of candidate genes at various reprogramming stages strongly supports a series of consecutive pluripotency gene activation steps late in the reprogramming process (Buganim et al., 2012). Together, these events culminate in the establishment of the pluripotent state that can be sustained independently of ectopic reprogramming factor expression (Brambrink et al., 2008; Maherali et al., 2007; Okita et al., 2007; Stadtfeld et al., 2008; Wernig et al., 2007).

Figure 1. The Generation of iPSCs Is a Multistep Process that Can Be Modulated by Extracellular Cues and Reprogramming Factor Levels
Known events occurring in early, middle, and late phases during the OSKM-mediated reprogramming of mouse embryonic fibroblasts to iPSCs are depicted. During the final emergence of fully reprogrammed iPSCs, so-called reprogramming-competent cells appear to be inhibited by the continued expression of the factors. The reprogramming process can be preferentially trapped in partially reprogrammed states when certain reprogramming factor levels and/or stoichiometries are employed (top) or can be redirected to a different cell identity, without going through the pluripotent state, by changing culture/growth factor conditions and timing of OSKM expression (bottom).

Modifying the Reprogramming Process

Early studies employing inducible reprogramming factor expression systems indicated that reprogramming intermediates are dependent on continued OSKM expression to complete the reprogramming process (Brambrink et al., 2008; Stadtfeld et al., 2008). In addition, evidence is growing that the efficiency of reprogramming is strongly influenced by the levels of the reprogramming factors. For example, fibroblasts engineered to express a higher dose of OSKM in all cells have a dramatically enhanced ability to induce pluripotency (Polo et al., 2012). A peculiar observation is that cells that become refractory to reprogramming early on (and stay Thy1 positive) have dramatically reduced protein levels of the four reprogramming factors compared to cells that are able to progress toward pluripotency (Polo et al., 2012). Because the RNA levels of the reprogramming factors are similar between these two cell populations, these transcription factors may be prone to increased ubiquitination and degradation specifically in refractory cells (Buckley et al., 2012; Polo et al., 2012). Furthermore, the inability to sustain high reprogramming factor expression contributes strongly to the reprogramming block in refractory cells, as a further increase in OSKM expression specifically in these cells induces them to convert to the next reprogramming stage and subsequently to iPSCs more efficiently (Polo et al., 2012).

Although continuity of reprogramming factor expression is essential for driving somatic cells toward pluripotency, a recent study pointed out that high levels of ectopic OSKM during the final reprogramming steps may be inhibitory to the efficient induction of the full pluripotency network (Golipour et al., 2012) (Figure 1). This finding is consistent with the observations that retrovirally expressed reprogramming factors are efficiently turned off in faithfully reprogrammed cells (Maherali et al., 2007; Okita et al., 2007; Wernig et al., 2007) and that the activation of endogenous pluripotency regulators during reprogramming coincides with transgene independence (Stadtfeld et al., 2008). The reduction of ectopic reprogramming factors at the end of reprogramming may be necessary because even a modest increase in Oct4 levels in ESCs is detrimental to the pluripotent state (Niwa et al., 2000).

Not just overall levels and timing, but also the specific balance of the reprogramming factors relative to each other are critical for the outcome of reprogramming (Figure 1). For example, many studies agree that high Oct4 levels and low levels of Sox2 increase the efficiency of reprogramming (Nagamatsu et al., 2012; Tiemann et al., 2011; Yamaguchi et al., 2011). High Sox2 levels have been associated with the stronger induction of developmental markers during reprogramming, which may guide cells away from the path to pluripotency (Yamaguchi et al., 2011). Moreover, even though ectopic expression of cMyc enhances reprogramming, it also leads to emergence of a large fraction of partially reprogrammed ESC-like colonies trapped before the upregulation of the pluripotency program (Nakagawa et al., 2008; Wernig et al., 2008). Remarkably, differences in reprogramming factor stoichiometry appear to have consequences for the epigenetic state and developmental potential of the resulting iPSCs (Carey et al., 2011).
This is an interesting result in light of the ongoing debate on epigenetic differences between iPSCs and ESCs (for a recent discussion, see Lowry [2012]) and suggests that at least some (and maybe all) of the observed variations between iPSCs and ESCs are not inherent to the reprogramming process but are due to experimental variables that often are not easy to control, highlighting how a better understanding of the mechanisms underlying reprogramming will benefit the production of safer iPSC lines.

The efficiency of iPSC formation can also be improved by altering media composition and growth factor conditions (Chen et al., 2011; Esteban et al., 2010; Ichida et al., 2009; Li et al., 2010; Samavarchi-Tehrani et al., 2010). Though it is likely that downstream effectors of signaling pathways directly alter the transcriptional output of their target genes, specific culture conditions can also modulate the activity and levels of chromatin regulators, thereby indirectly affecting OSKM functionality (Chen et al., 2013; Marks et al., 2012; Wang et al., 2011a; Zhu et al., 2013). To mention just one example, vitamin C (ascorbic acid) addition to the media increases reprogramming efficiency and potentially the quality of resulting iPSCs at least in part by influencing the functionality of histone demethylases that depend on iron (Esteban et al., 2010; Stadtfeld et al., 2012; Wang et al., 2011a). Notably, by supplementing OSKM-reprogramming cultures with a growth factor cocktail normally required for the establishment and maintenance of epiblast stem cells (EpiSCs), mouse fibroblasts can be reprogrammed to an EpiSC-like state instead of the ESC-like iPSC state (Han et al., 2011) (Figure 1). Mouse EpiSCs and ESCs capture two different states of pluripotency, which will be discussed in greater detail in the second part of this Review.

During the last couple of years, it has also become clear that OSKM (or a subset of these factors) can even prompt the establishment of various somatic cell fates, including cardiomyocytes, blood progenitors, and neural stem cells, when overexpressed temporally and guided by appropriate extracellular cues, without the transition through the pluripotent state (Figure 1) (reviewed in Sancho-Martinez et al. [2012]). The induction of various developmental regulators at intermediate stages of reprogramming to pluripotency may explain why OSKM can efficiently redirect the reprogramming path to other cell identities upon exposure to suitable signaling cues and likely reflects a function of Sox2 and Klf4 as critical regulators of various differentiation paths during development (Mikkelsen et al., 2008; Polo et al., 2012; Sridharan et al., 2009). Alternatively, and not mutually exclusive, reprogramming intermediates arising due to OSKM expression may represent normally occurring developmental progenitor states. Though the picture is emerging that signaling cues affect the cell fate choices made during reprogramming and/or lead to the stabilization of particular cell identities that arise during the process, still relatively little is known about the exact role of signaling pathways and their downstream regulators in reprogramming and the intersection with the reprogramming factors. Comparing the molecular dynamics of OSKM-dependent induction of pluripotency and alternative cell fates should demonstrate how cell fate decision processes can be efficiently modulated and will facilitate the development of patient-specific somatic cell populations for clinical applications.
Defining the Target Repertoire of the Reprogramming Factors

One approach toward a better understanding of the cascade of molecular events underlying the establishment of pluripotency is the definition of reprogramming factor targets at different stages of the reprogramming process. It is generally believed that three of the four reprogramming factors, Oct4, Sox2, and Klf4, are necessary for the induction of pluripotency because they are critical components of an intrinsic and highly stable pluripotency network (Boyer et al., 2005; Chen et al., 2008; Jiang et al., 2008; Kim et al., 2008; Loh et al., 2006; Sridharan et al., 2009). Oct4, Sox2, and Klf4 tend to colocalize at many cell-type-specific enhancers in ESCs, often together with additional pluripotency transcription factors like Nanog, Esrrb, Klf2, Sall4, and Zfp42 and signaling pathway regulators such as Smad1 and Stat3 (Chen et al., 2008; Kim et al., 2008), reinforcing the importance of OSK for the pluripotent state and the view that enhancers are sentinels of cell-type-specific gene expression patterns (Visel et al., 2009). The integration of numerous pluripotency transcription factors and signaling cues at these enhancers ensures the expression of many genes with known roles in pluripotency and provides stability to the ESC gene expression program. Another important aspect of the pluripotency network is that many pluripotency transcription factors constitute a transcriptional circuit wired in a feed-forward type of regulation, as they induce their own expression and positively regulate each other (Boyer et al., 2005; Chen et al., 2008; Jiang et al., 2008; Kim et al., 2008) (Figure 2A).

Figure 2. Features of OSKM in ESCs and during Reprogramming


(A) In ESCs, Oct4, Sox2, and Klf4 bind their own and each other's promoters and enhancers, as well as those of many additional ESC-specific (pluripotency) genes. Further contributing to the pluripotency circuitry, many of these ESC-specific genes are also bound by various additional pluripotency regulators, including Nanog and Esrrb, such that ESC-specific enhancers represent hot spots of pluripotency transcription factor binding. (B) In ESCs (and many other cell types), cMyc targets most actively transcribed genes at the core promoter by binding high-affinity E box sequences and functions by enhancing transcriptional elongation. Expression levels correlate with cMyc occupancy. Upon overexpression, cMyc does not appear to regulate new target genes but amplifies the existing gene expression pattern by binding the same genes at elevated levels and occupying additional, low-affinity E-box-like sequences in both the core promoter and enhancer regions of these genes. (C) Scheme illustrating different contributions of the reprogramming factors to the late phase of reprogramming, highlighting separable engagement of OSK and cMyc. Many genes occupied by cMyc in ESCs/iPSCs are already bound by this transcription factor and are expressed in partially reprogrammed cells, which represent a clonal, late reprogramming intermediate. By contrast, OSK bind the promoter regions of many of their ESC/iPSC-specific target genes only late in reprogramming, accompanying their transcriptional upregulation. This is particularly obvious for those genes that are cobound by OSK in their promoter region in ESCs. (D) Chromatin can affect the ability of transcription factors to bind to their DNA motifs, which is thought to explain why most transcription factors bind to only a small subset of their recognition motifs in the genome. Here, we summarize the chromatin preferences of the four reprogramming factors early in reprogramming.

By contrast, cMyc is unique among the reprogramming factors, as it is neither a component of the core pluripotency network (Chen et al., 2008; Kim et al., 2010) nor absolutely necessary for reprogramming to iPSCs (Nakagawa et al., 2008; Wernig et al., 2008). Indeed, cMyc is a central player in many diverse biological processes, including cell growth and differentiation. Two recent reports (Lin et al., 2012; Nie et al., 2012) strongly support a model in which cMyc is not a transcription factor that is responsible for OFF/ON switches of its target genes as proposed for OSK. Instead, cMyc is a nonlinear amplifier of transcriptional outputs that acts universally on active genes containing the E box DNA motif. Mechanistically, cMyc promotes transcription by regulating RNA polymerase II pause-release and by increasing the rate of transcriptional elongation (Rahl et al., 2010). Therefore, cMyc occupies the core promoter regions of many active genes in ESCs/iPSCs and is typically not present at enhancers (Chen et al., 2008; Kim et al., 2010; Nie et al., 2012; Soufi et al., 2012; Sridharan et al., 2009) (Figure 2B). Analysis of cMyc binding across different inducible expression levels in tumor cells demonstrated that cMyc predominantly binds high-affinity E box sites at core promoters of almost all active genes when expressed at low levels but spills over to weaker E box sites within enhancers of the same active genes upon higher expression, likely because promoter sites become saturated (Figure 2B) (Lin et al., 2012; Nie et al., 2012). Thus, the target repertoire of cMyc does not change when cMyc is strongly expressed, but transcriptional output is increased. The significant differences between OSK and cMyc have important implications for the reprogramming process. Oct4, Sox2, and Klf4 are probably crucial for specifying cell fate change in reprogramming, whereas cMyc may simply act by amplifying arising expression changes due to OSK action at genes that contain E boxes, potentially helping to trap genes in the ON state.

The low efficiency of reprogramming makes the application of genome-wide analysis techniques of reprogramming factor binding, such as chromatin immunoprecipitation combined with massive parallel sequencing (ChIP-seq), challenging for cells at intermediate stages of the reprogramming process. To circumvent this problem, our lab initially mapped reprogramming factor binding within promoter regions in iPSCs and in partially reprogrammed cells (which represent a clonal, trapped late reprogramming intermediate expanded from ESC-like colonies that arise in reprogramming cultures and fail to express pluripotency regulators) and compared occupancy data with gene expression patterns (Sridharan et al., 2009). In both cell types, genes co-occupied by the reprogramming factors are highly expressed, indicating that an intrinsic property of reprogramming factor cobinding is to activate genes. Interestingly, genes that are more highly expressed in partially reprogrammed cells than in ESCs are often more efficiently targeted by the OSKM factors in the intermediate state than in ESCs, whereas genes more highly expressed in ESCs are generally less bound in partially reprogrammed cells than in ESCs. Thus, many genes are more strongly expressed in partially reprogrammed cells compared to ESCs due to targeting of the four factors to promoter regions that they do not normally bind in ESCs, and conversely, the failure to activate ESC-specific genes appears to result from the inability of the factors to bind these genes in the intermediate state. These findings are consistent with the reprogramming factors being directly responsible for the ectopic expression of developmental genes in reprogramming intermediates, which is known to hinder reprogramming (Mikkelsen et al., 2008).
Notably, the widespread lack of ESC-specific promoter binding in partially reprogrammed cells impinges more dramatically on Oct4, Sox2, and Klf4 than on cMyc and particularly affects many pluripotency-related genes that are co-occupied by combinations of Oct4, Sox2, and Klf4 in ESCs (Figure 2C). In the case of these genes, it appears that the OSK promoter engagement occurs only toward the very end of the reprogramming process and is likely required for their transcriptional activation (Figure 2C). These findings not only demonstrate a separable contribution of cMyc and OSK to the activation of various pluripotency loci and a change in the reprogramming factor target repertoire during the reprogramming process, but also indicate that the promoter engagement of key pluripotency genes is a critical task for reprogramming.

Recently, Zaret and colleagues obtained a picture of the initial chromatin engagement of the reprogramming factors by performing ChIP-seq 48 hr after the induction of reprogramming factor expression in human fibroblasts (Soufi et al., 2012), when most cells still undergo very similar expression changes (see above) (Polo et al., 2012). Comparing OSKM-binding patterns between the early reprogramming stage and the pluripotent state, Zaret and colleagues made two interesting observations (Soufi et al., 2012). First, many more genes are bound by all four factors early in reprogramming than in the pluripotent state, which could be due to the high expression levels of the induced factors. In addition, OSKM binding of apoptosis-regulating genes early in the process suggests that the extensive cell death apparent in reprogramming cultures (reviewed in Plath and Lowry [2011]) is a direct consequence of reprogramming factor binding, potentially representing a general cellular defense mechanism against ectopic transcription factor expression (Soufi et al., 2012). Furthermore, initial target genes of the reprogramming factors are significantly enriched for regulators of MET, the critical early reprogramming event discussed above, whereas pluripotency loci such as NANOG and DPPA4 are not yet bound, corroborating that a redistribution of OSKM binding occurs as cells move along the reprogramming path and suggesting that, initially, the reprogramming factors directly target at least some of the genes that transcriptionally change early in the process. The second and more surprising finding is that the reprogramming factors interact extensively with distal genomic sites, including some known enhancers. Indeed, 85% of all initial binding events occur distal to promoter regions (Soufi et al., 2012). Because it appears that, in the pluripotent state, the transcription factors have shifted to a binding pattern that includes promoter regions much more strongly, Zaret and colleagues proposed that the binding of the reprogramming factors to distal elements is an early step in reprogramming that precedes promoter binding and transcriptional activation of many target genes (Soufi et al., 2012).

Reprogramming Factors as Pioneers

The next question then is which features anticipate the recruitment of ectopically expressed OSKM? The DNA motifs of the four factors are enriched at their respective binding sites, indicating that they are recruited directly through their sequence motifs rather than randomly targeting or scanning the genome (Soufi et al., 2012; Sridharan et al., 2009).
However, transcription factors work in a concentration-dependent manner and will, at higher concentration, also occupy DNA sites of lower affinity, which may be important for reprogramming, during which very high levels of ectopic OSKM are expressed (Lin et al., 2012; Nie et al., 2012; Soufi et al., 2012) (Figure 2B). Notably, lineage specification factors present in the starting cell type may contribute to the targeting of the reprogramming factors to a subset of their DNA motifs. For example, during lineage development, Sox transcription factors often occupy sites premarked by other Sox proteins that were expressed in the previous developmental stage (Bergsland et al., 2011). If such lineage-specific factors are involved in the initial targeting of the reprogramming factors, one might predict that reprogramming factors will target different genomic locations in different starting cell types. Importantly, chromatin is thought to strongly affect the ability of transcription factors to bind their cognate DNA motifs, and certain chromatin states, characterized for example by the presence of specific combinations of histone modifications, may be especially conducive to DNA binding by specific transcription factors (Filion et al., 2010). As expected, binding of the reprogramming factors does occur in open and accessible chromatin, marked by active histone modifications such as H3K4 methylation (Soufi et al., 2012; Sridharan et al., 2009) (Figure 2D). Among the reprogramming factors, cMYC binding is much more strictly associated with a pre-existing active chromatin state than that of OSK (Soufi et al., 2012; Sridharan et al., 2009), which is consistent with active chromatin being a prerequisite for the binding of cMyc (Guccione et al., 2006) (Figure 2D).

An astonishing observation by Zaret and colleagues is that the vast majority (around 70%) of reprogramming factor binding events early in human fibroblast reprogramming occurs within genomic regions that display a closed chromatin state in the starting fibroblasts characterized by the absence of DNase hypersensitivity and, surprisingly, any histone modifications (Soufi et al., 2012). Thus, the reprogramming factors can efficiently access their target sequences within genomic regions that are packed with nucleosomes and are probably even further condensed into higher-order structures. This is particularly true for OSK and, to a much lesser extent, for cMYC (Soufi et al., 2012) (Figure 2D). Indeed, the ability of cMYC to access target sites in closed chromatin is dependent on OSK occupancy (Soufi et al., 2012). OSK can occupy OSKM cobound sites in the absence of ectopic cMYC, but cMYC cannot bind when overexpressed in the absence of ectopic OSK. In turn, ectopic cMYC enhances the initial binding of OSK to these sites when expressed together. These data are in agreement with cMyc potentiating the action of the other three reprogramming factors rather than initiating these events. In comparison to naked DNA, nucleosomal DNA is less accessible for DNA-binding factors (Beato and Eisfeld, 1997), and the majority of transcription factors cannot bind their cognate sites when sequestered within a nucleosome and need a structural change in the associated nucleosome or a nucleosome-free region for binding (Wallrath et al., 1994), highlighting an important functionality of OSK. Cooperative binding or simultaneous engagement of neighboring binding sites could explain the ability of OSK to interact with nucleosomal-binding sites (Adams and Workman, 1995).
For instance, binding of one factor might partially destabilize a nucleosome, allowing the other transcription factor(s) to access sites that were previously buried. However, each of the OSK-reprogramming factors alone can also target sites in closed chromatin, i.e., without the other two factors being detected at those sites (Soufi et al., 2012). Therefore, Zaret and colleagues proposed that Oct4, Sox2, and Klf4 each can act as pioneer factors that are able to access closed chromatin on their own without the help of additional transcription factors (Soufi et al., 2012). There is additional evidence in support of this idea. First, based on three-dimensional (3D) structures, Oct4, Sox2, and Klf4, but not cMyc, interact with one side of the DNA helix when bound to DNA, potentially allowing them to bind DNA in the context of the nucleosome (Beato and Eisfeld, 1997; Soufi et al., 2012). Second, a comparison of nucleosome occupancy with binding of Oct4 and Sox2 in ESCs genome-wide suggests that Oct4 and Sox2 can, at least in part, interact with nucleosomal DNA (Teif et al., 2012). Third, Sp1, a transcription factor belonging to the same family of highly related transcription factors as Klf4, can bind nucleosomal DNA in vitro, making it reasonable to anticipate that Klf4 will share Sp1's capacity (Li et al., 1994). Fourth, it was found that preexisting nucleosomes at the enhancer and promoter regions of the OCT4 and NANOG gene loci are displaced when OCT4 is ectopically expressed in differentiated cells (i.e., in the absence of any other reprogramming factors) (You et al., 2011). This chromatin reorganization coincided with Oct4 binding, suggesting that Oct4 is able to directly access DNA sites that are internal to a nucleosome and establish a nucleosome-depleted region (You et al., 2011).

The idea of OSK acting as pioneer factors in reprogramming is exciting because it is reminiscent of developmental decisions, wherein pioneer factor binding at enhancers occurs early (Gualdi et al., 1996). The efficient activation of lineage-specific genes during development often requires a cascade of DNA-transcription factor interactions and chromatin changes at their enhancer and promoter regions, which begin long before these genes are transcribed (Zaret and Carroll, 2011). Pioneer transcription factors initiate this series of events by accessing tissue-specific enhancers already at a very early developmental stage and by inducing chromatin decondensation, remodeling, and/or a change in local chromatin modifications, thereby priming enhancer and promoter regions for binding by additional transcription factors and transcriptional activation at a later stage of development. Thus, pioneer factors are initiator factors that make regulatory regions competent for activation in response to the right stimulus. In the context of reprogramming, the binding of OSK to closed chromatin early in reprogramming could therefore be a crucial step for events that happen later in the process, particularly considering that some of these distal binding events overlap with known enhancers. One may speculate that Oct4, Sox2, and Klf4 can engage at least some ESC-specific enhancers early in reprogramming even though they are locked up in closed chromatin in the starting fibroblasts, poising them for promoter binding and transcriptional activation later in the process. In the next section, we will provide additional evidence in support of such epigenetic priming by focusing on chromatin changes that occur early in the reprogramming process.
Chromatin Changes in Promoters and Enhancers Early in Reprogramming

An analysis of the initial transcriptional and chromatin changes early in mouse cell reprogramming (i.e., 24–72 hr after induction of the reprogramming factors) revealed striking parallels to the initial reprogramming factor binding pattern (Koche et al., 2011). First, gene expression changes, both up and down, are largely confined to genes with promoter regions carrying active chromatin marks in the starting fibroblasts (i.e., in regions marked by enrichment of H3K4me3, a modification associated with the transcriptional start sites of active and poised genes) (Koche et al., 2011). The restriction of expression changes to genes that are already in an open and accessible chromatin configuration is consistent with the fact that the perturbation of the somatic gene expression program is the major response early in the reprogramming process (Koche et al., 2011; Mikkelsen et al., 2008; Polo et al., 2012; Samavarchi-Tehrani et al., 2010; Sridharan et al., 2009). Unexpectedly, changes in histone modifications are much more widespread than initial changes in gene expression, indicating that an extensive genome-wide chromatin remodeling takes place as an immediate response to reprogramming factor expression (Koche et al., 2011). In addition to chromatin changes associated with gene expression switches, H3K4me2 (a histone mark associated with active or poised promoters and enhancers) rapidly emerges de novo in many promoter regions in the absence of transcriptional changes and even before any cell division has taken place (Figure 3). Many of these promoters belong to genes that are transcriptionally activated later in reprogramming, including various pluripotency regulators like Sall4, Pecam1, FoxD3, and Lin28. The gain of H3K4me2 is not accompanied by simultaneous accumulation of the H3K4me3 mark and often occurs on a nucleosome that covers the transcriptional start site. Because nucleosomes at transcriptional start sites are incompatible with the assembly of the basic transcriptional machinery (Lorch et al., 1987), nucleosome depletion must be one of the subsequent steps that allows transactivation of these genes later in reprogramming. Interestingly, promoters with H3K4me2 gain early in reprogramming often display a high CpG density and are enriched for CpG islands (Koche et al., 2011) (Figure 3), which may obviate the need for extensive chromatin remodeling and therefore facilitate quick changes in chromatin structure due to lower nucleosome occupancy (Ramirez-Carrozzi et al., 2009).

Figure 3. Chromatin Dynamics during Reprogramming

Many fibroblast-specific promoters and enhancers are decommissioned early in reprogramming (after 24–48 hr of reprogramming factor expression) by loss of active H3K4 methylation marks but appear to gain DNA methylation only late in reprogramming. ESC-specific enhancers and promoters can be divided into at least two groups: those with dramatic changes in histone modifications already early in reprogramming, long before their transcriptional activation, and those that undergo histone modification changes only much later in the process. One key difference between these groups appears to be the DNA methylation state. For example, the first group includes many pluripotency genes with CpG-dense promoter elements (indicated by higher density of circles) that are hypomethylated in fibroblasts.

Compared to promoters, chromatin changes at enhancers are even more prominent early in reprogramming (Koche et al., 2011), which is consistent with the observations that many enhancers are active in only a single cell type and that the chromatin state of enhancers is more variable across cell types than that of promoters (Heintzman et al., 2009). The systematic mapping of enhancers is now possible genome-wide because specific enhancer-associated chromatin signatures have been identified that even reveal the activity of the enhancer (Creyghton et al., 2010; Heintzman et al., 2009; Koche et al., 2011; Rada-Iglesias et al., 2011). In the active state (i.e., when associated with an actively transcribed gene), enhancer elements are demarcated by domains of H3K27ac and H3K4me1/me2, but not H3K4me3. In association with inactive genes, enhancers can be in one of two states: unmarked (i.e., inactive), lacking all of the features that are associated with the active enhancer state, or poised, carrying H3K4me1/me2 in the absence of H3K27ac. It is thought that poised enhancers are important for the plasticity of developmental decisions, as a subset can acquire the signature of active enhancers upon change in external stimuli. The specific enhancer state therefore appears to strongly influence the ability of the cell to respond to environmental or developmental stimuli. For example, immediate transcriptional changes to a new signaling cue are often restricted to genes with active and/or poised enhancers, whereas inactive genes with unmarked (inactive) enhancers remain refractory (Ghisletti et al., 2010; Heintzman et al., 2009). In reprogramming, switches in enhancer states occur very rapidly and extensively, even before the first cell division, highlighting an extremely quick departure from the somatic cell identity (Koche et al., 2011). These changes go in both directions: more than 60% of fibroblast-specific enhancers are decommissioned, and at least 1,000 ESC-specific enhancers are established de novo within the first 24 hr of reprogramming factor expression, based on loss or gain of H3K4me1/2, respectively (Figure 3). Although H3K4me1/2 on its own does not allow one to distinguish between active and poised enhancer states, it is likely that many of the newly marked ESC-specific enhancers are in a poised state that will be activated at later stages of reprogramming. Thus, extensive chromatin remodeling at ESC-specific promoters and enhancers precedes the transcriptional activation of many pluripotency genes. Together, these chromatin dynamics are likely crucial for the shutdown of the somatic expression program and the transition toward pluripotency.

During differentiation, pluripotency genes acquire a silent state that is associated with a repressive chromatin environment that can include DNA methylation, histone variants, covalent histone modifications, chromatin regulatory proteins, and occupancy of regulatory regions by nucleosomes (Feldman et al., 2006; Mikkelsen et al., 2008; You et al., 2011). To activate pluripotency genes, it seems that the reprogramming factors must surmount at least two separable obstacles: the binding block at upstream regulatory regions (i.e., distal enhancer and promoter elements) and a block in the transactivation of the core promoter, which prevents the assembly and activation of the RNA polymerase-II-containing basal transcription machinery. Therefore, it may not be too surprising that the activation of pluripotency genes in reprogramming is relatively slow and potentially requires a cascade of events. The findings described above suggest that the formation of poised ESC-specific enhancers early in reprogramming may be a critical first step to orchestrate the productive engagement of the core promoter and transcriptional activation of ESC-specific genes later in the process when proper signals are available (Taberlay et al., 2011).
This likely requires further chromatin remodeling and/or additional transcriptional and signaling regulators that are unavailable early in reprogramming (for more discussion, see the transition section below). Importantly, this epigenetic priming does not affect all pluripotency genes early on, as many only gain an active/poised chromatin signature at their enhancer and promoter regions late in the process (Polo et al., 2012; Sridharan et al., 2009) (Figure 3) (see below). Understanding the regulation of enhancer/promoter pairs of pluripotency genes during reprogramming will be an important task for the future and will increase our general knowledge about the dynamics of promoter and enhancer interactions (Taberlay et al., 2011).

Relating the extensive binding of OSK to distal sites in unmarked, closed chromatin early in human cell reprogramming (Soufi et al., 2012) to the epigenetic priming of many ESC-specific enhancers early in mouse reprogramming (Koche et al., 2011) implies that the reprogramming factors may cause at least some of these initial epigenetic priming events directly. To test this hypothesis, simultaneous analysis of transcription factor binding, chromatin, and transcription states is required, and detailed studies both in vitro and in vivo need to address whether Oct4, Sox2, and Klf4 can indeed bind regulatory DNA sites packaged in nucleosomes and change chromatin structure. The ability of the reprogramming factors to engage regulatory genomic elements in closed (silent) chromatin may be a critical feature and may explain why OSK are such potent inducers of pluripotency and are effective in many different somatic cell types.
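
The enhancer states used throughout this section (active, poised, and unmarked) amount to a simple decision rule on histone marks. As a minimal sketch only, the rule could be written as below; the `EnhancerMarks` type, the boolean presence/absence calls, and the "unclassified" label for mark combinations not described in the text are illustrative assumptions and do not correspond to the analysis pipeline of any of the cited studies.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EnhancerMarks:
    """Hypothetical presence/absence calls for histone marks at one candidate enhancer."""
    h3k4me1_or_me2: bool   # H3K4me1 and/or H3K4me2 enrichment
    h3k27ac: bool          # H3K27ac enrichment
    h3k4me3: bool = False  # H3K4me3 is not part of the enhancer signature


def classify_enhancer(marks: EnhancerMarks) -> str:
    """Assign an enhancer state following the definitions given in the text."""
    if marks.h3k4me3:
        # Assumption: regions carrying H3K4me3 fall outside the enhancer scheme.
        return "unclassified"
    if marks.h3k4me1_or_me2 and marks.h3k27ac:
        return "active"      # H3K27ac plus H3K4me1/me2, but not H3K4me3
    if marks.h3k4me1_or_me2:
        return "poised"      # H3K4me1/me2 in the absence of H3K27ac
    if not marks.h3k27ac:
        return "unmarked"    # lacks all features of the active state
    # H3K27ac without H3K4me1/me2 is not described in the text.
    return "unclassified"


if __name__ == "__main__":
    for marks in (
        EnhancerMarks(h3k4me1_or_me2=True, h3k27ac=True),
        EnhancerMarks(h3k4me1_or_me2=True, h3k27ac=False),
        EnhancerMarks(h3k4me1_or_me2=False, h3k27ac=False),
    ):
        print(marks, "->", classify_enhancer(marks))
```

Because H3K4me1/2 alone cannot separate active from poised enhancers, a rule of this kind needs the H3K27ac call to resolve the newly marked ESC-specific enhancers discussed above.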

DNA Methylation and H3K9 Methylation Influence Reprogramming Factor Binding

Given that OSK appear to be able to efficiently engage closed chromatin regions already early in reprogramming, it may be surprising that many regulatory regions bound by OSKM in the pluripotent state are not occupied early in the process (Soufi et al., 2012; Sridharan et al., 2009). What then are the impediments to reprogramming factor binding and action? DNA methylation has arisen as an important factor in restricting early reprogramming events. ESC-specific promoters and enhancers that gain active chromatin modifications only late in reprogramming tend to be hypermethylated in the starting fibroblasts and become demethylated only late in reprogramming (Koche et al., 2011) (Figure 3). For example, hypermethylation of key pluripotency gene promoters, including those of Nanog and Oct4, is observed until late in reprogramming (Mikkelsen et al., 2008; Polo et al., 2012), suggesting that demethylation of these promoters is a rate-limiting step. By contrast, promoters and enhancers that already gain active chromatin marks (H3K4me2) early in reprogramming exhibit hypomethylation throughout the entire reprogramming process (Koche et al., 2011) (Figure 3). Thus, DNA methylation appears to limit where histone modification changes can occur. Furthermore, Oct4 expression can establish a nucleosome-depleted region at the distal enhancers of OCT4 and at the proximal promoter of NANOG in somatic cells but only if these regions are unmethylated (You et al., 2011), indicating that DNA methylation can prevent the recruitment of the reprogramming factors (Figure 3D). In the case of Oct4, DNA methylation must affect binding indirectly, as its DNA motif does not contain a CpG. Jones and colleagues proposed that DNA methylation in flanking sequences may stabilize the nucleosome and prevent binding (You et al., 2011).
Similarly, binding of cMyc is inhibited by CpG methylation within its CACGTG target site (Prendergast and Ziff, 1991). However, the binding of other transcription factors, such as the Klf4-related transcription factor SP1, is not affected by DNA methylation (Harrington et al., 1988), suggesting that the reprogramming factors may be differentially affected by DNA methylation. Importantly, DNA methylation is functionally recognized as a feature that limits reprogramming to pluripotency because interference with Dnmt1, the enzyme responsible for the maintenance of DNA methylation (Mikkelsen et al., 2008), promotes iPSC formation (Table 1). Interestingly, somatic enhancers that are inactivated quickly upon reprogramming factor expression and are typically methylated in the pluripotent state only gain hypermethylation later in the reprogramming process (Koche et al., 2011) (Figure 3). Thus, both the methylation of somatic genes and the demethylation of some critical pluripotency genes appear to occur only late in reprogramming, establishing the DNA methylation pattern characteristic of the pluripotent state, which is in contrast to the more gradual changes in histone modifications and transcriptional states throughout reprogramming (Koche et al., 2011; Polo et al., 2012). This may explain, at least in part, why reprogramming intermediates are unstable when the reprogramming factors are withdrawn, as DNA methylation may be required to permanently lock in a gene expression pattern and cell identity (Koche et al., 2011). However, it needs to be noted that reprogramming occurs normally even upon the genetic ablation of the de novo DNA methyltransferases Dnmt3a and Dnmt3b, indicating that the gain of DNA methylation in somatic promoters and enhancers may not be essential (Pawlak and Jaenisch, 2011) (Table 1). In any case, it will be interesting to elucidate the mechanisms underlying these bidirectional changes of DNA methylation late in the reprogramming process.

In addition to DNA methylation, other repressive chromatin marks affect the ability of the reprogramming factors to engage their target sites. Indeed, Zaret and colleagues uncovered hundreds of large regions of megabase scale that exclude reprogramming factor binding early in human cell reprogramming even though the same regions are bound extensively by the factors in ESCs (Soufi et al., 2012). Although gene-poor, these regions contain various well-known pluripotency genes such as NANOG, SOX2, and PRDM14, and almost perfectly overlap with regions of extended H3K9me3 in the starting fibroblasts that are in close contact with the nuclear lamina (Soufi et al., 2012). Importantly, during reprogramming, these broad H3K9me3 domains are erased, consistent with their absence in human ESCs (Hawkins et al., 2010; Soufi et al., 2012; Zhu et al., 2013), raising the possibility that the lack of OSKM binding in these large contiguous genomic regions early in reprogramming could be caused by the presence of H3K9me3. There is currently some debate as to whether the H3K9me3 domains arise during lineage specification or are triggered in differentiated cells in response to specific culture conditions in vitro (Hawkins et al., 2010; Zhu et al., 2013). Regardless, the H3K9 methyltransferase SUV39H1 is required for the maintenance of these H3K9me3 domains, and inhibition of TGFβ signaling lowers the H3K9me3 domain signal (Soufi et al., 2012; Zhu et al., 2013).
Notably, both the suppression of SUV39H1 and the inhibition of TGFβ signaling enhance reprogramming to pluripotency (Ichida et al., 2009; Onder et al., 2012; Soufi et al., 2012) (Table 1), and inhibition of SUV39H1/2 early in human cell reprogramming increases the access of OSKM to sites within H3K9me3 domains (Soufi et al., 2012). Thus, H3K9 methylation represents a barrier to the induction of pluripotency, at least in part, by blocking reprogramming factor access (Figure 2D). This conclusion is supported further by the finding that various other H3K9 methyltransferases and H3K9 demethylases control reprogramming efficiency (Chen et al., 2013; Onder et al., 2012; Soufi et al., 2012) (Table 1). In a fascinating twist, the same regions that display a shift from a broad H3K9me3 pattern to OSKM binding during reprogramming encompass nearly all of the 20 hot spots of aberrant epigenetic reprogramming, which exhibit aberrant DNA methylation patterns in human iPSCs compared to ESCs (Lister et al., 2011; Soufi et al., 2012). Thus, the loss of H3K9me3 from these regions may be a very inefficient process that could additionally be influenced by the exact culture conditions used for reprogramming (Zhu et al., 2013).

Transitioning between Reprogramming Steps

An important question is what exactly the rate-limiting transition steps at various reprogramming stages are. How do reprogramming cells transition from one step to the next? Though the field is defining molecules that positively and negatively influence the reprogramming process (Table 1), this question is still very difficult to address due to the inefficiency of the process. Rate-limiting transitions are likely linked to fluctuations or inherent noise of gene expression, chromatin state, and transcription factor binding and are further influenced by cell-cell contacts or extrinsic signals. Single-cell gene expression studies have shown that early reprogramming cultures and intermediate reprogramming populations both display heterogeneity, with considerable variation in gene expression between cells (Buganim et al., 2012; Polo et al., 2012), suggesting that stochastic gene activation events could be an important contributor to reprogramming transitions. Some of these expression differences are likely essential for progression toward pluripotency, whereas others may not have any impact on the reprogramming process or may even be inhibitory (Buganim et al., 2012; Polo et al., 2012).

Oct4 physically interacts with various active and repressive chromatin complexes (Pardo et al., 2010; van den Berg et al., 2010), raising the question of whether the activator or repressor function of Oct4 and the other reprogramming factors is more important for reprogramming. Recent reports in which reprogramming factors were fused to strong transcriptional activation domains (TADs) or repressor proteins indicate that activator, but not repressor, fusions promote reprogramming (Hammachi et al., 2012; Hirai et al., 2011; Wang et al., 2011c), suggesting that transcriptional activation is the main action of the reprogramming factors in reprogramming, and may be rate limiting. However, not all TADs can enhance the induction of pluripotency. TADs of MyoD and VP16, but not those of Mef2C and Gata4, increase iPSC formation when fused to Oct4 (Hammachi et al., 2012; Hirai et al., 2011, 2012; Wang et al., 2011c).
Because TADs serve as a scaffold to recruit other transcription factors, coactivators, and specific chromatin modifiers that are required for transcriptional activation, these findings suggest the need for specific coregulatory proteins in pluripotency induction. In addition, a strong transcriptional activator may bypass the requirement for extensive chromatin remodeling at the promoter for recruitment of the basic transcriptional machinery and preinitiation complex assembly (Koutroubas et al., 2008). Of note, the ectopic tethering of a strong transcriptional activator (the VP16 TAD) to the silent Oct4 gene in somatic cells is capable of activating this allele within 48 hr. However, this activation only happens in a small number of cells, highlighting the need for additional regulatory events (Hathaway et al., 2012). Given that the reprogramming factors may act predominantly as transcriptional activators, it may be surprising that the initial transcriptional response includes the silencing of the somatic expression program. However, transcriptional activators could amplify or induce the expression of other transcriptional activators as well as repressors, which in turn could secondarily affect gene expression patterns via emergent feedforward and feedback circuitries and could thereby contribute to the cell fate change of reprogramming. High levels of strong transcription factors may also contribute indirectly to the repression of other genes by competing for binding at common sites on the basic transcriptional machinery in a process referred to as squelching (Gill and Ptashne, 1988). Additionally, not only coding genes but also miRNAs are dynamically regulated during reprogramming and have been implicated in the control of the reprogramming

Table 1. List of Selected Chromatin Regulators and Their Role in Reprogramming

Chromatin Mark | Chromatin Regulator | Reprogramming Phenotype | References
H3K4me | Wdr5 (MLL-HMTase subunit, H3K4me-binding protein) | required during the initial phase of reprogramming; interacts with Oct4 | Ang et al., 2011
H3K9me | Suv39h1/2; Setdb1 (ESET); Ehmt2 (G9a) (HMTases) | depletion of Suv39h1, Suv39h2, Setdb1, or Ehmt2 results in efficient conversion of partially reprogrammed cells to iPSCs in the mouse system; depletion of Suv39h1/2 enhances human cell reprogramming and allows for more efficient binding of the reprogramming factors to domains with broad H3K9me3 in the starting cell | Chen et al., 2013; Onder et al., 2012; Soufi et al., 2012
H3K9me | Kdm3/4 (demethylases) | overexpression enhances reprogramming; knockdown reduces the conversion of partially reprogrammed cells to iPSCs | Chen et al., 2013
H3K27me | PRC2 (Ezh2, Eed) (HMTase) | required for reprogramming | Onder et al., 2012; Buganim et al., 2012
H3K27me | Utx (demethylase) | interacts with reprogramming factors; required for reprogramming; depletion results in aberrant and inefficient resetting of H3K27me and impairs reactivation of pluripotency genes; depletion of Eed rescues the reprogramming defect due to Utx loss of function | Mansour et al., 2012
H3K36me | Jhdm1a (Kdm2a); Jhdm1b (Kdm2b) (demethylases) | knockdown impairs reprogramming; overexpression enhances reprogramming (requiring the demethylase activity) by affecting the early reprogramming phase; enhances in part by promoting cell-cycle progression and overcoming senescence through repression of the Ink4/Arf locus and/or facilitating the early transcriptional response to the reprogramming factors | Wang et al., 2011a; Liang et al., 2012
H3K79me | Dot1 (HMTase) | depletion in the early phase enhances reprogramming; inhibition results in more efficient loss of H3K79me2 from somatic genes, thereby promoting their downregulation; depletion allows reprogramming without ectopic Klf4 | Onder et al., 2012
Histone acetylation | HDACs (histone deacetylases) | HDAC2 knockout allows reprogramming to be driven by the overexpression of only microRNAs; small-molecule inhibitors of HDACs (such as VPA, TSA, and butyrate) enhance reprogramming and replace ectopic cMyc or Klf4 | Anokye-Danso et al., 2011; Huangfu et al., 2008; Mali et al., 2010; Liang et al., 2010
Chromatin remodeling | Baf155/Brg1 (ATP-dependent chromatin-remodeling complex) | overexpression enhances reprogramming; overexpression appears to enhance binding of Oct4 to its pluripotency targets during reprogramming | Singhal et al., 2010
Chromatin remodeling | Chd1 | essential for reprogramming | Gaspar-Maia et al., 2009
Histone variants | macroH2A | deletion enhances reprogramming to pluripotency, overexpression prevents efficient reprogramming of EpiSCs to naive pluripotent cells; recruited to regulatory region of pluripotency genes in mouse embryonic fibroblasts, but not in ESCs | Pasque et al., 2012
DNA methylation | Dnmt1 (maintenance methyltransferase) | depletion enhances reprogramming of fibroblasts and partially reprogrammed cells, similar to 5-azacytidine (5-AZA) treatment | Mikkelsen et al., 2008
DNA methylation | Dnmt3a/b (de novo methyltransferases) | dispensable for reprogramming | Pawlak and Jaenisch, 2011
Others | OGT (O-GlcNAc glycosyltransferase) | blocking O-GlcNAcylation impairs reprogramming; O-GlcNAcylation regulates the transactivation activity of Oct4 and Sox2; O-GlcNAcylation-defective mutant of Oct4 fails to support reprogramming | Jang et al., 2012
Others | Parp1 (poly ADP-ribose polymerase) | enzymatic function and DNA-binding domain are required for reprogramming; recruited to pluripotency genes (e.g., Nanog promoter) in the early phase to control 5meC levels and control Oct4 recruitment | Doege et al., 2012
Others | Tet2 (FeII and 2-oxoglutarate-dependent enzyme) | required for efficient reprogramming; required for the global as well as gene-specific (e.g., at pluripotency gene promoters) increase in 5-hydroxymethylcytosine (5hmC) mark during reprogramming | Doege et al., 2012

process, even allowing for the induction of pluripotency without the ectopic expression of any transcription factor (Anokye-Danso et al., 2011; Judson et al., 2009). miRNA expression inversely correlates with target gene expression during reprogramming (Polo et al., 2012), suggesting that miRNAs may be critically contributing to the silencing of the somatic gene expression program and subsequent reprogramming steps. For example, an increase of miR-130 and miR-301 early in reprogramming enhances the process by repressing the developmental regulator Meox2 (Pfaff et al., 2011), and miRNAs of the miR-200 family are induced early and contribute to the repression of the fibroblast regulators Zeb1 and Zeb2 (Samavarchi-Tehrani et al., 2010). The experimental depletion of pre-existing lineage factors also promotes reprogramming (Hanna et al., 2008) likely by facilitating the decommissioning of somatic enhancers, thereby enabling the transition to the next reprogramming stage. What leads to the hierarchical pluripotency gene activation late in reprogramming? As discussed before, their efficient transcription requires the combinatorial and synergistic action of multiple activators bound to the enhancer and/or distal promoter. Enhancers can be modular, whereby each transcription factor contributes to the transcriptional output, or nonmodular, whereby each transcription factor is essential such that the target gene is turned on only when all transcription factors are present. Particularly considering that many ESC-specific enhancers are bound by a large number of pluripotency transcription factors in ESCs (Figure 2A), the presence of OSKM alone is likely not sufficient for efficient binding and/or transactivation. One of the factors that needs to act alongside OSK appears to be the pluripotency transcription factor Nanog.
Nanog co-occupies many pluripotency genes together with OSK in ESCs and targets promoter regions that fail to bind OSK until the end of the reprogramming process (Sridharan et al., 2009) (Figure 2A). Intriguingly, Nanog is essential for the establishment of iPSCs (Silva et al., 2009) and becomes expressed before many other pluripotency genes during the reprogramming process (Golipour et al., 2012), suggesting that it could be required for their activation. Overexpression of Esrrb,

another pluripotency factor, can rescue OSKM-induced reprogramming in the absence of endogenous Nanog (Festuccia et al., 2012). Fitting with the concept of hierarchical pluripotency activation, Esrrb is a direct target of Nanog in ESCs (Festuccia et al., 2012). Therefore, a critical function of Nanog in reprogramming may be to activate Esrrb, which in turn directly interacts with the general transcriptional machinery and also co-occupies many pluripotency loci with OSK and Nanog (Percharde et al., 2012; van den Berg et al., 2010). Interestingly, a recent RNAi screen identified various chromatin regulators, including Morc1, as regulators of the final reprogramming steps, which have not yet directly been implicated in the maintenance of pluripotency (Golipour et al., 2012), indicating that, in addition to transcriptional activation, extensive chromatin remodeling may be required at the late reprogramming stage. Today, we are just beginning to discover how chromatin limits but also guides reprogramming factors and how the factors overcome chromatin barriers. Direct interactions of the reprogramming factors with chromatin regulators may be important. For example, Oct4 can interact with subunits of the BAF chromatin-remodeling complex (Pardo et al., 2010; van den Berg et al., 2010), which enhances reprogramming and could stimulate the binding of transcription factors to nucleosomal sites (Singhal et al., 2010; Utley et al., 1997). Similarly, the activity of the reprogramming factors can be modulated by posttranslational modifications such as O-GlcNAc, which in the case of Oct4 is required for activation of target genes in ESCs and for Oct4's full functionality in reprogramming (Jang et al., 2012). Recent studies have identified additional chromatin regulators that are essential for the process (for a summary, see Table 1).
For example, the H3K27me demethylase Utx also interacts with OSK and is critical for the removal of the repressive H3K27me3 mark from pluripotency loci (Mansour et al., 2012). Similarly, decreasing the levels of histone marks associated with transcriptional elongation promotes the downregulation of the somatic gene expression program and suppression of senescence regulators (Liang et al., 2012; Onder et al., 2012; Wang et al., 2011a). While additional regulatory factors likely need to function alongside OSKM to allow for binding

Figure 4. X Chromosome States in Mouse and Human Pluripotent Cells


(A) X chromosome inactivation and reactivation cycles in the mouse system, highlighting the association of naive pluripotency with the XaXa state and of primed pluripotency with the XiXa state. Xa, active X chromosome; Xi, inactive X chromosome. (B) Drift and hierarchy of X chromosome states in female human ESCs during long-term culture. Xe, eroded Xi. The box marks the only X chromosome state that allows de novo X inactivation upon induction of differentiation. (C) Xi reactivation does not occur when female human somatic cells are reprogrammed to primed iPSCs (under bFGF reprogramming conditions). While fibroblasts are mosaic for which X is inactivated (Xp, paternal X; Xm, maternal X), each early passage iPSC line carries the X-inactivation state of the differentiated cell that initiated the reprogramming event. This state is subsequently maintained upon differentiation. (D) As in (B) but for the drift and hierarchy of X chromosome states in female human iPSCs during long-term culture.

to repressed pluripotency genes (Doege et al., 2012), such an opportunity may normally arise during every cell division, immediately following DNA replication before nucleosome assembly (Wolffe, 1991). It remains to be determined whether replication (i.e., cell proliferation) is required for changing gene expression patterns at every stage of the reprogramming process.

X Chromosome State in Differentiation and Reprogramming in the Mouse Model

In the remaining sections of this Review, we will focus on the characterization of the induced pluripotent state in both mouse and human iPSCs, highlighting differences and parallels between these two cell types particularly as they relate to the epigenetic state of the X chromosome. In mammals, X chromosome inactivation (XCI) leads to the transcriptional silencing of one X chromosome in female (XX) cells, equalizing gene dosage to XY males. This epigenetic process has been very powerful in revealing that the typical reprogramming experiment with human and mouse cells leads to different developmental states. XCI

involves several noncoding RNAs and a dramatic reorganization of chromatin with various epigenetic layers of regulation such as DNA methylation, histone modifications, and late replication in S phase (reviewed in Wutz [2011]). In the mouse, X chromosome silencing is established very early in embryonic development, in the epiblast cells of the implanting blastocyst, which will give rise to the embryo proper. XCI can therefore be recapitulated in vitro in differentiating mouse ESCs, the in vitro counterpart of the epiblast cells of the preimplantation blastocyst. Differentiation induces expression of the noncoding RNA Xist, which then quickly spreads to coat the chromosome in cis, mediating silencing of X-linked genes and inducing a repressive chromatin character along the entire chromosome (Wutz, 2011) (Figure 4A). This process is random such that the paternally and maternally inherited X chromosomes (Xp and Xm, respectively) become silenced with equal chance. However, in the mouse system, two states of pluripotency exist in vivo and in vitro. ESCs and the epiblast cells of the preimplantation blastocyst represent the naive pluripotent state. By contrast,

primed pluripotent cells are isolated from the epithelialized epiblast of the postimplantation embryo as mouse epiblast stem cells (EpiSC) and represent a developmentally advanced pluripotent state (Brons et al., 2007; Tesar et al., 2007). Consequently, EpiSCs are distinct from ESCs in gene expression, growth factor dependence, morphology, and the ability to contribute to blastocyst chimeras, although various core pluripotency regulators are present in both mouse ESCs and EpiSCs and both cell types are capable of multilineage differentiation in vitro (reviewed in Nichols and Smith [2009]). Importantly, EpiSCs are post X-inactivation, i.e., are XiXISTXa, mirroring the state of the epithelialized epiblast in vivo (Pasque et al., 2011) (Figure 4A). Therefore, in the mouse system, the XaXa state appears to be a hallmark specifically of naive pluripotency. Because XCI represents one of the most dramatic events of facultative heterochromatin formation in mammalian development, the question arises of how the somatically silent X chromosome is regulated during reprogramming. In the mouse system, the typical reprogramming experiment establishes naive pluripotency, i.e., iPSCs that are equivalent to leukemia inhibitory factor (LIF)-dependent, naive ESCs. Our lab demonstrated that female mouse iPSCs, like female mouse ESCs, carry two active X chromosomes (XaXa), indicating that the Xi is reactivated during reprogramming to naive pluripotency (Maherali et al., 2007) (Figure 4A). The activation of genes on the Xi is accompanied by the loss of all known heterochromatic chromatin marks and the silencing of Xist (Maherali et al., 2007). Together, these events enable random X-inactivation upon induction of differentiation, indicating that there is no epigenetic memory for the prior Xi left behind. Xi reactivation occurs very late in the reprogramming process at around the time of pluripotency gene expression (Stadtfeld et al., 2008).
In contrast to iPSCs, induced EpiSCs (iEpiSCs) generated by OSKM expression and culture conditions required for support of the primed pluripotent state (bFGF/activin) are XiXISTXa (Han et al., 2011) (Figure 4A). EpiSCs can be reprogrammed to the ESC-like state with various transcription factors and a switch in culture environment, establishing the XaXa state (Nichols and Smith, 2009) (Figure 4A). Together, these findings establish the X chromosome state as a sensitive indicator of the developmental state in the mouse system, both in differentiation and reprogramming processes, and demonstrate that the XaXa state is indisputably only associated with the naive state of pluripotency in this system.

X Chromosome Status in Human ESCs and iPSCs

The analysis of human ESCs led to the puzzling observation that various ESC lines differ in their X chromosome status (Hoffman et al., 2005; Shen et al., 2008; Silva et al., 2008) (Figure 4B). (1) They can be XaXa and undergo XCI upon differentiation, comparable to mouse ESCs. (2) Some human ESC lines have already undergone XCI and display a heterochromatic Xi with XIST RNA coating in the undifferentiated state (XiXISTXa). (3) Many human ESCs have a silent Xi that lacks XIST expression (Xiw/oXISTXa). Currently it is thought that newly derived human ESCs start in the XaXa state and subsequently drift toward XCI and later loss of XIST RNA with additional time in culture (Figure 4B). The strongest support for this model comes from the fact that the XaXa state can be stabilized in newly derived

ESCs under physiological oxygen conditions, whereas chronic exposure to atmospheric oxygen concentrations irreversibly induces XCI (Lengner et al., 2010). Regardless of the X chromosome state, human ESCs generally share more features with the primed pluripotent state of the mouse than with mouse ESCs (Nichols and Smith, 2009). Therefore, the XaXa state is not restricted to naive pluripotency in the human system and can also mark the primed pluripotent state. To date, the occurrence of the XaXa state and the instability of the X have not been described for mouse EpiSCs and, in fact, for any other cell type. Given the different states of the X in human ESCs, an interesting question was whether reprogramming of female human cells to iPSCs, which recapitulate the primed pluripotent state of human ESCs, would result in Xi reactivation. Originally, our group demonstrated that female human iPSC lines carry an XIST RNA-coated Xi (XiXISTXa) when they are first derived (Tchieu et al., 2010) (Figure 4C). In contrast to somatic cell populations, which are mosaic with respect to which X chromosome is inactivated, iPSC lines display a nonrandom pattern of XCI that is maintained upon induction of differentiation (Tchieu et al., 2010). As a result, two types of iPSC lines can be derived: those expressing only the Xp (XmiXpa) and those expressing only the Xm (XmaXpi) (Tchieu et al., 2010) (Figure 4C). Therefore, reprogramming to human iPSCs does not elicit Xi reactivation, and iPSCs inherit the Xi of the particular somatic cell in the culture dish that underwent a successful reprogramming event (Pomp et al., 2011; Tchieu et al., 2010). Although subsequent reports confirmed this conclusion (Cheung et al., 2011; Pomp et al., 2011), other groups obtained conflicting results and argued that Xi reactivation is prevalent in iPSCs (Kim et al., 2011; Marchetto et al., 2010).
Recent reports help to reconcile these apparently contradictory conclusions and confirm that the silent state of the X is faithfully maintained through the reprogramming process but unravels with the time that iPSCs spend in culture (Anguera et al., 2012; Mekhoubad et al., 2012; Nazor et al., 2012; Tchieu et al., 2010; Tomoda et al., 2012). Similar to human ESCs, human iPSCs are prone to undergo XIST silencing upon prolonged passaging, yielding Xiw/oXISTXa lines and accordingly losing all XIST RNA-dependent repressive chromatin marks such as H3K27me3 (Pomp et al., 2011; Tchieu et al., 2010) (Figure 4D). Reprogramming experiments with female fibroblasts heterozygous for a mutation of the X-linked gene HPRT combined with an elegant drug selection system that can distinguish between the expression of wild-type or mutant HPRT revealed that spontaneous loss of XIST RNA coating coincides with re-expression of the HPRT allele from the Xi (Mekhoubad et al., 2012). Thus, XiHPRTwtXaHPRTmut iPSCs express only the mutant HPRT allele at early passage but activate the wild-type HPRT allele upon XIST RNA loss. Importantly, the activation of Xi-linked genes is not limited to this one gene but appears to affect the Xi more broadly, as demonstrated by global expression and DNA methylation profiles of female iPSC lines (Anguera et al., 2012; Mekhoubad et al., 2012; Nazor et al., 2012). Specifically, in early passage XiXISTXa iPSCs, X-linked genes are expressed at the level of male (XaY) iPSCs and display DNA methylation in promoters of Xi-linked genes (Mekhoubad et al., 2012; Nazor et al., 2012). By contrast, higher-passage female iPSCs with no

XIST RNA (Xiw/oXISTXa) are often characterized by higher expression of various X-linked genes and hypomethylation of a subset of Xi-linked promoters, suggesting that the loss of DNA methylation contributes to the activation of Xi-linked genes. Importantly, the activation of X-linked genes does not appear to affect the entire X chromosome. Eggan and colleagues termed the partial reactivation of the Xi "erosion of dosage compensation," yielding an eroded Xi, the Xe (Mekhoubad et al., 2012) (Figure 4D). Even with long-term culturing, none of the female human iPSC lines reach the low DNA methylation level along the entire X that is typical for male iPSCs (with their single Xa), indicating that even in the worst case the activation of genes on the Xi is limited in range (Nazor et al., 2012). Across many female human iPSC lines, the X chromosome is affected to varying degrees, but the loss of DNA methylation appears to target similar large, noncontiguous regions of the X chromosome, indicating that certain parts of the X can effectively maintain proper silencing while others are more prone to reactivation (Nazor et al., 2012). The patchy erasure of DNA methylation along the X, along with loss of gene silencing and XIST RNA coating, cannot be corrected upon differentiation, nor upon a repeated round of reprogramming (Mekhoubad et al., 2012; Nazor et al., 2012). Together, these findings are most consistent with a model in which reprogramming sustains the XiXISTXa state, but continued passaging of iPSCs results in XIST silencing (Xiw/oXISTXa), which then triggers partial reactivation of the Xi (Xew/oXISTXa) (Figure 4D). Notably, one could argue that these X-related events are a consequence of continued reprogramming processes, particularly given that continuous passaging of iPSCs reduces gene expression differences compared to ESCs (Chin et al., 2010; Polo et al., 2010).
However, the erosion of the X has also recently been observed in many human ESC lines upon XIST RNA loss, very similar in extent to iPSCs (Nazor et al., 2012) (Figure 4C). Importantly, iPSCs with an eroded Xi still depend on FGF/Activin signaling to maintain pluripotency (Mekhoubad et al., 2012), confirming that the erosion of the X chromosome occurs in the context of primed pluripotency and is likely not associated with a change in cell identity to naive pluripotency. Thus, for human pluripotent cells (iPSCs and ESCs), dosage compensation erosion appears to be a problem of cell culture, particularly given that it remains a feature of the differentiated progeny, necessitating the development of improved culturing methods for these cell types (see below). Why are XIST expression and the silent state of the X unstable upon long-term culturing? A few relevant observations have been made. iPSC lines obtained from the same reprogramming experiment (i.e., the same fibroblast population) typically display widely different X states at the same passage, with some lines being able to maintain the XiXISTXa state and others being on the path of erosion (Anguera et al., 2012; Kim et al., 2011; Mekhoubad et al., 2012; Nazor et al., 2012; Tchieu et al., 2010). Similarly, any given iPSC and ESC line can be heterogeneous regarding its X chromosome state (Anguera et al., 2012; Mekhoubad et al., 2012; Silva et al., 2008; Tchieu et al., 2010; Tomoda et al., 2012). These findings, combined with the fact that no genomic abnormalities were found in iPSC lines with an eroded Xi, suggest that epigenetic, but not genetic, changes are responsible for the instability of the X chromosome (Anguera

et al., 2012; Mekhoubad et al., 2012). Consistently, complete methylation of the XIST promoter correlates with the loss of the RNA in iPSCs (Tchieu et al., 2010), implying that de novo methylation contributes to its silencing. Interestingly, in mouse fibroblasts, experimentally induced loss of Xist by itself does not induce the reactivation of candidate X-linked genes (Csankovszki et al., 2001). However, when Xist loss is combined with the deletion of Dnmt1 and loss of DNA methylation, a dramatic reactivation of the Xi occurs in mouse somatic cells (Csankovszki et al., 2001). This parallels what happens when the Xi erodes in human iPSCs, suggesting that deregulation of the DNA methylation machinery may directly contribute to this process. An interesting observation is that the propagation of XiXISTXa iPSCs in media containing bFGF and IGF2 and on feeder cells expressing LIF predictably induces XIST RNA loss and activates genes of the Xi after only a few passages. In this case, silencing is reinitiated upon differentiation, suggesting that complete Xi reactivation occurred, establishing an XaXa state in human iPSCs, rather than an Xe (Tomoda et al., 2012) (Figure 4D). Based on cell morphology, it appears that these XaXa cells still maintain the primed pluripotent state (Tomoda et al., 2012). A somewhat surprising observation is that XIST RNA was not detected at the endpoint of differentiation (Tomoda et al., 2012). More work will be needed to test whether XIST is upregulated earlier in the differentiation process, as X inactivation without XIST expression would be a highly unexpected possibility (Figure 4D). In any case, this study re-emphasizes that culture conditions can have a dramatic impact on the epigenetic state of the X in human iPSCs and enhance transition between X chromosome states. A comparison of the X states in female human ESCs and iPSCs highlights two key differences.
The XaXa state appears to be the most immature state for primed human ESCs (Lengner et al., 2010) (Figure 4B, boxed), but it is a downstream state in the hierarchy of X states in primed human iPSCs (Tomoda et al., 2012) (Figure 4D, boxed). Hypoxic conditions or the addition of HDAC inhibitors, which appear to promote the generation and maintenance of XaXa hESCs (Lengner et al., 2010; Ware et al., 2009), do not enhance the establishment of XeXa or XaXa iPSCs (Anguera et al., 2012; Kim et al., 2011; Mekhoubad et al., 2012; Pomp et al., 2011; Tchieu et al., 2010). One reason for the difference in X-state hierarchies between human iPSCs and ESCs may be that the cells are of very different origin: iPSCs are derived from somatic XiXa cells and ESCs from XaXa cells of the female human blastocyst (Okamoto et al., 2011). Understanding the behavior of the human X in ESCs and iPSCs will be an important contribution to the ongoing debate about potential transcriptional, epigenetic, and genetic differences between various iPSC and ESC lines and their relevance (Lowry, 2012). It is important to realize that human pluripotent cells that resemble the naive, mouse ESC state can be established in vitro via transcription-factor-induced reprogramming methods. For example, the overexpression of OCT4 and KLF4 or KLF4 and KLF2 in primed human ESCs/iPSCs or OSKM in fibroblasts, combined with specific culture conditions that support the naive state, allows the establishment of human naive iPSCs (Hanna et al., 2010). However, the naive state is still relatively difficult to establish and maintain (Hanna et al., 2010; Pomp et al.,

2011; Wang et al., 2011b). When derived from XiXISTXa iPSCs, naive human pluripotent cells become XIST negative but display XIST RNA coating in virtually all cells upon differentiation (Hanna et al., 2010). Despite the fact that the analysis of the X chromosome state in naive human cells is still in its infancy, these data argue strongly that the mouse ESC-like XaXa state, which allows XIST-dependent induction of X inactivation during differentiation, can be established in human cells upon reprogramming to the naive state. Naive human pluripotent cells may therefore represent an excellent model to study the regulation of human XCI and may get around problems associated with the instability of the X in primed pluripotent cells. However, the existence of human naive (mouse ESC-like) pluripotent cells in vivo remains unclear, and their derivation from preimplantation embryos has not yet been accomplished (Kuijk et al., 2012; Roode et al., 2012).

Instability of the Human X, Differentiation, and Disease Modeling

iPSCs can be derived for specific diseases and can differentiate into any cell type of the human body. Therefore, they offer an unprecedented opportunity to examine disease states and develop novel drugs (Onder and Daley, 2012; Trounson et al., 2012). The nonrandom X inactivation in early passage XiXISTXa iPSCs has an interesting consequence for the modeling of X-linked diseases. Considering females heterozygous for a mutation in an X-linked gene, iPSCs can be derived that express either the wild-type or the mutant form of the protein, which represent an interesting experimental system for the investigation of disease phenotypes, as both wild-type and mutant cell lines are on the same genetic background (Tchieu et al., 2010) (Figure 5). To date, X-linked diseases such as Rett syndrome and Lesch-Nyhan syndrome (LNS) have been modeled by such matched iPSCs (Cheung et al., 2011; Kim et al., 2011; Mekhoubad et al., 2012).
For example, mutations in the X-linked gene HPRT cause LNS, which leads to behavioral and neurological symptoms in males but is typically asymptomatic in heterozygous females because of random X inactivation (Figure 5). From these heterozygous females, XiHPRTwtXaHPRTmut iPSCs can be obtained that, at early passage, exhibit the LNS phenotype upon differentiation into neurons in vitro, whereas iPSCs with the opposite X-inactivation pattern (XiHPRTmutXaHPRTwt) behave normally (Mekhoubad et al., 2012). However, at higher passage, erosion of the Xi in XiHPRTwtXaHPRTmut iPSCs leads to the expression of the wild-type HPRT allele and loss of the disease phenotype (Mekhoubad et al., 2012) (Figure 5). The interpretation of X-linked disease studies therefore requires caution and a careful assessment of the X chromosome state.

Problems caused by the erosion of the Xi in human iPSCs and ESCs are not limited to studies of X-linked diseases but should also be taken seriously for the modeling of autosomal diseases or, in fact, any differentiation process, as the erosion of the Xi in long-term culture can also alter the expression of some autosomal genes in addition to increasing X-linked gene expression (Anguera et al., 2012). Furthermore, female iPSC lines without XIST expression grow faster in culture, survive better in routine culturing, and appear to form only poorly differentiating teratomas, which may be associated with the upregulation of several X-linked oncogenes (Anguera et al., 2012), indicating that the erosion of the X affects the behavior of female iPSCs and ESCs more broadly. Importantly, all recent studies agree that loss of XIST RNA coating is closely associated with the erosion of the Xi under conventional culture conditions (Anguera et al., 2012; Mekhoubad et al., 2012; Nazor et al., 2012; Tchieu et al., 2010; Tomoda et al., 2012). Thus, female human iPSCs with XIST RNA coating should currently be preferred for any downstream application, as these cells are in the well-defined XiXa state. Accordingly, Lee and colleagues proposed that XIST RNA coating of the Xi and the accumulation of XIST-dependent chromatin marks such as H3K27me3 can be considered biomarkers, as they appear to directly identify the stable XiXa state (Anguera et al., 2012).

Figure 5. Effects of X Chromosome Instability on Disease Modeling
Reprogramming of differentiated cells from females heterozygous for an X-linked mutation results in iPSC lines that express either the mutant or the wild-type allele from the Xa at early passage due to nonrandom X inactivation. These cell lines represent pairs of experimental and control cells ideal for modeling X-linked diseases on an isogenic background. However, upon XIST loss and Xi erosion, the allele from the Xi can become re-expressed, resulting in the loss or modulation of the disease phenotype.

Outlook

The improved mechanistic understanding of the path to pluripotency has already enabled the establishment of non-OSK-containing reprogramming cocktails (Buganim et al., 2012; Mansour et al., 2012) and allowed for the replacement of essential endogenous proteins by downstream targets (Festuccia et al., 2012). Currently, we are learning only by analyzing a few snapshots of the reprogramming process. However, more and more snapshots will eventually become a continuous epigenetic movie of the cell fate change that underlies reprogramming to pluripotency, through which we can virtually watch how the epigenetic landscape is reset. The 2006 era showcased the potency of diverse transcription factors in converting cell fates. It now seems likely that any cell type may eventually be generated by forced expression of the appropriate transcription factor(s). Continued dissection of the reprogramming process holds the promise that, at some point in the future, we will be able to predict exactly which transcription factors are most potent as reprogramming factors. Finally, other fields such as tumor biology will benefit from the insight gained through reprogramming studies given that, for example, mutations that prevent senescence have been shown to increase both reprogramming efficiency and tumor development.

Cell 152, March 14, 2013 ©2013 Elsevier Inc. 1339
ACKNOWLEDGMENTS

Our work is supported by the NIH (DP2OD001686 and P01 GM099134), by the CIRM (RN1-00564 and RB3-05080), and by the Eli and Edythe Broad Center of Regenerative Medicine and Stem Cell Research at UCLA. We apologize to all authors whose work could not be cited due to space limitations. We thank Dr. Sanjeet Patel for critical reading of the manuscript.

REFERENCES

Adams, C.C., and Workman, J.L. (1995). Binding of disparate transcriptional activators to nucleosomal DNA is inherently cooperative. Mol. Cell. Biol. 15, 1405–1421.
Ang, Y.S., Tsai, S.Y., Lee, D.F., Monk, J., Su, J., Ratnakumar, K., Ding, J., Ge, Y., Darr, H., Chang, B., et al. (2011). Wdr5 mediates self-renewal and reprogramming via the embryonic stem cell core transcriptional network. Cell 145, 183–197.
Anguera, M.C., Sadreyev, R., Zhang, Z., Szanto, A., Payer, B., Sheridan, S.D., Kwok, S., Haggarty, S.J., Sur, M., Alvarez, J., et al. (2012). Molecular signatures of human induced pluripotent stem cells highlight sex differences and cancer genes. Cell Stem Cell 11, 75–90.
Anokye-Danso, F., Trivedi, C.M., Juhr, D., Gupta, M., Cui, Z., Tian, Y., Zhang, Y., Yang, W., Gruber, P.J., Epstein, J.A., and Morrisey, E.E. (2011). Highly efficient miRNA-mediated reprogramming of mouse and human somatic cells to pluripotency. Cell Stem Cell 8, 376–388.
Beato, M., and Eisfeld, K. (1997). Transcription factor access to chromatin. Nucleic Acids Res. 25, 3559–3563.
Bergsland, M., Ramsköld, D., Zaouter, C., Klum, S., Sandberg, R., and Muhr, J. (2011). Sequentially acting Sox transcription factors in neural lineage development. Genes Dev. 25, 2453–2464.
Blau, H.M., Chiu, C.P., and Webster, C. (1983). Cytoplasmic activation of human nuclear genes in stable heterocaryons. Cell 32, 1171–1180.
Bock, C., Kiskinis, E., Verstappen, G., Gu, H., Boulting, G., Smith, Z.D., Ziller, M., Croft, G.F., Amoroso, M.W., Oakley, D.H., et al. (2011). Reference Maps of human ES and iPS cell variation enable high-throughput characterization of pluripotent cell lines. Cell 144, 439–452.
Boyer, L.A., Lee, T.I., Cole, M.F., Johnstone, S.E., Levine, S.S., Zucker, J.P., Guenther, M.G., Kumar, R.M., Murray, H.L., Jenner, R.G., et al. (2005). Core transcriptional regulatory circuitry in human embryonic stem cells. Cell 122, 947–956.
Brambrink, T., Foreman, R., Welstead, G.G., Lengner, C.J., Wernig, M., Suh, H., and Jaenisch, R. (2008). Sequential expression of pluripotency markers during direct reprogramming of mouse somatic cells. Cell Stem Cell 2, 151–159.
Brons, I.G., Smithers, L.E., Trotter, M.W., Rugg-Gunn, P., Sun, B., Chuva de Sousa Lopes, S.M., Howlett, S.K., Clarkson, A., Ahrlund-Richter, L., Pedersen, R.A., and Vallier, L. (2007). Derivation of pluripotent epiblast stem cells from mammalian embryos. Nature 448, 191–195.
Buckley, S.M., Aranda-Orgilles, B., Strikoudis, A., Apostolou, E., Loizou, E., Moran-Crusio, K., Farnsworth, C.L., Koller, A.A., Dasgupta, R., Silva, J.C., et al. (2012). Regulation of pluripotency and cellular reprogramming by the ubiquitin-proteasome system. Cell Stem Cell 11, 783–798.
Buganim, Y., Faddah, D.A., Cheng, A.W., Itskovich, E., Markoulaki, S., Ganz, K., Klemm, S.L., van Oudenaarden, A., and Jaenisch, R. (2012). Single-cell expression analyses during cellular reprogramming reveal an early stochastic and a late hierarchic phase. Cell 150, 1209–1222.
Carey, B.W., Markoulaki, S., Hanna, J.H., Faddah, D.A., Buganim, Y., Kim, J., Ganz, K., Steine, E.J., Cassady, J.P., Creyghton, M.P., et al. (2011). Reprogramming factor stoichiometry influences the epigenetic state and biological properties of induced pluripotent stem cells. Cell Stem Cell 9, 588–598.
Chen, X., Xu, H., Yuan, P., Fang, F., Huss, M., Vega, V.B., Wong, E., Orlov, Y.L., Zhang, W., Jiang, J., et al. (2008). Integration of external signaling pathways with the core transcriptional network in embryonic stem cells. Cell 133, 1106–1117.
Chen, J., Liu, J., Chen, Y., Yang, J., Chen, J., Liu, H., Zhao, X., Mo, K., Song, H., Guo, L., et al. (2011). Rational optimization of reprogramming culture conditions for the generation of induced pluripotent stem cells with ultra-high efficiency and fast kinetics. Cell Res. 21, 884–894.
Chen, J., Liu, H., Liu, J., Qi, J., Wei, B., Yang, J., Liang, H., Chen, Y., Chen, J., Wu, Y., et al. (2013). H3K9 methylation is a barrier during somatic cell reprogramming into iPSCs. Nat. Genet. 45, 34–42. Published online December 2, 2012. http://dx.doi.org/10.1038/ng.2491.
Cheung, A.Y., Horvath, L.M., Grafodatskaya, D., Pasceri, P., Weksberg, R., Hotta, A., Carrel, L., and Ellis, J. (2011). Isolation of MECP2-null Rett Syndrome patient hiPS cells and isogenic controls through X-chromosome inactivation. Hum. Mol. Genet. 20, 2103–2115.
Chin, M.H., Mason, M.J., Xie, W., Volinia, S., Singer, M., Peterson, C., Ambartsumyan, G., Aimiuwu, O., Richter, L., Zhang, J., et al. (2009). Induced pluripotent stem cells and embryonic stem cells are distinguished by gene expression signatures. Cell Stem Cell 5, 111–123.
Chin, M.H., Pellegrini, M., Plath, K., and Lowry, W.E. (2010). Molecular analyses of human induced pluripotent stem cells and embryonic stem cells. Cell Stem Cell 7, 263–269.
Creyghton, M.P., Cheng, A.W., Welstead, G.G., Kooistra, T., Carey, B.W., Steine, E.J., Hanna, J., Lodato, M.A., Frampton, G.M., Sharp, P.A., et al. (2010). Histone H3K27ac separates active from poised enhancers and predicts developmental state. Proc. Natl. Acad. Sci. USA 107, 21931–21936.
Csankovszki, G., Nagy, A., and Jaenisch, R. (2001). Synergism of Xist RNA, DNA methylation, and histone hypoacetylation in maintaining X chromosome inactivation. J. Cell Biol. 153, 773–784.
Davis, R.L., Weintraub, H., and Lassar, A.B. (1987). Expression of a single transfected cDNA converts fibroblasts to myoblasts. Cell 51, 987–1000.
Doege, C.A., Inoue, K., Yamashita, T., Rhee, D.B., Travis, S., Fujita, R., Guarnieri, P., Bhagat, G., Vanti, W.B., Shih, A., et al. (2012). Early-stage epigenetic modification during somatic cell reprogramming by Parp1 and Tet2. Nature 488, 652–655.
Esteban, M.A., Wang, T., Qin, B., Yang, J., Qin, D., Cai, J., Li, W., Weng, Z., Chen, J., Ni, S., et al. (2010). Vitamin C enhances the generation of mouse and human induced pluripotent stem cells. Cell Stem Cell 6, 71–79.
Feldman, N., Gerson, A., Fang, J., Li, E., Zhang, Y., Shinkai, Y., Cedar, H., and Bergman, Y. (2006). G9a-mediated irreversible epigenetic inactivation of Oct-3/4 during early embryogenesis. Nat. Cell Biol. 8, 188–194.
Festuccia, N., Osorno, R., Halbritter, F., Karwacki-Neisius, V., Navarro, P., Colby, D., Wong, F., Yates, A., Tomlinson, S.R., and Chambers, I. (2012). Esrrb is a direct Nanog target gene that can substitute for Nanog function in pluripotent cells. Cell Stem Cell 11, 477–490.


Filion, G.J., van Bemmel, J.G., Braunschweig, U., Talhout, W., Kind, J., Ward, L.D., Brugman, W., de Castro, I.J., Kerkhoven, R.M., Bussemaker, H.J., and van Steensel, B. (2010). Systematic protein location mapping reveals five principal chromatin types in Drosophila cells. Cell 143, 212–224.
Gaetz, J., Clift, K.L., Fernandes, C.J., Mao, F.F., Lee, J.H., Zhang, L., Baker, S.W., Looney, T.J., Foshay, K.M., Yu, W.H., et al. (2012). Evidence for a critical role of gene occlusion in cell fate restriction. Cell Res. 22, 848–858.
Gaspar-Maia, A., Alajem, A., Polesso, F., Sridharan, R., Mason, M.J., Heidersbach, A., Ramalho-Santos, J., McManus, M.T., Plath, K., Meshorer, E., and Ramalho-Santos, M. (2009). Chd1 regulates open chromatin and pluripotency of embryonic stem cells. Nature 460, 863–868.
Ghisletti, S., Barozzi, I., Mietton, F., Polletti, S., De Santa, F., Venturini, E., Gregory, L., Lonie, L., Chew, A., Wei, C.L., et al. (2010). Identification and characterization of enhancers controlling the inflammatory gene expression program in macrophages. Immunity 32, 317–328.
Gill, G., and Ptashne, M. (1988). Negative effect of the transcriptional activator GAL4. Nature 334, 721–724.
Golipour, A., David, L., Liu, Y., Jayakumaran, G., Hirsch, C.L., Trcka, D., and Wrana, J.L. (2012). A late transition in somatic cell reprogramming requires regulators distinct from the pluripotency network. Cell Stem Cell 11, 769–782.
Green, M.R. (2005). Eukaryotic transcription activation: right on target. Mol. Cell 18, 399–402.
Gualdi, R., Bossard, P., Zheng, M., Hamada, Y., Coleman, J.R., and Zaret, K.S. (1996). Hepatic specification of the gut endoderm in vitro: cell signaling and transcriptional control. Genes Dev. 10, 1670–1682.
Guccione, E., Martinato, F., Finocchiaro, G., Luzi, L., Tizzoni, L., Dall'Olio, V., Zardo, G., Nervi, C., Bernard, L., and Amati, B. (2006). Myc-binding-site recognition in the human genome is determined by chromatin context. Nat. Cell Biol. 8, 764–770.
Gurdon, J.B., Elsdale, T.R., and Fischberg, M. (1958). Sexually mature individuals of Xenopus laevis from the transplantation of single somatic nuclei. Nature 182, 64–65.
Hammachi, F., Morrison, G.M., Sharov, A.A., Livigni, A., Narayan, S., Papapetrou, E.P., O'Malley, J., Kaji, K., Ko, M.S., Ptashne, M., and Brickman, J.M. (2012). Transcriptional activation by Oct4 is sufficient for the maintenance and induction of pluripotency. Cell Rep. 1, 99–109.
Han, D.W., Greber, B., Wu, G., Tapia, N., Araúzo-Bravo, M.J., Ko, K., Bernemann, C., Stehling, M., and Schöler, H.R. (2011). Direct reprogramming of fibroblasts into epiblast stem cells. Nat. Cell Biol. 13, 66–71.
Hanna, J., Markoulaki, S., Schorderet, P., Carey, B.W., Beard, C., Wernig, M., Creyghton, M.P., Steine, E.J., Cassady, J.P., Foreman, R., et al. (2008). Direct reprogramming of terminally differentiated mature B lymphocytes to pluripotency. Cell 133, 250–264.
Hanna, J., Saha, K., Pando, B., van Zon, J., Lengner, C.J., Creyghton, M.P., van Oudenaarden, A., and Jaenisch, R. (2009). Direct cell reprogramming is a stochastic process amenable to acceleration. Nature 462, 595–601.
Hanna, J., Cheng, A.W., Saha, K., Kim, J., Lengner, C.J., Soldner, F., Cassady, J.P., Muffat, J., Carey, B.W., and Jaenisch, R. (2010). Human embryonic stem cells with biological and epigenetic characteristics similar to those of mouse ESCs. Proc. Natl. Acad. Sci. USA 107, 9222–9227.
Harrington, M.A., Jones, P.A., Imagawa, M., and Karin, M. (1988). Cytosine methylation does not affect binding of transcription factor Sp1. Proc. Natl. Acad. Sci. USA 85, 2066–2070.
Hathaway, N.A., Bell, O., Hodges, C., Miller, E.L., Neel, D.S., and Crabtree, G.R. (2012). Dynamics and memory of heterochromatin in living cells. Cell 149, 1447–1460.
Hawkins, R.D., Hon, G.C., Lee, L.K., Ngo, Q., Lister, R., Pelizzola, M., Edsall, L.E., Kuan, S., Luu, Y., Klugman, S., et al. (2010). Distinct epigenomic landscapes of pluripotent and lineage-committed human cells. Cell Stem Cell 6, 479–491.
Heintzman, N.D., Hon, G.C., Hawkins, R.D., Kheradpour, P., Stark, A., Harp, L.F., Ye, Z., Lee, L.K., Stuart, R.K., Ching, C.W., et al. (2009). Histone modifications at human enhancers reflect global cell-type-specific gene expression. Nature 459, 108–112.
Hirai, H., Tani, T., Katoku-Kikyo, N., Kellner, S., Karian, P., Firpo, M., and Kikyo, N. (2011). Radical acceleration of nuclear reprogramming by chromatin remodeling with the transactivation domain of MyoD. Stem Cells 29, 1349–1361.
Hirai, H., Katoku-Kikyo, N., Karian, P., Firpo, M., and Kikyo, N. (2012). Efficient iPS cell production with the MyoD transactivation domain in serum-free culture. PLoS ONE 7, e34149.
Hoffman, L.M., Hall, L., Batten, J.L., Young, H., Pardasani, D., Baetge, E.E., Lawrence, J., and Carpenter, M.K. (2005). X-inactivation status varies in human embryonic stem cell lines. Stem Cells 23, 1468–1478.
Huangfu, D., Maehr, R., Guo, W., Eijkelenboom, A., Snitow, M., Chen, A.E., and Melton, D.A. (2008). Induction of pluripotent stem cells by defined factors is greatly improved by small-molecule compounds. Nat. Biotechnol. 26, 795–797.
Ichida, J.K., Blanchard, J., Lam, K., Son, E.Y., Chung, J.E., Egli, D., Loh, K.M., Carter, A.C., Di Giorgio, F.P., Koszka, K., et al. (2009). A small-molecule inhibitor of tgf-Beta signaling replaces sox2 in reprogramming by inducing nanog. Cell Stem Cell 5, 491–503.
Jang, H., Kim, T.W., Yoon, S., Choi, S.Y., Kang, T.W., Kim, S.Y., Kwon, Y.W., Cho, E.J., and Youn, H.D. (2012). O-GlcNAc regulates pluripotency and reprogramming by directly acting on core components of the pluripotency network. Cell Stem Cell 11, 62–74.
Jiang, J., Chan, Y.S., Loh, Y.H., Cai, J., Tong, G.Q., Lim, C.A., Robson, P., Zhong, S., and Ng, H.H. (2008). A core Klf circuitry regulates self-renewal of embryonic stem cells. Nat. Cell Biol. 10, 353–360.
Judson, R.L., Babiarz, J.E., Venere, M., and Blelloch, R. (2009). Embryonic stem cell-specific microRNAs promote induced pluripotency. Nat. Biotechnol. 27, 459–461.
Kim, J., Chu, J., Shen, X., Wang, J., and Orkin, S.H. (2008). An extended transcriptional network for pluripotency of embryonic stem cells. Cell 132, 1049–1061.
Kim, J., Woo, A.J., Chu, J., Snow, J.W., Fujiwara, Y., Kim, C.G., Cantor, A.B., and Orkin, S.H. (2010). A Myc network accounts for similarities between embryonic stem and cancer cell transcription programs. Cell 143, 313–324.
Kim, K.Y., Hysolli, E., and Park, I.H. (2011). Neuronal maturation defect in induced pluripotent stem cells from patients with Rett syndrome. Proc. Natl. Acad. Sci. USA 108, 14169–14174.
Koch, P., Breuer, P., Peitz, M., Jungverdorben, J., Kesavan, J., Poppe, D., Doerr, J., Ladewig, J., Mertens, J., Tüting, T., et al. (2011). Excitation-induced ataxin-3 aggregation in neurons from patients with Machado-Joseph disease. Nature 480, 543–546.
Koche, R.P., Smith, Z.D., Adli, M., Gu, H., Ku, M., Gnirke, A., Bernstein, B.E., and Meissner, A. (2011). Reprogramming factor expression initiates widespread targeted chromatin remodeling. Cell Stem Cell 8, 96–105.
Koutroubas, G., Merika, M., and Thanos, D. (2008). Bypassing the requirements for epigenetic modifications in gene transcription by increasing enhancer strength. Mol. Cell. Biol. 28, 926–938.
Kuijk, E.W., van Tol, L.T., Van de Velde, H., Wubbolts, R., Welling, M., Geijsen, N., and Roelen, B.A. (2012). The roles of FGF and MAP kinase signaling in the segregation of the epiblast and hypoblast cell lineages in bovine and human embryos. Development 139, 871–882.
Lengner, C.J., Gimelbrant, A.A., Erwin, J.A., Cheng, A.W., Guenther, M.G., Welstead, G.G., Alagappan, R., Frampton, G.M., Xu, P., Muffat, J., et al. (2010). Derivation of pre-X inactivation human embryonic stem cells under physiological oxygen concentrations. Cell 141, 872–883.
Li, B., Adams, C.C., and Workman, J.L. (1994). Nucleosome binding by the constitutive transcription factor Sp1. J. Biol. Chem. 269, 7756–7763.
Li, B., Carey, M., and Workman, J.L. (2007). The role of chromatin during transcription. Cell 128, 707–719.
Li, R., Liang, J., Ni, S., Zhou, T., Qing, X., Li, H., He, W., Chen, J., Li, F., Zhuang, Q., et al. (2010). A mesenchymal-to-epithelial transition initiates and is required for the nuclear reprogramming of mouse fibroblasts. Cell Stem Cell 7, 51–63.


Liang, G., Taranova, O., Xia, K., and Zhang, Y. (2010). Butyrate promotes induced pluripotent stem cell generation. J. Biol. Chem. 285, 25516–25521.
Liang, G., He, J., and Zhang, Y. (2012). Kdm2b promotes induced pluripotent stem cell generation by facilitating gene activation early in reprogramming. Nat. Cell Biol. 14, 457–466.
Lin, C.Y., Lovén, J., Rahl, P.B., Paranal, R.M., Burge, C.B., Bradner, J.E., Lee, T.I., and Young, R.A. (2012). Transcriptional amplification in tumor cells with elevated c-Myc. Cell 151, 56–67.
Lister, R., Pelizzola, M., Kida, Y.S., Hawkins, R.D., Nery, J.R., Hon, G., Antosiewicz-Bourget, J., O'Malley, R., Castanon, R., Klugman, S., et al. (2011). Hotspots of aberrant epigenomic reprogramming in human induced pluripotent stem cells. Nature 471, 68–73.
Loh, Y.H., Wu, Q., Chew, J.L., Vega, V.B., Zhang, W., Chen, X., Bourque, G., George, J., Leong, B., Liu, J., et al. (2006). The Oct4 and Nanog transcription network regulates pluripotency in mouse embryonic stem cells. Nat. Genet. 38, 431–440.
Lorch, Y., LaPointe, J.W., and Kornberg, R.D. (1987). Nucleosomes inhibit the initiation of transcription but allow chain elongation with the displacement of histones. Cell 49, 203–210.
Lowry, W.E. (2012). Does transcription factor induced pluripotency accurately mimic embryo derived pluripotency? Curr. Opin. Genet. Dev. 22, 429–434.
Maherali, N., Sridharan, R., Xie, W., Utikal, J., Eminli, S., Arnold, K., Stadtfeld, M., Yachechko, R., Tchieu, J., Jaenisch, R., et al. (2007). Directly reprogrammed fibroblasts show global epigenetic remodeling and widespread tissue contribution. Cell Stem Cell 1, 55–70.
Mali, P., Chou, B.K., Yen, J., Ye, Z., Zou, J., Dowey, S., Brodsky, R.A., Ohm, J.E., Yu, W., Baylin, S.B., et al. (2010). Butyrate greatly enhances derivation of human induced pluripotent stem cells by promoting epigenetic remodeling and the expression of pluripotency-associated genes. Stem Cells 28, 713–720.
Mansour, A.A., Gafni, O., Weinberger, L., Zviran, A., Ayyash, M., Rais, Y., Krupalnik, V., Zerbib, M., Amann-Zalcenstein, D., Maza, I., et al. (2012). The H3K27 demethylase Utx regulates somatic and germ cell epigenetic reprogramming. Nature 488, 409–413.
Marchetto, M.C., Carromeu, C., Acab, A., Yu, D., Yeo, G.W., Mu, Y., Chen, G., Gage, F.H., and Muotri, A.R. (2010). A model for neural development and treatment of Rett syndrome using human induced pluripotent stem cells. Cell 143, 527–539.
Marks, H., Kalkan, T., Menafra, R., Denissov, S., Jones, K., Hofemeister, H., Nichols, J., Kranz, A., Stewart, A.F., Smith, A., and Stunnenberg, H.G. (2012). The transcriptional and epigenomic foundations of ground state pluripotency. Cell 149, 590–604.
Mekhoubad, S., Bock, C., de Boer, A.S., Kiskinis, E., Meissner, A., and Eggan, K. (2012). Erosion of dosage compensation impacts human iPSC disease modeling. Cell Stem Cell 10, 595–609.
Mikkelsen, T.S., Hanna, J., Zhang, X., Ku, M., Wernig, M., Schorderet, P., Bernstein, B.E., Jaenisch, R., Lander, E.S., and Meissner, A. (2008). Dissecting direct reprogramming through integrative genomic analysis. Nature 454, 49–55.
Nagamatsu, G., Saito, S., Kosaka, T., Takubo, K., Kinoshita, T., Oya, M., Horimoto, K., and Suda, T. (2012). Optimal ratio of transcription factors for somatic cell reprogramming. J. Biol. Chem. 287, 36273–36282.
Nakagawa, M., Koyanagi, M., Tanabe, K., Takahashi, K., Ichisaka, T., Aoi, T., Okita, K., Mochiduki, Y., Takizawa, N., and Yamanaka, S. (2008). Generation of induced pluripotent stem cells without Myc from mouse and human fibroblasts. Nat. Biotechnol. 26, 101–106.
Nazor, K.L., Altun, G., Lynch, C., Tran, H., Harness, J.V., Slavin, I., Garitaonandia, I., Müller, F.J., Wang, Y.C., Boscolo, F.S., et al. (2012). Recurrent variations in DNA methylation in human pluripotent stem cells and their differentiated derivatives. Cell Stem Cell 10, 620–634.
Nichols, J., and Smith, A. (2009). Naive and primed pluripotent states. Cell Stem Cell 4, 487–492.

Nie, Z., Hu, G., Wei, G., Cui, K., Yamane, A., Resch, W., Wang, R., Green, D.R., Tessarollo, L., Casellas, R., et al. (2012). c-Myc is a universal amplifier of expressed genes in lymphocytes and embryonic stem cells. Cell 151, 68–79.
Niwa, H., Miyazaki, J., and Smith, A.G. (2000). Quantitative expression of Oct3/4 defines differentiation, dedifferentiation or self-renewal of ES cells. Nat. Genet. 24, 372–376.
Okamoto, I., Patrat, C., Thépot, D., Peynot, N., Fauque, P., Daniel, N., Diabangouaya, P., Wolf, J.P., Renard, J.P., Duranthon, V., and Heard, E. (2011). Eutherian mammals use diverse strategies to initiate X-chromosome inactivation during development. Nature 472, 370–374.
Okita, K., Ichisaka, T., and Yamanaka, S. (2007). Generation of germline-competent induced pluripotent stem cells. Nature 448, 313–317.
Onder, T.T., and Daley, G.Q. (2012). New lessons learned from disease modeling with induced pluripotent stem cells. Curr. Opin. Genet. Dev. 22, 500–508.
Onder, T.T., Kara, N., Cherry, A., Sinha, A.U., Zhu, N., Bernt, K.M., Cahan, P., Marcarci, B.O., Unternaehrer, J., Gupta, P.B., et al. (2012). Chromatin-modifying enzymes as modulators of reprogramming. Nature 483, 598–602.
Pardo, M., Lang, B., Yu, L., Prosser, H., Bradley, A., Babu, M.M., and Choudhary, J. (2010). An expanded Oct4 interaction network: implications for stem cell biology, development, and disease. Cell Stem Cell 6, 382–395.
Pasque, V., Gillich, A., Garrett, N., and Gurdon, J.B. (2011). Histone variant macroH2A confers resistance to nuclear reprogramming. EMBO J. 30, 2373–2387.
Pasque, V., Radzisheuskaya, A., Gillich, A., Halley-Stott, R.P., Panamarova, M., Zernicka-Goetz, M., Surani, M.A., and Silva, J.C. (2012). Histone variant macroH2A marks embryonic differentiation in vivo and acts as an epigenetic barrier to induced pluripotency. J. Cell Sci. Published October 17, 2012. http://dx.doi.org/10.1242/jcs.113019.
Pawlak, M., and Jaenisch, R. (2011). De novo DNA methylation by Dnmt3a and Dnmt3b is dispensable for nuclear reprogramming of somatic cells to a pluripotent state. Genes Dev. 25, 1035–1040.
Percharde, M., Lavial, F., Ng, J.H., Kumar, V., Tomaz, R.A., Martin, N., Yeo, J.C., Gil, J., Prabhakar, S., Ng, H.H., et al. (2012). Ncoa3 functions as an essential Esrrb coactivator to sustain embryonic stem cell self-renewal and reprogramming. Genes Dev. 26, 2286–2298.
Pfaff, N., Fiedler, J., Holzmann, A., Schambach, A., Moritz, T., Cantz, T., and Thum, T. (2011). miRNA screening reveals a new miRNA family stimulating iPS cell generation via regulation of Meox2. EMBO Rep. 12, 1153–1159.
Plath, K., and Lowry, W.E. (2011). Progress in understanding reprogramming to the induced pluripotent state. Nat. Rev. Genet. 12, 253–265.
Polo, J.M., Liu, S., Figueroa, M.E., Kulalert, W., Eminli, S., Tan, K.Y., Apostolou, E., Stadtfeld, M., Li, Y., Shioda, T., et al. (2010). Cell type of origin influences the molecular and functional properties of mouse induced pluripotent stem cells. Nat. Biotechnol. 28, 848–855.
Polo, J.M., Anderssen, E., Walsh, R.M., Schwarz, B.A., Nefzger, C.M., Lim, S.M., Borkent, M., Apostolou, E., Alaei, S., Cloutier, J., et al. (2012). A molecular roadmap of reprogramming somatic cells into iPS cells. Cell 151, 1617–1632.
Pomp, O., Dreesen, O., Leong, D.F., Meller-Pomp, O., Tan, T.T., Zhou, F., and Colman, A. (2011). Unexpected X chromosome skewing during culture and reprogramming of human somatic cells can be alleviated by exogenous telomerase. Cell Stem Cell 9, 156–165.
Prendergast, G.C., and Ziff, E.B. (1991). Methylation-sensitive sequence-specific DNA binding by the c-Myc basic region. Science 251, 186–189.
Rada-Iglesias, A., Bajpai, R., Swigut, T., Brugmann, S.A., Flynn, R.A., and Wysocka, J. (2011). A unique chromatin signature uncovers early developmental enhancers in humans. Nature 470, 279–283.
Rahl, P.B., Lin, C.Y., Seila, A.C., Flynn, R.A., McCuine, S., Burge, C.B., Sharp, P.A., and Young, R.A. (2010). c-Myc regulates transcriptional pause release. Cell 141, 432–445.
Ramirez-Carrozzi, V.R., Braas, D., Bhatt, D.M., Cheng, C.S., Hong, C., Doty, K.R., Black, J.C., Hoffmann, A., Carey, M., and Smale, S.T. (2009). A unifying model for the selective regulation of inducible transcription by CpG islands and nucleosome remodeling. Cell 138, 114–128.
Roode, M., Blair, K., Snell, P., Elder, K., Marchant, S., Smith, A., and Nichols, J. (2012). Human hypoblast formation is not dependent on FGF signalling. Dev. Biol. 361, 358–363.
Samavarchi-Tehrani, P., Golipour, A., David, L., Sung, H.K., Beyer, T.A., Datti, A., Woltjen, K., Nagy, A., and Wrana, J.L. (2010). Functional genomics reveals a BMP-driven mesenchymal-to-epithelial transition in the initiation of somatic cell reprogramming. Cell Stem Cell 7, 64–77.
Sancho-Martinez, I., Baek, S.H., and Izpisua Belmonte, J.C. (2012). Lineage conversion methodologies meet the reprogramming toolbox. Nat. Cell Biol. 14, 892–899.
Shen, Y., Matsuno, Y., Fouse, S.D., Rao, N., Root, S., Xu, R., Pellegrini, M., Riggs, A.D., and Fan, G. (2008). X-inactivation in female human embryonic stem cells is in a nonrandom pattern and prone to epigenetic alterations. Proc. Natl. Acad. Sci. USA 105, 4709–4714.
Silva, S.S., Rowntree, R.K., Mekhoubad, S., and Lee, J.T. (2008). X-chromosome inactivation and epigenetic fluidity in human embryonic stem cells. Proc. Natl. Acad. Sci. USA 105, 4820–4825.
Silva, J., Nichols, J., Theunissen, T.W., Guo, G., van Oosten, A.L., Barrandon, O., Wray, J., Yamanaka, S., Chambers, I., and Smith, A. (2009). Nanog is the gateway to the pluripotent ground state. Cell 138, 722–737.
Singhal, N., Graumann, J., Wu, G., Araúzo-Bravo, M.J., Han, D.W., Greber, B., Gentile, L., Mann, M., and Schöler, H.R. (2010). Chromatin-Remodeling Components of the BAF Complex Facilitate Reprogramming. Cell 141, 943–955.
Smith, Z.D., Nachman, I., Regev, A., and Meissner, A. (2010). Dynamic single-cell imaging of direct reprogramming reveals an early specifying event. Nat. Biotechnol. 28, 521–526.
Soufi, A., Donahue, G., and Zaret, K.S. (2012). Facilitators and impediments of the pluripotency reprogramming factors' initial engagement with the genome. Cell 151, 994–1004.
Sridharan, R., Tchieu, J., Mason, M.J., Yachechko, R., Kuoy, E., Horvath, S., Zhou, Q., and Plath, K. (2009). Role of the murine reprogramming factors in the induction of pluripotency. Cell 136, 364–377.
Stadtfeld, M., and Hochedlinger, K. (2010). Induced pluripotency: history, mechanisms, and applications. Genes Dev. 24, 2239–2263.
Stadtfeld, M., Maherali, N., Breault, D.T., and Hochedlinger, K. (2008). Defining molecular cornerstones during fibroblast to iPS cell reprogramming in mouse. Cell Stem Cell 2, 230–240.
Stadtfeld, M., Apostolou, E., Ferrari, F., Choi, J., Walsh, R.M., Chen, T., Ooi, S.S., Kim, S.Y., Bestor, T.H., Shioda, T., et al. (2012). Ascorbic acid prevents loss of Dlk1-Dio3 imprinting and facilitates generation of all-iPS cell mice from terminally differentiated B cells. Nat. Genet. 44, 398–405.
Taberlay, P.C., Kelly, T.K., Liu, C.C., You, J.S., De Carvalho, D.D., Miranda, T.B., Zhou, X.J., Liang, G., and Jones, P.A. (2011). Polycomb-repressed genes have permissive enhancers that initiate reprogramming. Cell 147, 1283–1294.
Takahashi, K., and Yamanaka, S. (2006). Induction of pluripotent stem cells from mouse embryonic and adult fibroblast cultures by defined factors. Cell 126, 663–676.
Tchieu, J., Kuoy, E., Chin, M.H., Trinh, H., Patterson, M., Sherman, S.P., Aimiuwu, O., Lindgren, A., Hakimian, S., Zack, J.A., et al. (2010). Female human iPSCs retain an inactive X chromosome. Cell Stem Cell 7, 329–342.
Teif, V.B., Vainshtein, Y., Caudron-Herger, M., Mallm, J.P., Marth, C., Höfer, T., and Rippe, K. (2012). Genome-wide nucleosome positioning during embryonic stem cell development. Nat. Struct. Mol. Biol. 19, 1185–1192.
Tesar, P.J., Chenoweth, J.G., Brook, F.A., Davies, T.J., Evans, E.P., Mack, D.L., Gardner, R.L., and McKay, R.D. (2007). New cell lines from mouse epiblast share defining features with human embryonic stem cells. Nature 448, 196–199.
Tiemann, U., Sgodda, M., Warlich, E., Ballmaier, M., Schöler, H.R., Schambach, A., and Cantz, T. (2011). Optimal reprogramming factor stoichiometry increases colony numbers and affects molecular characteristics of murine induced pluripotent stem cells. Cytometry A 79, 426–435.
Tomoda, K., Takahashi, K., Leung, K., Okada, A., Narita, M., Yamada, N.A., Eilertson, K.E., Tsang, P., Baba, S., White, M.P., et al. (2012). Derivation conditions impact X-inactivation status in female human induced pluripotent stem cells. Cell Stem Cell 11, 91–99.
Trounson, A., Shepard, K.A., and DeWitt, N.D. (2012). Human disease modeling with induced pluripotent stem cells. Curr. Opin. Genet. Dev. 22, 509–516.
Utley, R.T., Côté, J., Owen-Hughes, T., and Workman, J.L. (1997). SWI/SNF stimulates the formation of disparate activator-nucleosome complexes but is partially redundant with cooperative binding. J. Biol. Chem. 272, 12642–12649.
van den Berg, D.L., Snoek, T., Mullin, N.P., Yates, A., Bezstarosti, K., Demmers, J., Chambers, I., and Poot, R.A. (2010). An Oct4-centered protein interaction network in embryonic stem cells. Cell Stem Cell 6, 369–381.
Visel, A., Blow, M.J., Li, Z., Zhang, T., Akiyama, J.A., Holt, A., Plajzer-Frick, I., Shoukry, M., Wright, C., Chen, F., et al. (2009). ChIP-seq accurately predicts tissue-specific activity of enhancers. Nature 457, 854–858.
Wallrath, L.L., Lu, Q., Granok, H., and Elgin, S.C. (1994). Architectural variations of inducible eukaryotic promoters: preset and remodeling chromatin structures. Bioessays 16, 165–170.
Wang, T., Chen, K., Zeng, X., Yang, J., Wu, Y., Shi, X., Qin, B., Zeng, L., Esteban, M.A., Pan, G., and Pei, D. (2011a). The histone demethylases Jhdm1a/1b enhance somatic cell reprogramming in a vitamin-C-dependent manner. Cell Stem Cell 9, 575–587.
Wang, W., Yang, J., Liu, H., Lu, D., Chen, X., Zenonos, Z., Campos, L.S., Rad, R., Guo, G., Zhang, S., et al. (2011b). Rapid and efficient reprogramming of somatic cells to induced pluripotent stem cells by retinoic acid receptor gamma and liver receptor homolog 1. Proc. Natl. Acad. Sci. USA 108, 18283–18288.
Wang, Y., Chen, J., Hu, J.L., Wei, X.X., Qin, D., Gao, J., Zhang, L., Jiang, J., Li, J.S., Liu, J., et al. (2011c). Reprogramming of mouse and human somatic cells by high-performance engineered factors. EMBO Rep. 12, 373–378.
Ware, C.B., Wang, L., Mecham, B.H., Shen, L., Nelson, A.M., Bar, M., Lamba, D.A., Dauphin, D.S., Buckingham, B., Askari, B., et al. (2009). Histone deacetylase inhibition elicits an evolutionarily conserved self-renewal program in embryonic stem cells. Cell Stem Cell 4, 359–369.
Wernig, M., Meissner, A., Foreman, R., Brambrink, T., Ku, M., Hochedlinger, K., Bernstein, B.E., and Jaenisch, R. (2007). In vitro reprogramming of fibroblasts into a pluripotent ES-cell-like state. Nature 448, 318–324.
Wernig, M., Meissner, A., Cassady, J.P., and Jaenisch, R. (2008). c-Myc is dispensable for direct reprogramming of mouse fibroblasts. Cell Stem Cell 2, 10–12.
Wolffe, A.P. (1991). Implications of DNA replication for eukaryotic gene expression. J. Cell Sci. 99, 201–206.
Wutz, A. (2011). Gene silencing in X-chromosome inactivation: advances in understanding facultative heterochromatin formation. Nat. Rev. Genet. 12, 542–553.
Yamaguchi, S., Hirano, K., Nagata, S., and Tada, T. (2011). Sox2 expression effects on direct reprogramming efficiency as determined by alternative somatic cell fate. Stem Cell Res. (Amst.) 6, 177–186.
You, J.S., Kelly, T.K., De Carvalho, D.D., Taberlay, P.C., Liang, G., and Jones, P.A. (2011). OCT4 establishes and maintains nucleosome-depleted regions that provide additional layers of epigenetic regulation of its target genes. Proc. Natl. Acad. Sci. USA 108, 14497–14502.
Zaret, K.S., and Carroll, J.S. (2011). Pioneer transcription factors: establishing competence for gene expression. Genes Dev. 25, 2227–2241.
Zhu, J., Adli, M., Zou, J.Y., Verstappen, G., Coyne, M., Zhang, X., Durham, T., Miri, M., Deshpande, V., De Jager, P.L., et al. (2013). Genome-wide Chromatin State Transitions Associated with Developmental and Environmental Cues. Cell 152, 642–654.

Cell 152, March 14, 2013 2013 Elsevier Inc. 1343

Review
Chromatin Remodeling at DNA Double-Strand Breaks
Brendan D. Price1 and Alan D. D'Andrea1,*


1Division of Genomic Stability and DNA Repair, Department of Radiation Oncology, Dana-Farber Cancer Institute, Harvard Medical School, 450 Brookline Avenue, Boston, MA 02215, USA
*Correspondence: alan_dandrea@dfci.harvard.edu
http://dx.doi.org/10.1016/j.cell.2013.02.011

DNA double-strand breaks (DSBs) can arise from multiple sources, including exposure to ionizing radiation. The repair of DSBs involves both posttranslational modification of nucleosomes and concentration of DNA-repair proteins at the site of damage. Consequently, nucleosome packing and chromatin architecture surrounding the DSB may limit the ability of the DNA-damage response to access and repair the break. Here, we review early chromatin-based events that promote the formation of open, relaxed chromatin structures at DSBs and that allow the DNA-repair machinery to access the spatially confined region surrounding the DSB, thereby facilitating mammalian DSB repair.
DNA Double-Strand Breaks and Cancer
Maintaining the integrity of genetic information is critical both for normal cellular functions and for suppressing mutagenic events that can lead to cancer. Damage to DNA can arise from external sources, such as exposure to ionizing radiation (IR), ultraviolet radiation (UV), or environmental toxins, or from endogenous sources, such as reactive oxygen species or errors during DNA replication. These events can generate a wide range of DNA lesions, including modified bases or sugar residues, the formation of DNA adducts, crosslinking of the DNA strands, and production of single- and double-strand breaks (DSBs). Consequently, cells have evolved at least six different DNA-repair pathways to deal with these distinct types of DNA damage (Kennedy and D'Andrea, 2006). Among these lesions, DNA DSBs are particularly lethal because they result in physical cleavage of the DNA backbone. DSBs can occur through replication-fork collapse, during the processing of interstrand crosslinks, or following exposure to IR (Ciccia and Elledge, 2010; Jackson and Bartek, 2009; Kennedy and D'Andrea, 2006). Because IR (radiation therapy) is widely used to treat cancer, understanding how cells repair DSBs created by IR, and how this process is altered in tumors, is of high significance.

Chromatin Structure and DSB Repair
DSB repair takes place within the complex organization of the chromatin, and it is clear from work in many model systems that chromatin structure and nucleosome organization represent a significant barrier to the efficient detection and repair of DSBs. Mammalian cells contain a diverse array of specialized chromatin structures, such as active genes, telomeres, replication forks, intergenic regions, and compact heterochromatin. These structures are distinguished by specific patterns of histone modifications, unique histone variants, arrays of chromatin-binding proteins, and the density of nucleosome packing (de Wit and van Steensel, 2009; Grewal and Jia, 2007; Peng and Karpen, 2008). This complexity and diversity in chromatin organization present a series of challenges to the DSB-repair machinery.

The impact of chromatin on DNA repair was first described in the access-repair-restore model (Smerdon, 1991; reviewed in Soria et al., 2012). This model proposed the minimal steps needed to reorganize the chromatin and repair DNA damage. Broadly, the DSB-repair machinery must be able to (1) detect DNA damage in different chromatin structures; (2) remodel the local chromatin architecture to provide access to the site of damage; (3) reorganize the nucleosome-DNA template for processing and repair of the damage; and, importantly, (4) restore the local chromatin organization after repair has been completed. Since this model was first put forward in 1991, many of the remodeling factors and histone-modifying enzymes that act to create open chromatin structures and promote DNA repair have been identified, as have factors such as histone chaperones, deacetylases, and phosphatases that reassemble the chromatin after repair is complete. Here, we will focus on the access component of the access-repair-restore model, reviewing some of the early (seconds to minutes) remodeling events that occur after DNA damage and that are required to create open chromatin structures. Although the access-repair-restore model is likely applicable to the repair of all types of DNA damage, we will focus our discussion specifically on the repair of DNA DSBs.
In particular, we will examine three broad chromatin-based events that occur during the first seconds-to-minutes after production of DSBs: (1) the formation of open chromatin structures at DSBs through acetylation of histone H4; (2) the importance of kap-1 in promoting chromatin relaxation in heterochromatin; and (3) the rapid polyADP-ribosylation (PARylation) of the chromatin by the polyADP-ribose polymerase (Parp) family, which promotes the transient recruitment of chromatin-remodeling enzymes and heterochromatin factors to the DSB.

Figure 1. The Mechanism of DSB Repair
Top: ATM phosphorylates H2AX at DSBs, creating a binding site for the mdc1 protein. ATM-MRN complexes then associate with mdc1, promoting the spreading of γH2AX along the chromatin for hundreds of kilobases. Bottom: mdc1 recruits multiple DSB-repair proteins, including the RNF8/RNF168 ubiquitin ligases, to sites of damage. Chromatin ubiquitination then facilitates loading of the brca1 complex and 53BP1 DSB-repair proteins. P = phosphorylation, Ub = ubiquitination, MRN = mre11-rad50-nbs1 complex.

DSB Repair in Mammalian Cells
The mammalian DSB-repair pathway is a complex signaling mechanism that regulates the two key responses to DSBs: the rapid activation of cell-cycle checkpoints and the recruitment of DNA-repair proteins onto the chromatin at the DSB (Figure 1). The MRN complex, consisting of the mre11, rad50, and nbs1 proteins, is first recruited to DSBs, where it functions to recruit and activate the ATM protein kinase (Lavin, 2008; Sun et al., 2010). Activated ATM has been shown to phosphorylate hundreds of proteins (Matsuoka et al., 2007), including proteins involved in checkpoint activation (e.g., p53 and chk2) and DNA-repair proteins such as brca1 and 53BP1 (Ciccia and Elledge, 2010; Jackson and Bartek, 2009; Kennedy and D'Andrea, 2006). A critical target for ATM is the C terminus of the histone variant H2AX. Phosphorylated H2AX (referred to as γH2AX) creates a binding site for the BRCT domains of the mdc1 protein (Lou et al., 2006; Stucki et al., 2005) (Figure 1). Positioning of mdc1 at the DSB creates a docking site for additional DSB-repair proteins, including the MRN-ATM complex (Chapman and Jackson, 2008; Melander et al., 2008). Consequently, phosphorylation of H2AX by ATM spreads away from the DSB, creating γH2AX domains that extend for hundreds of kilobases along the chromatin from the DSB (Bonner et al., 2008; Rogakou et al., 1999). The mdc1 protein also recruits late-acting effector proteins, including the RNF8 and RNF168 ubiquitin ligases, which ubiquitinate the chromatin and promote loading of the brca1 and 53BP1 proteins (Doil et al., 2009; Kolas et al., 2007). Similar to γH2AX spreading, chromatin ubiquitination can also spread for tens of kilobases from the DSB (Xu et al., 2010). This extension of chromatin ubiquitination is opposed by the activity of the two E3 ligases, TRIP12 and UBR5, which promote the ubiquitin-dependent degradation of RNF168 (Gudjonsson et al., 2012). DSB repair therefore involves the sequential recruitment and concentration of thousands of copies of individual DSB-repair proteins onto the chromatin, as well as extensive posttranslational modification of the nucleosomes.

DSB Repair by HR and NHEJ
The actual repair of DSBs can proceed through two distinct mechanisms: the error-prone nonhomologous end-joining (NHEJ) pathway and the error-free homologous recombination (HR) pathway (Huertas, 2010; Jackson and Bartek, 2009). NHEJ involves minimal processing of the damaged DNA by nucleases, followed by direct re-ligation of the DNA ends. NHEJ requires the Ku70/80 DNA-binding complex and the DNA-PKcs kinase. In contrast, HR requires the generation of single-stranded DNA (ssDNA) intermediates, which are used for homology searching within adjacent sister chromatids. The production of ssDNA requires the initial nuclease activity of the CtIP-MRN complex (Sartori et al., 2007), followed by further end processing by additional nucleases to produce ssDNA intermediates (Symington and Gautier, 2011). This ssDNA is then used for homology searching in sister chromatids, which then provide the template for accurate repair of DSBs by HR. Importantly, because sister chromatids are only present during the S and G2 phases of the cell cycle, HR repair is restricted to this part of the cell cycle. Consequently, NHEJ predominates in G1 and HR in S and G2 phases. However, how cells regulate the choice between HR and NHEJ repair pathways is not well understood, although both the 53BP1 and brca1 proteins can play a key role in this choice (Bothmer et al., 2010; Bunting et al., 2010).

Influence of Chromatin Organization on Genomic Stability
The nucleosome is the basic functional unit of chromatin and consists of 147 bp of DNA wrapped around a histone octamer (Campos and Reinberg, 2009). Nucleosomes form linear 10 nm beads-on-a-string structures that pack together to form 30 nm arrays and other higher-order structures. The core of each nucleosome contains two H3-H4 dimers and two H2A-H2B dimers. The N-terminal tails of histones extend out from the nucleosome and contain conserved lysine residues that can be modified by acetylation, methylation, or ubiquitination. These modifications can function to attract specific chromatin complexes that can then alter nucleosome function. In addition to histone posttranslational modifications, chromatin organization is also regulated by multisubunit remodeling complexes built around a large motor ATPase. Four major ATPase families, the SWI/SNF, CHD, INO80, and ISWI families, have been identified in eukaryotes (Clapier and Cairns, 2009). These remodeling complexes utilize the energy from ATP hydrolysis to (1) remove nucleosomes from the chromatin and create open DNA sequences; (2) shift the position of the nucleosome relative to the DNA by exposing (or burying) a DNA sequence (nucleosome sliding); or (3) exchange pre-existing histones for specialized histone variants. Chromatin-remodeling complexes and histone modifications can alter the interaction within or between adjacent nucleosomes and recruit chromatin-binding proteins to specific regions (Cairns, 2005; Campos and Reinberg, 2009). Nucleosomes can therefore be envisaged as dynamic hubs to which chromatin-modifying proteins and specific modifications attach and that regulate the function and packing of the DNA in the chromatin.

The importance of chromatin organization in maintaining genomic stability is underscored by studies demonstrating that mutation rates are not even across the human genome. Sequencing of multiple cancer genomes has revealed that mutations accumulate at much higher levels in compact, H3K9me3-rich heterochromatin domains (Schuster-Böckler and Lehner, 2012), consistent with the slower rates of DNA repair reported in heterochromatin (Goodarzi et al., 2008; Noon et al., 2010). Further, insertions and deletions are depleted around nucleosomes, whereas mutations tend to cluster on the nucleosomal DNA (Chen et al., 2012; Sasaki et al., 2009; Tolstorukov et al., 2011), and both can be influenced by the presence of specific epigenetic modifications on the nucleosome (Schuster-Böckler and Lehner, 2012; Tolstorukov et al., 2011). Some of these differences in mutation rates may accrue by negative selection (for example, selection against mutations in coding regions) or through protection of the DNA from mutagens by association with nucleosomes. However, the elevated mutation rates in compact, transcriptionally silent heterochromatin domains (Schuster-Böckler and Lehner, 2012) imply that chromatin packing may impact the detection or repair of damage by the DNA-repair machinery. That is, the ability of the DNA-repair machinery to access the DNA can have a significant impact on genomic stability within specific regions.

DSBs Promote Rapid Histone H4 Acetylation
One of the best-characterized changes in chromatin organization is the rapid formation of open chromatin structures at DSBs. Several groups have demonstrated that this process is associated with increased acetylation of histones H2A and H4 on nucleosomes at DSBs (Downs et al., 2004; Jha et al., 2008; Kusch et al., 2004; Murr et al., 2006).
This acetylation extends for hundreds of kilobases away from the break (Downs et al., 2004; Murr et al., 2006; Xu et al., 2010), similar to the spreading of γH2AX (Figure 1). The acetylation of histone H4 at DSBs is dependent on the Tip60 acetyltransferase, a haploinsufficient tumor-suppressor protein that is required for the repair of DSBs (Doyon and Côté, 2004; Gorrini et al., 2007; Sun et al., 2010). Tip60 is rapidly recruited to DSBs, where it can acetylate multiple DNA-damage response (DDR) proteins, including histones H2A and H4, the ATM kinase, p53, and other repair proteins (Bird et al., 2002; Ikura et al., 2007; Jha et al., 2008; Sun et al., 2005, 2010; Sykes et al., 2006). Tip60 functions in DSB repair as a subunit of the human NuA4 (hNuA4) remodeling complex. hNuA4 contains at least 16 subunits (Doyon and Côté, 2004), of which 4 possess catalytic activity: the Tip60 acetyltransferase, the p400 motor ATPase, and the Ruvbl1 and Ruvbl2 helicase-like proteins. Multiple subunits of hNuA4, including Tip60 (Sun et al., 2009), Trrap (Downs et al., 2004; Kusch et al., 2004; Murr et al., 2006), p400 (Xu et al., 2010), and Ruvbl1 and Ruvbl2 (Jha et al., 2008), are corecruited to DSBs, suggesting that these proteins are recruited together as components of hNuA4.

Interestingly, hNuA4 is a fusion of two separate yeast complexes: the smaller yeast NuA4 (yNuA4) complex, which contains the Tip60 homolog esa1, and the ySWR1 complex, which contains the Swr1 ATPase and the yeast Ruvbl1 and Ruvbl2 homologs (Clapier and Cairns, 2009; Doyon and Côté, 2004). Both yNuA4 (Downs et al., 2004) and ySWR1 complexes (Papamichos-Chronakis et al., 2006; van Attikum et al., 2007) are recruited to enzymatically generated DSBs in yeast. However, whereas yNuA4 and ySWR1 are recruited to DSBs through direct interaction with γH2AX (Downs et al., 2004; van Attikum et al., 2007), hNuA4 is loaded onto chromatin through interaction with the mdc1 protein (Xu and Price, 2011; Xu et al., 2010). Nevertheless, in both yeast and mammalian cells, loading of either yNuA4 or hNuA4 at DSBs leads to the rapid acetylation of the N-terminal tail of histone H4 by Tip60 (Downs et al., 2004; Ikura et al., 2007; Murr et al., 2006; Sun et al., 2009; Xu et al., 2010). Inactivation of Tip60 (Bird et al., 2002; Downs et al., 2004; Ikura et al., 2000; Murr et al., 2006) blocks H4 acetylation and increases sensitivity to DNA damage. Finally, mutation of the Tip60 acetylation sites on H4 in yeast increases sensitivity to DNA damage similar to that seen following Tip60 inactivation (Bird et al., 2002; Downs et al., 2004). Although mutation of the N-terminal tail of H4 is not possible in mammalian cells, the results from both yeast and mammalian systems indicate that the rapid recruitment of NuA4 complexes containing Tip60 to DSBs leads to the increased acetylation of histone H4 and H2A adjacent to the DSB.

Histone Acetylation Creates Open Chromatin Structures
It is well established that open chromatin conformations at actively transcribed genes are associated with acetylation of histone H4 (Campos and Reinberg, 2009; de Wit and van Steensel, 2009). The N-terminal tail of histone H4 can interact with the acidic patch on the surface of H2A-H2B dimers of adjacent nucleosomes (Luger et al., 2012). Disruption of this interaction by acetylation of H4 on lysine 16 (Robinson et al., 2008; Shogren-Knaak et al., 2006) inhibits packing of 30 nm fibers and leads to chromatin decompaction.
The increase in acetylation of histones H2A and H4 at DSBs may therefore promote chromatin unpacking and direct the formation of open, relaxed chromatin structures detected at DSBs (Kruhlak et al., 2006). In fact, several studies have demonstrated that chromatin at DSBs undergoes a transition to a more open, less compact conformation. For example, the sensitivity of DNA to nuclease digestion increases after DNA damage (Smerdon et al., 1978; Ziv et al., 2006), indicating that linker DNA between nucleosomes is more accessible. Depletion of histone H1, which binds to linker DNA and promotes nucleosome packing, promotes chromatin relaxation and facilitates DSB repair (Murga et al., 2007). Histones at DSBs are susceptible to extraction in low salt (Xu et al., 2010), implying a weaker interaction between DNA and histones at DSBs. Further, biophysical studies demonstrate that DSBs lead to a localized chromatin expansion (Kruhlak et al., 2006). Finally, inactivation of Tip60 (Murr et al., 2006; Xu et al., 2010, 2012) blocked the formation of open chromatin structures at DSBs, consistent with acetylation of histone H4 by Tip60 playing a central role in creating open, flexible chromatin structures at DSBs.

Figure 2. H2A.Z Exchange Drives H4 Acetylation
Exchange of H2A for H2A.Z alters interaction between the N-terminal tail of H4 and adjacent nucleosomes, exposing the tail to acetylation by Tip60. The combination of H2A.Z exchange and H4 acetylation functions to shift chromatin into the open, relaxed conformation required for DSB repair. H4 = histone H4 tail, Ac = acetylation.

The p400 ATPase of hNuA4 Catalyzes H2A.Z Exchange at DSBs
In addition to Tip60, the hNuA4 complex also contains the p400 motor ATPase. p400 is a member of the Ino80 family of chromatin-remodeling ATPases, which includes two yeast proteins, yIno80 and ySwr1. yIno80 and ySwr1 are both recruited to DSBs in yeast, and loss of either component leads to significant defects in both checkpoint activation and DSB repair (Downs et al., 2004; Papamichos-Chronakis et al., 2006; van Attikum et al., 2007). Members of the Ino80 family, including the mammalian p400 ATPase, can exchange histone H2A for the H2A variant H2A.Z (Fuchs et al., 2001; Gévry et al., 2007; Kusch et al., 2004), suggesting that Ino80 family members may regulate H2A.Z exchange during DSB repair. Indeed, in yeast, loss of H2A.Z leads to increased sensitivity to DNA-damaging agents (Morillo-Huesca et al., 2010; Papamichos-Chronakis et al., 2011) and defective repair of DSBs (Kalocsay et al., 2009). Although a transient increase in H2A.Z deposition at DSBs in yeast has been reported (Kalocsay et al., 2009), other studies suggest that Ino80 and Swr1 may function antagonistically to regulate or maintain H2A.Z at DSBs (Papamichos-Chronakis et al., 2006; van Attikum et al., 2007), with no overall increase in H2A.Z exchange at DSBs in yeast (van Attikum et al., 2007).

However, in mammalian cells, the hNuA4 complex promotes not only H4 acetylation by the Tip60 subunit but also the rapid exchange of H2A for H2A.Z at DSBs (Figure 2) (Xu et al., 2012). H2A.Z exchange requires the ATPase activity of the p400 motor protein and creates chromatin domains containing H2A.Z nucleosomes that extend away from the DSB. Surprisingly, H2A.Z exchange precedes, and is required for, both the acetylation of histone H4 by Tip60 and the creation of open chromatin domains at DSBs (Downs et al., 2004; Murr et al., 2007; Xu et al., 2010). The exchange of H2A.Z onto nucleosomes at DSBs leads to an increase in the salt solubility of the histones (Xu et al., 2012), indicating the formation of open chromatin at the site of damage.
This is consistent with published work demonstrating that H2A.Z nucleosomes are less stable than H2A nucleosomes and are more sensitive to extraction at low-salt concentrations (Henikoff et al., 2009; Jin and Felsenfeld, 2007; Weber et al., 2010; Zhang et al., 2005). However, other studies have shown that H2A.Z stabilizes nucleosomes (Fan et al., 2004; Park et al., 2004). These opposing effects of H2A.Z on nucleosome structure have been extensively reviewed by others (Billon and Côté, 2012; Zlatanova and Thakar, 2008). It has been noted, however, that the ability of H2A.Z to reduce nucleosome stability is dependent on both histone modifications and the presence of additional histone variants, including histone H3.3, on the nucleosome (Henikoff et al., 2009; Jin and Felsenfeld, 2007; Jin et al., 2009). The ability of H2A.Z to destabilize nucleosomes at DSBs may therefore depend on both the presence of additional histone variants (such as H3.3) and histone posttranslational modifications on nucleosomes. Consistent with this, the creation of open chromatin structures at DSBs requires both the presence of H2A.Z and acetylation of histone H4 tails by the Tip60 acetyltransferase (Xu et al., 2012) (Figure 2). That is, H2A.Z appears to be capable of destabilizing nucleosomes at DSBs only in the context of an acetylated H4 tail.

How the presence of H2A.Z promotes the acetylation of the N-terminal tail of H4 by Tip60 is less clear. Nucleosomes containing H2A.Z exhibit only subtle differences in structure from H2A nucleosomes (Suto et al., 2000). The N-terminal tail of histone H4 interacts with an acidic patch on the surface of the nucleosome and promotes packing into 30 nm fibers (Robinson et al., 2008; Shogren-Knaak et al., 2006). In H2A.Z, this acidic patch is extended in length, and it has been proposed that this extended acidic region stabilizes the interaction between H2A.Z and H4, promoting packing of nucleosome fibers (Fan et al., 2004). This would tend to restrict the ability of Tip60 to acetylate the N-terminal tail of H4. However, as discussed above, the ability of H2A.Z to impact chromatin organization can be modulated by the presence of histone H3.3 or by additional histone modifications within the nucleosome (Jin and Felsenfeld, 2007; Jin et al., 2009; Zlatanova and Thakar, 2008) (Figure 2).
H2A.Z exchange may therefore be only part of the equation, with the potential for exchange of H3.3, specific acetylation of H2A.Z, or additional remodeling motor ATPases contributing to acetylation of histone H4 in response to DSBs. Unraveling these early events will provide new insight into H2A.Z-mediated shifts in chromatin structure at the DSB.

Figure 3. H2A.Z Exchange Drives Chromatin Changes that Direct Chromatin Modification at DSBs
H2A.Z exchange promotes H4 acetylation by Tip60, which in turn directs ubiquitination of the chromatin by the RNF8/RNF168 ubiquitin ligases. 53BP1 is then recruited to chromatin through interaction with H4K20me2. 53BP1 may utilize pre-existing H4K20me2 or require de novo methylation by MMSET. Whether ubiquitination promotes access to H4K20me2 is not yet known. Association of NuA4-Tip60 with mdc1 omitted for clarity. P = phosphorylation, Ac = H4 acetylation, Ub = ubiquitination, Me = H4K20me2.
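
The ordering summarized in Figure 3 can be read as a dependency chain, in which each chromatin event requires one or more earlier events. The short Python sketch below is only an illustration of that reading and is not part of the original analysis: the event names, the `REQUIRES` map, and the functions `recruitment_order` and `blocked_events` are hypothetical, and the sketch assumes that every listed requirement is absolute, which the biology does not guarantee (for example, 53BP1 recruitment is maintained in mice lacking the suv4-20h methyltransferases).

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Illustrative assumption: each event maps to the events it requires.
# Edges are a simplified reading of Figure 3 and the accompanying text,
# not a complete or quantitative model of DSB repair.
REQUIRES = {
    "gammaH2AX": set(),
    "mdc1 binding": {"gammaH2AX"},
    "hNuA4 loading": {"mdc1 binding"},
    "H2A.Z exchange": {"hNuA4 loading"},
    "H4 acetylation": {"H2A.Z exchange"},
    "H2A/H2AX ubiquitination": {"mdc1 binding", "H4 acetylation"},
    "brca1 loading": {"H2A/H2AX ubiquitination"},
    "53BP1 loading": {"H4 acetylation", "H2A/H2AX ubiquitination"},
}


def recruitment_order(requires):
    """Return one ordering in which every event follows its requirements."""
    return list(TopologicalSorter(requires).static_order())


def blocked_events(requires, inactivated):
    """Return the inactivated events plus everything downstream of them."""
    blocked = set(inactivated)
    changed = True
    while changed:  # propagate until no further event becomes blocked
        changed = False
        for event, needs in requires.items():
            if event not in blocked and needs & blocked:
                blocked.add(event)
                changed = True
    return blocked


if __name__ == "__main__":
    print(recruitment_order(REQUIRES))
    print(sorted(blocked_events(REQUIRES, {"H2A.Z exchange"})))
```

In this toy model, inactivating "H2A.Z exchange" blocks H4 acetylation, chromatin ubiquitination, and loading of brca1 and 53BP1, while leaving γH2AX formation and mdc1 binding intact, which mirrors the qualitative pattern reported when hNuA4 components are inactivated.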

Rapid Chromatin Remodeling Promotes Ordered Chromatin Modification
The NuA4-driven changes in chromatin organization (Figure 2) have a significant impact on the mechanism of DSB repair. In particular, the formation of open chromatin domains through H2A.Z exchange and H4 acetylation facilitates further DNA-damage-dependent modification of the chromatin by both ubiquitination and methylation of histone H4 (Figure 3). Inactivation of components of hNuA4, including p400, Tip60, or Trrap, blocks the ubiquitination of histone H2A/H2AX by RNF8/RNF168 and inhibits the subsequent loading of several effector proteins, including brca1, 53BP1, and rad51, onto chromatin (Figure 3) (Courilleau et al., 2012; Murr et al., 2006; Xu et al., 2010, 2012). Brca1 recruitment requires interaction between the RAP80 subunit of the brca1 complex and ubiquitinated chromatin at DSBs (Sobhian et al., 2007). The NuA4-dependent shift in chromatin structure at DSBs may therefore reveal cryptic sites for H2A/H2AX ubiquitination by RNF8/RNF168 and drive loading of brca1.

The recruitment of 53BP1, a DNA-repair protein that regulates NHEJ (Bunting et al., 2010), is complex and can also be regulated by RNF8/RNF168-mediated chromatin ubiquitination (Doil et al., 2009; Huen et al., 2007). However, 53BP1 does not possess an identifiable ubiquitin-binding motif. It has also been shown that 53BP1 recruitment to DSBs requires H4 acetylation (Murr et al., 2006; Xu et al., 2010) and H4K20 methylation (Botuyan et al., 2006). In fact, 53BP1's tudor domain can bind to histone H4 dimethylated on lysine 20 (H4K20me2) (Botuyan et al., 2006). Because a significant fraction (>80%) of H4K20 is dimethylated in mammalian cells, the increased acetylation of histone H4 at DSBs may function to both unpack closely apposed chromatin fibers and reveal H4K20me2 for 53BP1 binding. Also, H2A/H2AX ubiquitination by RNF8 and RNF168 may further promote 53BP1 loading by altering the accessibility of 53BP1 to H4K20me2 (Figure 3). Interestingly, mice lacking both of the suv4-20h H4K20me2 methyltransferases have almost no H4K20me2 and display increased genomic instability yet maintain normal recruitment of 53BP1 to DSBs (Schotta et al., 2008). Although this may suggest that H4K20me2 is dispensable for 53BP1 recruitment to DSBs, it has recently been reported that the methyltransferase MMSET is recruited to DSBs and promotes the formation of H4K20me2 (Pei et al., 2011). Recruitment of MMSET may provide the mechanism for methylation of the small fraction of H4K20 that is not constitutively methylated and may partially compensate for loss of constitutive H4K20me2 in the suv4-20h1/suv4-20h2 double-knockout mice. In fact, 53BP1 has been reported to promote long-range interactions between DNA ends (Difilippantonio et al., 2008), suggesting that 53BP1 binding may itself play a role in regulating or stabilizing chromatin structure after DNA damage (Noon et al., 2010).
Thus the initial change in nucleosome function imposed by H2A.Z exchange promotes an ordered series of histone modifications, including acetylation of histone H4 and ubiquitination of the chromatin (Figure 3). This may then either unmask H4K20me2 buried within the nucleosome structure and/or promote H4K20 methylation by MMSET and thereby facilitate loading of both 53BP1 and brca1 complexes onto the chromatin. The early remodeling events therefore play a critical role in directing the ordered recruitment of DSB-repair proteins to the site of damage.

Impact of H2A.Z on DSB Repair
Cells lacking H2A.Z or components of NuA4 are hypersensitive to IR and have defects in both NHEJ- and HR-directed repair (Downs et al., 2004; Ikura et al., 2000; Murr et al., 2006; Xu et al., 2010, 2012). This wide range of defects reflects the early and critical role of hNuA4 in promoting access to sites of damage and reflects both the failure to create open chromatin structures and the lack of recruitment of brca1, which is essential for HR-mediated DSB repair. Intriguingly, when H2A.Z exchange at DSBs is inhibited, cells undergo unrestricted end resection, leading to accumulation of excess ssDNA and the loss of Ku70/80 binding (Xu et al., 2012). Further, this defect can be reversed by depletion of CtIP, suggesting that H2A.Z exchange functions to restrain or restrict the ability of the CtIP-MRN nuclease complex to initiate end resection of the DSB. In yeast, loss of the ySwr1 ATPase also leads to defects in Ku70 recruitment and defects in error-free NHEJ (van Attikum et al., 2007), although this is not directly linked to H2A.Z exchange.

Recent work on the role of H2A.Z at transcriptional start sites (TSS) provides some potential insight into how H2A.Z may restrict end resection. The TSS of many genes are flanked by H2A.Z nucleosomes (Jin et al., 2009; Zhang et al., 2005), which may function to fix the positions of nucleosomes on either side of the TSS and thereby maintain nucleosome-free DNA for transcription-factor binding. Nucleosomes are also lost at DSBs, creating nucleosome-free regions (Tsukuda et al., 2005). The placement of H2A.Z nucleosomes on either side of nucleosome-free regions at the DSB therefore creates a structure similar to that reported at the TSS of genes. Positioning of H2A.Z on either side of the DSB may therefore define the limits of the nucleosome-free region and create a chromatin template that restricts or limits end resection by the CtIP-MRN complex. The early remodeling of the chromatin at DSBs through H2A.Z exchange and H4 acetylation is therefore critical for setting the scene for further processing and eventual repair of the DSB through either NHEJ or HR pathways.

Accessing DSBs in Heterochromatin
How cells access and repair DSBs within the higher-order chromatin environment of heterochromatin has been the subject of recent studies. Heterochromatin is classically described as condensed, densely staining regions of DNA that contain few active genes but are enriched for repetitive sequences. Mammalian heterochromatin is characterized by high levels of the histone modifications H3K9me3 and H3K27me3 and low levels of histone acetylation. Heterochromatin is maintained by a dense array of specific chromatin-binding proteins, including members of the HP1 family (which bind to methylated H3K9), kap-1, histone deacetylases (HDACs), and histone methyltransferases.
From the perspective of DSB repair, it is important to determine whether the dense packing and unique array of heterochromatin-binding proteins present a specific barrier to the DSB-repair machinery. Further, the presence of repetitive DNA within heterochromatin may provide a significant challenge for HR-mediated repair, requiring more stringent control of HR to prevent inappropriate recombination events.

kap-1 is a repressor protein that interacts with HP1, HDACs, and histone methyltransferases and functions to maintain heterochromatin (Iyengar and Farnham, 2011). In response to DSBs, kap-1 is phosphorylated by ATM (Goodarzi et al., 2008; Ziv et al., 2006), promoting a general relaxation of the chromatin structure. Repair of DSBs (as measured by loss of γH2AX foci) is significantly slower within heterochromatin regions and is dependent on phosphorylation of kap-1 by ATM. Further, kap-1 phosphorylation promotes release of the CHD3-remodeling ATPase from heterochromatin (Goodarzi et al., 2011), a process required for efficient repair. It is currently unclear how loss of CHD3 or phosphorylation of kap-1 (which remains associated with the DSB regions) impacts overall chromatin structure at DSBs. In addition to kap-1 phosphorylation, HP1 proteins (including HP1α, β, and γ) can repress heterochromatin repair. Depletion of HP1 proteins (or depletion of the H3K9 methyltransferases) can decondense heterochromatin and promote repair of DSBs even in the absence of ATM kinase activity (Chiolo et al., 2011; Goodarzi et al., 2008, 2011). Further, there is some evidence to suggest that HP1 proteins are actively ejected from the chromatin during DNA repair (Ayoub et al., 2008). These observations are consistent with the idea that the dense packing of nucleosomes and the presence of specific heterochromatin-binding complexes are a significant barrier to repair of heterochromatic DSBs. Further, these results indicate a critical role for phosphorylation of kap-1 by the ATM kinase in promoting the unpacking of heterochromatin and thereby facilitating repair of heterochromatic DSBs. Currently, it is unclear whether, for example, the NuA4-Tip60 complex acetylates histones at heterochromatic DSBs or whether the phosphorylation of kap-1 within heterochromatin is sufficient to create the required open chromatin structure. Further, given that H2A.Z is found at heterochromatin boundaries, it will be interesting to determine whether this histone variant is important for heterochromatic DSB repair as well.

Spacing of H2AX Nucleosomes and Heterochromatin
Studies on DSB repair in heterochromatin utilize microscopy to monitor the appearance of γH2AX foci and either DAPI (to detect dense chromatin domains) or antibodies to locate regions of heterochromatin (Chiolo et al., 2011; Goodarzi et al., 2008; Noon et al., 2010). Several studies indicate that γH2AX foci preferentially assemble in euchromatin or are predominantly located at the boundary of the heterochromatin (Goodarzi et al., 2008; Kim et al., 2007; Noon et al., 2010). However, studies with enzymatically generated DSBs coupled with chromatin immunoprecipitation indicate that γH2AX does not spread uniformly along the chromosome (Iacovoni et al., 2010; Meier et al., 2007; Savic et al., 2009), and the size of the γH2AX domain varies between different chromatin locations (Xu et al., 2012).
Further, in yeast, γH2AX does not spread through heterochromatin regions (Kim et al., 2007). H2AX is unique compared to other DSB-repair proteins because it is prepositioned on nucleosomes rather than recruited to DSBs. To function as a DSB detector, and to allow for γH2AX propagation along the chromatin, it would be expected that H2AX should be evenly deposited along the chromatin. However, the amount of H2AX in cells can vary from 2% to 20% of the total H2A (Rogakou et al., 1998). That is, because each nucleosome contains two copies of H2A, in some cells about 1 in 2.5 nucleosomes contain H2AX, whereas in other cells, as few as 1 in 25 nucleosomes may contain H2AX. In fact, high-resolution microscopy indicates that H2AX is concentrated in specific domains (Bewersdorf et al., 2006), and chromatin immunoprecipitation combined with sequencing (ChIP-Seq) analysis indicates that H2AX is concentrated within gene-rich regions (Iacovoni et al., 2010). This raises the possibility that H2AX density or distribution within heterochromatin is significantly lower than in other domains. The failure to detect γH2AX foci in heterochromatin with microscopy may therefore reflect altered H2AX distribution in heterochromatin and a reduced need for H2AX function in heterochromatin.

In addition to differential H2AX distribution in heterochromatin, recent work in Drosophila has provided an alternative explanation for why γH2AX foci are only detected at the periphery of the heterochromatin. This work demonstrates that phosphorylation of H2AX and initial recruitment of DSB-repair proteins to the break occur normally within the heterochromatin. However, these heterochromatic DSBs rapidly migrate out of the heterochromatin; hence the actual DSB repair is carried out within euchromatin (Chiolo et al., 2011; Jakob et al., 2011). Further, this relocation of the DSB is only partly dependent on ATM, indicating that phosphorylation of kap-1 by ATM does not contribute to this process. Moving the DSB out of the heterochromatin may limit recombination with repetitive sequences and allow increased mobility and easier access to the DSB. However, it should be noted that experiments in mammalian cells have indicated only limited mobility for DSBs, so it will be important to explore DSB mobility in the heterochromatin of mammalian cells (Krawczyk et al., 2012; Soutoglou et al., 2007). Finally, it is interesting to note that, in yeast, exchange of H2A.Z into the chromatin is required for relocalization of persistent DSBs to the nuclear periphery (Kalocsay et al., 2009). The NuA4-mediated exchange of H2A.Z at heterochromatin DSBs (Figure 2) may potentially promote relocation of DSBs out of the heterochromatin.

Clearly, our understanding of the mechanism of DSB repair within heterochromatin is limited. Developing new approaches, such as coupling synthetic nucleases to create DSBs in heterochromatin with ChIP-Seq approaches, may provide a more directed approach to understanding DSB repair within specific chromatin domains.

Early Recruitment Events: HP1

It is now clear that additional chromatin-based events occur prior to the NuA4-mediated chromatin relaxation. In particular, two heterochromatin-associated proteins, HP1 and kap-1, participate in the early response to DSBs in euchromatin. HP1α and kap-1 are rapidly recruited to DSBs within seconds to minutes after damage induction (Baldeyron et al., 2011; Luijsterburg et al., 2009; reviewed in Soria et al., 2012).
The recruitment of HP1α and kap-1 is essential for loading 53BP1 and brca1 and for HR-directed repair. Kap-1 and HP1 proteins may be recruited to DSBs as a single complex, although HP1α loading requires the histone chaperone ASF1 (Baldeyron et al., 2011). Importantly, HP1 and kap-1 recruitment to euchromatin is transient, with both proteins dissociating from the break a few minutes after damage induction (Baldeyron et al., 2011). It is currently unclear if HP1 and kap-1 have distinct roles in heterochromatin and euchromatin during DSB repair, and why transient recruitment and release of HP1 is important remains to be investigated. One potential explanation is that kap-1 exists as a complex with repressive factors including HDACs and H3K9 methyltransferases (Iyengar and Farnham, 2011). Recruitment of repressive kap-1 complexes may rapidly heterochromatinize the DSB region, preventing transcription and stabilizing the chromatin structure. Further, since the Tip60 subunit of NuA4 requires interaction with H3K9me3 for stimulation of its acetyltransferase activity (Sun et al., 2009), recruitment of kap-1/HP1 complexes may provide a mechanism for the rapid methylation of H3K9 and therefore facilitate the activity of both Tip60 and the NuA4 complex. The transient accumulation of kap-1 and HP1 complexes may rewrite local histone modification signatures, thereby increasing available H3K9me3 and promoting the activity of the Tip60 subunit of NuA4 and other factors.

Figure 4. Creating Access to DSBs


Proposed chronological sequence of steps in remodeling of a DSB. Initial PARylation by PARP1 leads to rapid recruitment of NuRD and ALC1 (through interaction with PAR) and kap-1/HP1 complexes (possibly through interaction with PAR). Deacetylation of histones (including H2A, H3, and H4) by NuRD and proposed H3K9 methylation (by HP1/kap-1-associated lysine methyltransferases [KMTs] including suv39h1 and G9a) create a temporary repressive chromatin structure with low histone acetylation and high density of H3K9me3. Subsequently, the HP1/kap-1, ALC1, and NuRD complexes are rapidly released from the chromatin, potentially through dePARylation by PARG. Phosphorylation of H2AX (γH2AX) then recruits NuA4-Tip60, promoting the ordered remodeling of the chromatin through H2A.Z exchange, acetylation of histone H4 (H4Ac), chromatin ubiquitination, and modulation of H4K20me2. This creates a common chromatin template for DSB repair by either NHEJ- or HR-mediated repair.

Early Recruitment of NuRD and ALC1 Complexes through PARylation

Similarly to recruitment of kap-1/HP1, there is also a rapid and transient accumulation of the NuRD (Chou et al., 2010; Larsen et al., 2010; Polo et al., 2010; Smeenk et al., 2010) and ALC1 (Ahel et al., 2009) remodeling complexes at DSBs (Figure 4). NuRD complexes contain either the CHD3 or CHD4 ATPase, HDAC1 or HDAC2, and associated regulatory subunits (Clapier and Cairns, 2009). NuRD is a repressive complex that maintains higher-order chromatin structure. Inactivation of NuRD or ALC1 leads to defects in DSB repair and increased sensitivity to DNA damage (Ahel et al., 2009; Chou et al., 2010; Polo et al., 2010; Smeenk et al., 2010). First, NuRD regulates the acetylation of p53 and thereby controls the extent of G1-S arrest following DNA damage (Larsen et al., 2010; Polo et al., 2010). Second, NuRD, like NuA4, is required for chromatin ubiquitination by RNF8/RNF168 and for loading of brca1 (Larsen et al., 2010; Smeenk et al., 2010).

The recruitment of NuRD complexes to DSBs requires PARylation of the chromatin by PARP1 (Chou et al., 2010; Polo et al., 2010). PARP1 belongs to a family of PARPs that play a central role in both transcription and DNA repair (Gibson and Kraus, 2012). Chromatin at DSBs is rapidly and transiently PARylated (Figure 4), and it is this modification, rather than γH2AX or ATM signaling, that localizes NuRD at the DSB (Chou et al., 2010; Polo et al., 2010). Similarly, ALC1, a remodeling ATPase that functions to reposition nucleosomes on the chromatin, is also rapidly recruited to DSBs through direct interaction with PAR chains on the chromatin (Ahel et al., 2009; Gottschalk et al., 2009). ALC1 loading is rapid and transient after DNA damage and may favor the formation of open chromatin (Ahel et al., 2009).

Thus, at least three remodeling complexes, HP1/kap-1, NuRD, and ALC1, are rapidly, but transiently, recruited to DSBs (Figure 4). Because PARylation of the chromatin is transient yet independent of γH2AX formation, the recruitment of HP1/kap-1, NuRD, and ALC1 likely precedes the recruitment and loading of the NuA4-Tip60 complex (Figure 4). However, whether these complexes work sequentially or in parallel is not yet known. For example, whether the recruitment of NuA4-Tip60 or H2A.Z exchange requires prior processing of the chromatin by either ALC1 or NuRD or is dependent on chromatin PARylation is not known. Further, it remains to be seen whether the HP1/kap-1 complex is recruited to DSBs through PARylation or some other mechanism.
Finally, the rapid release of ALC1, NuRD, and HP1/kap-1 complexes may be brought about by dePARylation of the chromatin by polyADP-ribose glycohydrolases (PARGs) (Figure 4). Understanding the regulation of PARGs may provide new insight into some of the earliest events occurring during DSB repair.

The HP1/kap-1, ALC1, and NuRD complexes deploy a wide range of chromatin-remodeling activities, including HDACs (NuRD), methyltransferases (HP1/kap-1), and remodeling ATPase activities (NuRD and ALC1) at the DSB. Because these complexes are only retained at the DSBs for a short time period (minutes), they must play a critical role in the initial detection and processing of the chromatin at the DSB. This role could include the rapid termination of local transcription by promoting histone deacetylation (NuRD) and/or the formation of repressive chromatin through histone methylation and loading of kap-1/HP1 complexes. By erasing previous histone acetylation marks, NuRD and the other complexes may prime the chromatin for uniform acetylation by the NuA4-Tip60 complex. Further, ALC1 may function to reposition nucleosomes at the DSB and to stabilize the chromatin and facilitate further processing and repair. These events may rapidly and transiently stabilize the local chromatin structure by creating a temporary, compacted, repressive chromatin environment at the DSB. Subsequently, DSB signaling, including γH2AX formation and ATM activation, leads to the ordered recruitment of DSB-repair proteins to the chromatin at DSBs. The transient creation of PAR chains at DSBs by PARP1, which allows the rapid recruitment of NuRD, ALC1, and potentially kap-1/HP1, is therefore a critical early event in the DNA-damage response.

Conclusions and Future Directions

A eukaryotic cell must integrate classical DSB-repair signaling and repair by NHEJ and HR pathways with the complexity of the local chromatin architecture. Functional chromatin domains, such as replication forks, genes, or heterochromatin, differ significantly in the patterns of histone modifications, the types of chromatin-binding proteins, and the degree of nucleosome packing. Each of these domains may therefore require unique chromatin-remodeling complexes to alter the local chromatin architecture at individual DSBs. Identifying the protein-remodeling complexes that are essential for repair in specific chromatin structures is therefore of key importance. Such processes may be critical for reshaping the local chromatin structure and for creating a common DNA template that can be presented to the DSB-repair machinery. It is clear that some of the earliest events in DSB repair occurring in the first few minutes after damage can have a profound impact on processing of the damaged chromatin template. However, in addition to these early events, there are many additional steps in DSB repair that require chromatin remodeling, such as homology searching during HR-directed repair or regulation of end resection during repair. In addition, resetting the chromatin structure and restoring the original epigenetic code to the repaired chromatin are vital to ensure that normal functionality is restored to the damaged chromatin.
ACKNOWLEDGMENTS

This work was supported by NIH grants CA64585 and CA93602 to B.D.P. and grant RO1-DK43889 to A.D.D.

REFERENCES

Ahel, D., Horejší, Z., Wiechens, N., Polo, S.E., Garcia-Wilson, E., Ahel, I., Flynn, H., Skehel, M., West, S.C., Jackson, S.P., et al. (2009). Poly(ADP-ribose)-dependent regulation of DNA repair by the chromatin remodeling enzyme ALC1. Science 325, 1240-1243.
Ayoub, N., Jeyasekharan, A.D., Bernal, J.A., and Venkitaraman, A.R. (2008). HP1-beta mobilization promotes chromatin changes that initiate the DNA damage response. Nature 453, 682-686.
Baldeyron, C., Soria, G., Roche, D., Cook, A.J., and Almouzni, G. (2011). HP1alpha recruitment to DNA damage by p150CAF-1 promotes homologous recombination repair. J. Cell Biol. 193, 81-95.
Bewersdorf, J., Bennett, B.T., and Knight, K.L. (2006). H2AX chromatin structures and their response to DNA damage revealed by 4Pi microscopy. Proc. Natl. Acad. Sci. USA 103, 18137-18142.
Billon, P., and Côté, J. (2012). Precise deposition of histone H2A.Z in chromatin for genome expression and maintenance. Biochim. Biophys. Acta 1819, 290-302.
Bird, A.W., Yu, D.Y., Pray-Grant, M.G., Qiu, Q., Harmon, K.E., Megee, P.C., Grant, P.A., Smith, M.M., and Christman, M.F. (2002). Acetylation of histone H4 by Esa1 is required for DNA double-strand break repair. Nature 419, 411-415.
Bonner, W.M., Redon, C.E., Dickey, J.S., Nakamura, A.J., Sedelnikova, O.A., Solier, S., and Pommier, Y. (2008). GammaH2AX and cancer. Nat. Rev. Cancer 8, 957-967.
Bothmer, A., Robbiani, D.F., Feldhahn, N., Gazumyan, A., Nussenzweig, A., and Nussenzweig, M.C. (2010). 53BP1 regulates DNA resection and the choice between classical and alternative end joining during class switch recombination. J. Exp. Med. 207, 855-865.
Botuyan, M.V., Lee, J., Ward, I.M., Kim, J.E., Thompson, J.R., Chen, J., and Mer, G. (2006). Structural basis for the methylation state-specific recognition of histone H4-K20 by 53BP1 and Crb2 in DNA repair. Cell 127, 1361-1373.
Bunting, S.F., Callén, E., Wong, N., Chen, H.T., Polato, F., Gunn, A., Bothmer, A., Feldhahn, N., Fernandez-Capetillo, O., Cao, L., et al. (2010). 53BP1 inhibits homologous recombination in Brca1-deficient cells by blocking resection of DNA breaks. Cell 141, 243-254.
Cairns, B.R. (2005). Chromatin remodeling complexes: strength in diversity, precision through specialization. Curr. Opin. Genet. Dev. 15, 185-190.
Campos, E.I., and Reinberg, D. (2009). Histones: annotating chromatin. Annu. Rev. Genet. 43, 559-599.
Chapman, J.R., and Jackson, S.P. (2008). Phospho-dependent interactions between NBS1 and MDC1 mediate chromatin retention of the MRN complex at sites of DNA damage. EMBO Rep. 9, 795-801.
Chen, X., Chen, Z., Chen, H., Su, Z., Yang, J., Lin, F., Shi, S., and He, X. (2012). Nucleosomes suppress spontaneous mutations base-specifically in eukaryotes. Science 335, 1235-1238.
Chiolo, I., Minoda, A., Colmenares, S.U., Polyzos, A., Costes, S.V., and Karpen, G.H. (2011). Double-strand breaks in heterochromatin move outside of a dynamic HP1a domain to complete recombinational repair. Cell 144, 732-744.
Chou, D.M., Adamson, B., Dephoure, N.E., Tan, X., Nottke, A.C., Hurov, K.E., Gygi, S.P., Colaiácovo, M.P., and Elledge, S.J. (2010). A chromatin localization screen reveals poly (ADP ribose)-regulated recruitment of the repressive polycomb and NuRD complexes to sites of DNA damage. Proc. Natl. Acad. Sci. USA 107, 18475-18480.
Ciccia, A., and Elledge, S.J. (2010). The DNA damage response: making it safe to play with knives. Mol. Cell 40, 179-204.
Clapier, C.R., and Cairns, B.R. (2009). The biology of chromatin remodeling complexes. Annu. Rev. Biochem. 78, 273-304.
Courilleau, C., Chailleux, C., Jauneau, A., Grimal, F., Briois, S., Boutet-Robinet, E., Boudsocq, F., Trouche, D., and Canitrot, Y. (2012). The chromatin remodeler p400 ATPase facilitates Rad51-mediated repair of DNA double-strand breaks. J. Cell Biol. 199, 1067-1081.
de Wit, E., and van Steensel, B. (2009). Chromatin domains in higher eukaryotes: insights from genome-wide mapping studies. Chromosoma 118, 25-36.
Difilippantonio, S., Gapud, E., Wong, N., Huang, C.Y., Mahowald, G., Chen, H.T., Kruhlak, M.J., Callen, E., Livak, F., Nussenzweig, M.C., et al. (2008). 53BP1 facilitates long-range DNA end-joining during V(D)J recombination. Nature 456, 529-533.
Doil, C., Mailand, N., Bekker-Jensen, S., Menard, P., Larsen, D.H., Pepperkok, R., Ellenberg, J., Panier, S., Durocher, D., Bartek, J., et al. (2009). RNF168 binds and amplifies ubiquitin conjugates on damaged chromosomes to allow accumulation of repair proteins. Cell 136, 435-446.
Downs, J.A., Allard, S., Jobin-Robitaille, O., Javaheri, A., Auger, A., Bouchard, N., Kron, S.J., Jackson, S.P., and Côté, J. (2004). Binding of chromatin-modifying activities to phosphorylated histone H2A at DNA damage sites. Mol. Cell 16, 979-990.
Doyon, Y., and Côté, J. (2004). The highly conserved and multifunctional NuA4 HAT complex. Curr. Opin. Genet. Dev. 14, 147-154.
Fan, J.Y., Rangasamy, D., Luger, K., and Tremethick, D.J. (2004). H2A.Z alters the nucleosome surface to promote HP1alpha-mediated chromatin fiber folding. Mol. Cell 16, 655-661.
Fuchs, M., Gerber, J., Drapkin, R., Sif, S., Ikura, T., Ogryzko, V., Lane, W.S., Nakatani, Y., and Livingston, D.M. (2001). The p400 complex is an essential E1A transformation target. Cell 106, 297-307.
Gévry, N., Chan, H.M., Laflamme, L., Livingston, D.M., and Gaudreau, L. (2007). p21 transcription is regulated by differential localization of histone H2A.Z. Genes Dev. 21, 1869-1881.

Gibson, B.A., and Kraus, W.L. (2012). New insights into the molecular and cellular functions of poly(ADP-ribose) and PARPs. Nat. Rev. Mol. Cell Biol. 13, 411-424.
Goodarzi, A.A., Noon, A.T., Deckbar, D., Ziv, Y., Shiloh, Y., Löbrich, M., and Jeggo, P.A. (2008). ATM signaling facilitates repair of DNA double-strand breaks associated with heterochromatin. Mol. Cell 31, 167-177.
Goodarzi, A.A., Kurka, T., and Jeggo, P.A. (2011). KAP-1 phosphorylation regulates CHD3 nucleosome remodeling during the DNA double-strand break response. Nat. Struct. Mol. Biol. 18, 831-839.
Gorrini, C., Squatrito, M., Luise, C., Syed, N., Perna, D., Wark, L., Martinato, F., Sardella, D., Verrecchia, A., Bennett, S., et al. (2007). Tip60 is a haplo-insufficient tumour suppressor required for an oncogene-induced DNA damage response. Nature 448, 1063-1067.
Gottschalk, A.J., Timinszky, G., Kong, S.E., Jin, J., Cai, Y., Swanson, S.K., Washburn, M.P., Florens, L., Ladurner, A.G., Conaway, J.W., and Conaway, R.C. (2009). Poly(ADP-ribosyl)ation directs recruitment and activation of an ATP-dependent chromatin remodeler. Proc. Natl. Acad. Sci. USA 106, 13770-13774.
Grewal, S.I., and Jia, S. (2007). Heterochromatin revisited. Nat. Rev. Genet. 8, 35-46.
Gudjonsson, T., Altmeyer, M., Savic, V., Toledo, L., Dinant, C., Grofte, M., Bartkova, J., Poulsen, M., Oka, Y., Bekker-Jensen, S., et al. (2012). TRIP12 and UBR5 suppress spreading of chromatin ubiquitylation at damaged chromosomes. Cell 150, 697-709.
Henikoff, S., Henikoff, J.G., Sakai, A., Loeb, G.B., and Ahmad, K. (2009). Genome-wide profiling of salt fractions maps physical properties of chromatin. Genome Res. 19, 460-469.
Huen, M.S., Grant, R., Manke, I., Minn, K., Yu, X., Yaffe, M.B., and Chen, J. (2007). RNF8 transduces the DNA-damage signal via histone ubiquitylation and checkpoint protein assembly. Cell 131, 901-914.
Huertas, P. (2010). DNA resection in eukaryotes: deciding how to fix the break. Nat. Struct. Mol. Biol. 17, 11-16.
Iacovoni, J.S., Caron, P., Lassadi, I., Nicolas, E., Massip, L., Trouche, D., and Legube, G. (2010). High-resolution profiling of gammaH2AX around DNA double strand breaks in the mammalian genome. EMBO J. 29, 1446-1457.
Ikura, T., Ogryzko, V.V., Grigoriev, M., Groisman, R., Wang, J., Horikoshi, M., Scully, R., Qin, J., and Nakatani, Y. (2000). Involvement of the TIP60 histone acetylase complex in DNA repair and apoptosis. Cell 102, 463-473.
Ikura, T., Tashiro, S., Kakino, A., Shima, H., Jacob, N., Amunugama, R., Yoder, K., Izumi, S., Kuraoka, I., Tanaka, K., et al. (2007). DNA damage-dependent acetylation and ubiquitination of H2AX enhances chromatin dynamics. Mol. Cell. Biol. 27, 7028-7040.
Iyengar, S., and Farnham, P.J. (2011). KAP1 protein: an enigmatic master regulator of the genome. J. Biol. Chem. 286, 26267-26276.
Jackson, S.P., and Bartek, J. (2009). The DNA-damage response in human biology and disease. Nature 461, 1071-1078.
Jakob, B., Splinter, J., Conrad, S., Voss, K.O., Zink, D., Durante, M., Löbrich, M., and Taucher-Scholz, G. (2011). DNA double-strand breaks in heterochromatin elicit fast repair protein recruitment, histone H2AX phosphorylation and relocation to euchromatin. Nucleic Acids Res. 39, 6489-6499.
Jha, S., Shibata, E., and Dutta, A. (2008). Human Rvb1/Tip49 is required for the histone acetyltransferase activity of Tip60/NuA4 and for the downregulation of phosphorylation on H2AX after DNA damage. Mol. Cell. Biol. 28, 2690-2700.
Jin, C., and Felsenfeld, G. (2007). Nucleosome stability mediated by histone variants H3.3 and H2A.Z. Genes Dev. 21, 1519-1529.
Jin, C., Zang, C., Wei, G., Cui, K., Peng, W., Zhao, K., and Felsenfeld, G. (2009). H3.3/H2A.Z double variant-containing nucleosomes mark nucleosome-free regions of active promoters and other regulatory regions. Nat. Genet. 41, 941-945.
Kalocsay, M., Hiller, N.J., and Jentsch, S. (2009). Chromosome-wide Rad51 spreading and SUMO-H2A.Z-dependent chromosome fixation in response to a persistent DNA double-strand break. Mol. Cell 33, 335-343.


Kennedy, R.D., and D'Andrea, A.D. (2006). DNA repair pathways in clinical practice: lessons from pediatric cancer susceptibility syndromes. J. Clin. Oncol. 24, 3799-3808.
Kim, J.A., Kruhlak, M., Dotiwala, F., Nussenzweig, A., and Haber, J.E. (2007). Heterochromatin is refractory to gamma-H2AX modification in yeast and mammals. J. Cell Biol. 178, 209-218.
Kolas, N.K., Chapman, J.R., Nakada, S., Ylanko, J., Chahwan, R., Sweeney, F.D., Panier, S., Mendez, M., Wildenhain, J., Thomson, T.M., et al. (2007). Orchestration of the DNA-damage response by the RNF8 ubiquitin ligase. Science 318, 1637-1640.
Krawczyk, P.M., Borovski, T., Stap, J., Cijsouw, T., ten Cate, R., Medema, J.P., Kanaar, R., Franken, N.A., and Aten, J.A. (2012). Chromatin mobility is increased at sites of DNA double-strand breaks. J. Cell Sci. 125, 2127-2133.
Kruhlak, M.J., Celeste, A., Dellaire, G., Fernandez-Capetillo, O., Müller, W.G., McNally, J.G., Bazett-Jones, D.P., and Nussenzweig, A. (2006). Changes in chromatin structure and mobility in living cells at sites of DNA double-strand breaks. J. Cell Biol. 172, 823-834.
Kusch, T., Florens, L., Macdonald, W.H., Swanson, S.K., Glaser, R.L., Yates, J.R., 3rd, Abmayr, S.M., Washburn, M.P., and Workman, J.L. (2004). Acetylation by Tip60 is required for selective histone variant exchange at DNA lesions. Science 306, 2084-2087.
Larsen, D.H., Poinsignon, C., Gudjonsson, T., Dinant, C., Payne, M.R., Hari, F.J., Rendtlew Danielsen, J.M., Menard, P., Sand, J.C., Stucki, M., et al. (2010). The chromatin-remodeling factor CHD4 coordinates signaling and repair after DNA damage. J. Cell Biol. 190, 731-740.
Lavin, M.F. (2008). Ataxia-telangiectasia: from a rare disorder to a paradigm for cell signalling and cancer. Nat. Rev. Mol. Cell Biol. 9, 759-769.
Lou, Z., Minter-Dykhouse, K., Franco, S., Gostissa, M., Rivera, M.A., Celeste, A., Manis, J.P., van Deursen, J., Nussenzweig, A., Paull, T.T., et al. (2006). MDC1 maintains genomic stability by participating in the amplification of ATM-dependent DNA damage signals. Mol. Cell 21, 187-200.
Luger, K., Dechassa, M.L., and Tremethick, D.J. (2012). New insights into nucleosome and chromatin structure: an ordered state or a disordered affair? Nat. Rev. Mol. Cell Biol. 13, 436-447.
Luijsterburg, M.S., Dinant, C., Lans, H., Stap, J., Wiernasz, E., Lagerwerf, S., Warmerdam, D.O., Lindh, M., Brink, M.C., Dobrucki, J.W., et al. (2009). Heterochromatin protein 1 is recruited to various types of DNA damage. J. Cell Biol. 185, 577-586.
Matsuoka, S., Ballif, B.A., Smogorzewska, A., McDonald, E.R., 3rd, Hurov, K.E., Luo, J., Bakalarski, C.E., Zhao, Z., Solimini, N., Lerenthal, Y., et al. (2007). ATM and ATR substrate analysis reveals extensive protein networks responsive to DNA damage. Science 316, 1160-1166.
Meier, A., Fiegler, H., Muñoz, P., Ellis, P., Rigler, D., Langford, C., Blasco, M.A., Carter, N., and Jackson, S.P. (2007). Spreading of mammalian DNA-damage response factors studied by ChIP-chip at damaged telomeres. EMBO J. 26, 2707-2718.
Melander, F., Bekker-Jensen, S., Falck, J., Bartek, J., Mailand, N., and Lukas, J. (2008). Phosphorylation of SDT repeats in the MDC1 N terminus triggers retention of NBS1 at the DNA damage-modified chromatin. J. Cell Biol. 181, 213-226.
Morillo-Huesca, M., Clemente-Ruiz, M., Andújar, E., and Prado, F. (2010). The SWR1 histone replacement complex causes genetic instability and genome-wide transcription misregulation in the absence of H2A.Z. PLoS ONE 5, e12143.
Murga, M., Jaco, I., Fan, Y., Soria, R., Martinez-Pastor, B., Cuadrado, M., Yang, S.M., Blasco, M.A., Skoultchi, A.I., and Fernandez-Capetillo, O. (2007). Global chromatin compaction limits the strength of the DNA damage response. J. Cell Biol. 178, 1101-1108.
Murr, R., Loizou, J.I., Yang, Y.G., Cuenin, C., Li, H., Wang, Z.Q., and Herceg, Z. (2006). Histone acetylation by Trrap-Tip60 modulates loading of repair proteins and repair of DNA double-strand breaks. Nat. Cell Biol. 8, 91-99.

Murr, R., Vaissière, T., Sawan, C., Shukla, V., and Herceg, Z. (2007). Orchestration of chromatin-based processes: mind the TRRAP. Oncogene 26, 5358-5372.
Noon, A.T., Shibata, A., Rief, N., Löbrich, M., Stewart, G.S., Jeggo, P.A., and Goodarzi, A.A. (2010). 53BP1-dependent robust localized KAP-1 phosphorylation is essential for heterochromatic DNA double-strand break repair. Nat. Cell Biol. 12, 177-184.
Papamichos-Chronakis, M., Krebs, J.E., and Peterson, C.L. (2006). Interplay between Ino80 and Swr1 chromatin remodeling enzymes regulates cell cycle checkpoint adaptation in response to DNA damage. Genes Dev. 20, 2437-2449.
Papamichos-Chronakis, M., Watanabe, S., Rando, O.J., and Peterson, C.L. (2011). Global regulation of H2A.Z localization by the INO80 chromatin-remodeling enzyme is essential for genome integrity. Cell 144, 200-213.
Park, Y.J., Dyer, P.N., Tremethick, D.J., and Luger, K. (2004). A new fluorescence resonance energy transfer approach demonstrates that the histone variant H2AZ stabilizes the histone octamer within the nucleosome. J. Biol. Chem. 279, 24274-24282.
Pei, H., Zhang, L., Luo, K., Qin, Y., Chesi, M., Fei, F., Bergsagel, P.L., Wang, L., You, Z., and Lou, Z. (2011). MMSET regulates histone H4K20 methylation and 53BP1 accumulation at DNA damage sites. Nature 470, 124-128.
Peng, J.C., and Karpen, G.H. (2008). Epigenetic regulation of heterochromatic DNA stability. Curr. Opin. Genet. Dev. 18, 204-211.
Polo, S.E., Kaidi, A., Baskcomb, L., Galanty, Y., and Jackson, S.P. (2010). Regulation of DNA-damage responses and cell-cycle progression by the chromatin remodelling factor CHD4. EMBO J. 29, 3130-3139.
Robinson, P.J., An, W., Routh, A., Martino, F., Chapman, L., Roeder, R.G., and Rhodes, D. (2008). 30 nm chromatin fibre decompaction requires both H4-K16 acetylation and linker histone eviction. J. Mol. Biol. 381, 816-825.
Rogakou, E.P., Boon, C., Redon, C., and Bonner, W.M. (1999). Megabase chromatin domains involved in DNA double-strand breaks in vivo. J. Cell Biol. 146, 905-916.
Rogakou, E.P., Pilch, D.R., Orr, A.H., Ivanova, V.S., and Bonner, W.M. (1998). DNA double-stranded breaks induce histone H2AX phosphorylation on serine 139. J. Biol. Chem. 273, 5858-5868.
Sartori, A.A., Lukas, C., Coates, J., Mistrik, M., Fu, S., Bartek, J., Baer, R., Lukas, J., and Jackson, S.P. (2007). Human CtIP promotes DNA end resection. Nature 450, 509-514.
Sasaki, S., Mello, C.C., Shimada, A., Nakatani, Y., Hashimoto, S., Ogawa, M., Matsushima, K., Gu, S.G., Kasahara, M., Ahsan, B., et al. (2009). Chromatin-associated periodicity in genetic variation downstream of transcriptional start sites. Science 323, 401-404.
Savic, V., Yin, B., Maas, N.L., Bredemeyer, A.L., Carpenter, A.C., Helmink, B.A., Yang-Iott, K.S., Sleckman, B.P., and Bassing, C.H. (2009). Formation of dynamic gamma-H2AX domains along broken DNA strands is distinctly regulated by ATM and MDC1 and dependent upon H2AX densities in chromatin. Mol. Cell 34, 298-310.
Schotta, G., Sengupta, R., Kubicek, S., Malin, S., Kauer, M., Callén, E., Celeste, A., Pagani, M., Opravil, S., De La Rosa-Velazquez, I.A., et al. (2008). A chromatin-wide transition to H4K20 monomethylation impairs genome integrity and programmed DNA rearrangements in the mouse. Genes Dev. 22, 2048-2061.
Schuster-Böckler, B., and Lehner, B. (2012). Chromatin organization is a major influence on regional mutation rates in human cancer cells. Nature 488, 504-507.
Shogren-Knaak, M., Ishii, H., Sun, J.M., Pazin, M.J., Davie, J.R., and Peterson, C.L. (2006). Histone H4-K16 acetylation controls chromatin structure and protein interactions. Science 311, 844-847.
Smeenk, G., Wiegant, W.W., Vrolijk, H., Solari, A.P., Pastink, A., and van Attikum, H. (2010). The NuRD chromatin-remodeling complex regulates signaling and repair of DNA damage. J. Cell Biol. 190, 741-749.
Smerdon, M.J. (1991). DNA repair and the role of chromatin structure. Curr. Opin. Cell Biol. 3, 422-428.


Smerdon, M.J., Tlsty, T.D., and Lieberman, M.W. (1978). Distribution of ultraviolet-induced DNA repair synthesis in nuclease sensitive and resistant regions of human chromatin. Biochemistry 17, 2377-2386.
Sobhian, B., Shao, G., Lilli, D.R., Culhane, A.C., Moreau, L.A., Xia, B., Livingston, D.M., and Greenberg, R.A. (2007). RAP80 targets BRCA1 to specific ubiquitin structures at DNA damage sites. Science 316, 1198-1202.
Soria, G., Polo, S.E., and Almouzni, G. (2012). Prime, repair, restore: the active role of chromatin in the DNA damage response. Mol. Cell 46, 722-734.
Soutoglou, E., Dorn, J.F., Sengupta, K., Jasin, M., Nussenzweig, A., Ried, T., Danuser, G., and Misteli, T. (2007). Positional stability of single double-strand breaks in mammalian cells. Nat. Cell Biol. 9, 675-682.
Stucki, M., Clapperton, J.A., Mohammad, D., Yaffe, M.B., Smerdon, S.J., and Jackson, S.P. (2005). MDC1 directly binds phosphorylated histone H2AX to regulate cellular responses to DNA double-strand breaks. Cell 123, 1213-1226.
Sun, Y., Jiang, X., Chen, S., Fernandes, N., and Price, B.D. (2005). A role for the Tip60 histone acetyltransferase in the acetylation and activation of ATM. Proc. Natl. Acad. Sci. USA 102, 13182-13187.
Sun, Y., Jiang, X., Xu, Y., Ayrapetov, M.K., Moreau, L.A., Whetstine, J.R., and Price, B.D. (2009). Histone H3 methylation links DNA damage detection to activation of the tumour suppressor Tip60. Nat. Cell Biol. 11, 1376-1382.
Sun, Y., Jiang, X., and Price, B.D. (2010). Tip60: connecting chromatin to DNA damage signaling. Cell Cycle 9, 930-936.
Suto, R.K., Clarkson, M.J., Tremethick, D.J., and Luger, K. (2000). Crystal structure of a nucleosome core particle containing the variant histone H2A.Z. Nat. Struct. Biol. 7, 1121-1124.
Sykes, S.M., Mellert, H.S., Holbert, M.A., Li, K., Marmorstein, R., Lane, W.S., and McMahon, S.B. (2006). Acetylation of the p53 DNA-binding domain regulates apoptosis induction. Mol. Cell 24, 841-851.

Symington, L.S., and Gautier, J. (2011). Double-strand break end resection and repair pathway choice. Annu. Rev. Genet. 45, 247–271.
Tolstorukov, M.Y., Volfovsky, N., Stephens, R.M., and Park, P.J. (2011). Impact of chromatin structure on sequence variability in the human genome. Nat. Struct. Mol. Biol. 18, 510–515.
Tsukuda, T., Fleming, A.B., Nickoloff, J.A., and Osley, M.A. (2005). Chromatin remodelling at a DNA double-strand break site in Saccharomyces cerevisiae. Nature 438, 379–383.
van Attikum, H., Fritsch, O., and Gasser, S.M. (2007). Distinct roles for SWR1 and INO80 chromatin remodeling complexes at chromosomal double-strand breaks. EMBO J. 26, 4113–4125.
Weber, C.M., Henikoff, J.G., and Henikoff, S. (2010). H2A.Z nucleosomes enriched over active genes are homotypic. Nat. Struct. Mol. Biol. 17, 1500–1507.
Xu, Y., and Price, B.D. (2011). Chromatin dynamics and the repair of DNA double strand breaks. Cell Cycle 10, 261–267.
Xu, Y., Sun, Y., Jiang, X., Ayrapetov, M.K., Moskwa, P., Yang, S., Weinstock, D.M., and Price, B.D. (2010). The p400 ATPase regulates nucleosome stability and chromatin ubiquitination during DNA repair. J. Cell Biol. 191, 31–43.
Xu, Y., Ayrapetov, M.K., Xu, C., Gursoy-Yuzugullu, O., Hu, Y., and Price, B.D. (2012). Histone H2A.Z controls a critical chromatin remodeling step required for DNA double-strand break repair. Mol. Cell 48, 723–733.
Zhang, H., Roberts, D.N., and Cairns, B.R. (2005). Genome-wide dynamics of Htz1, a histone H2A variant that poises repressed/basal promoters for activation through histone loss. Cell 123, 219–231.
Ziv, Y., Bielopolski, D., Galanty, Y., Lukas, C., Taya, Y., Schultz, D.C., Lukas, J., Bekker-Jensen, S., Bartek, J., and Shiloh, Y. (2006). Chromatin relaxation in response to DNA double-strand breaks is modulated by a novel ATM- and KAP-1 dependent pathway. Nat. Cell Biol. 8, 870–876.
Zlatanova, J., and Thakar, A. (2008). H2A.Z: view from the top. Structure 16, 166–179.


Leading Edge

Review
Chromatin Movement in the Maintenance of Genome Stability
Vincent Dion1 and Susan M. Gasser1,2,*
1Friedrich Miescher Institute for Biomedical Research, Maulbeerstrasse 66, CH-4058 Basel, Switzerland
2Faculty of Natural Sciences, University of Basel, CH-4056 Basel, Switzerland
*Correspondence: susan.gasser@fmi.ch
http://dx.doi.org/10.1016/j.cell.2013.02.010

Mechanistic analyses based on improved imaging techniques have begun to explore the biological implications of chromatin movement within the nucleus. Studies in both prokaryotes and eukaryotes have shed light on what regulates the mobility of DNA over long distances. Interestingly, in eukaryotes, genomic loci increase their movement in response to double-strand break induction. Break mobility, in turn, correlates with the efficiency of repair by homologous recombination. We review here the source and regulation of DNA mobility and discuss how it can both contribute to and jeopardize genome stability.
The Not-Quite-Random Walk of Chromatin

Chromatin is often depicted as a static entity comprising DNA wrapped around histone octamers organized in the form of arrays. Yet constant changes in the composition of nucleosomes, posttranslational modifications of histones, and shifts in nucleosome positioning (Campos and Reinberg, 2009; Segal and Widom, 2009) ensure that chromatin is dynamic. In addition, recent studies argue that the physical movement of the chromatin fiber itself is an important element of chromatin dynamics. Indeed, chromatin in the interphase nucleus moves constantly, and not only because of temperature-dependent Brownian motion. Here, we review new findings that shed light on the mechanisms that promote DNA movement as well as its biological implications.

In the 1990s, the development of a nonmultimerizing green fluorescent protein (GFP)-Lac repressor (LacI) fusion that could bind lacO arrays integrated in the yeast genome opened the door to microscopic analysis of the position of chromosomal loci in living cells (Robinett et al., 1996). The LacI-lacO system was followed by development of a TetR-TetO tagging pair (Michaelis et al., 1997) and the coupling of these to a GFP-pore protein fusion (Heun et al., 2001a, 2001b). This made it possible to track the movement of tagged chromosomal loci accurately, independent of nuclear movement. In these systems, the GFP-fused repressors concentrate at their cognate operator sites, generating a visible fluorescent spot. Other methods that track chromatin in living cells rely on the incorporation of fluorescently labeled deoxy- or ribo-NTP analogs (for example, Zink et al., 1998) or on the expression of photoactivatable fluorescent proteins linked to histones (Kruhlak et al., 2006; Wiesmeijer et al., 2008). Although these avoid the use of bacterial operator arrays, they do not allow one to score the dynamics of specific chromosomal loci.
Once the movement of a lacO-tagged locus is captured by time-lapse microscopy, the character of the movement can be quantified using a mean-squared displacement (MSD) analysis (Berg, 1993). Multiple time-lapse series of a given locus are acquired and are used to calculate the average of the squared distance covered by that locus, which is, in turn, plotted against increasing time intervals (Figure 1). In brief, $\mathrm{MSD}(\Delta t) = \langle (x_t - x_{t+\Delta t})^2 \rangle$, where $t$ is time, $\Delta t$ is the time interval, and $x$ is the position of a moving fluorescent spot. This method of analysis is highly robust as it averages a large number of data points to generate quantitative movement parameters such as the diffusion coefficient and radius of constraint (Rc). The diffusion coefficient of a particle moving in a random Brownian walk is directly proportional to the initial slope on an MSD graph, and the MSD of such a particle increases linearly with time (Berg, 1993). However, as time intervals increase, the MSD curve will plateau because of the constraint or confinement imposed by the nuclear sphere (that is, a moving chromosomal locus will not move beyond the confinement of the nuclear envelope, regardless of the time interval queried) (Figure 1). From the plateau reached by the MSD curve over time, one can calculate the radius of the constrained volume within which the particle moves. Using this model for single genomic locus movement, early experiments suggested that the diffusion coefficient of chromatin movement ranges from $10^{-4}$ to $10^{-3}$ µm²/s, which, remarkably, seemed to hold true for bacteria and yeast as well as Drosophila and mammalian cells, regardless of the precise tracking method used or the range of nuclear sizes (Bornfleth et al., 1999; Chubb et al., 2002; Heun et al., 2001b; Marshall et al., 1997; Neumann et al., 2012; Vazquez et al., 2001; Weber et al., 2012). In a pioneering work, Marshall et al. (1997) showed that chromatin movement appears to be a constrained random walk.
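The MSD calculation described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the analysis pipeline used in the cited studies: the function name `mean_squared_displacement`, the simulated 2D random walk, and its step size are all hypothetical, and real analyses typically pool many time-lapse series per locus.

```python
import random

def mean_squared_displacement(positions, max_lag):
    """Time-averaged MSD of one trajectory.

    positions: list of coordinate tuples, one per frame (any number of dimensions).
    Returns a list whose entry k-1 is the MSD at a lag of k frames.
    """
    msd = []
    for lag in range(1, max_lag + 1):
        total = 0.0
        count = 0
        # Compare every pair of frames separated by `lag` frames.
        for start, end in zip(positions, positions[lag:]):
            total += sum((b - a) ** 2 for a, b in zip(start, end))
            count += 1
        msd.append(total / count)
    return msd

# Hypothetical example: a simulated 2D random walk, not real tracking data.
random.seed(0)
x = y = 0.0
track = []
for _ in range(2000):
    x += random.gauss(0.0, 0.05)  # assumed step size in micrometers per frame
    y += random.gauss(0.0, 0.05)
    track.append((x, y))

msd = mean_squared_displacement(track, max_lag=20)
```

For an unconstrained random walk such as this simulated track, the values in `msd` grow roughly linearly with lag; a confined locus would instead level off at long lags.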
More recent studies indicate that the movement of chromosomal loci in bacteria, yeast, and mammalian cells does not fully recapitulate a Brownian random walk (Bornfleth et al., 1999; Neumann et al., 2012; Weber et al., 2010, 2012). Both intrinsic and external constraints appear to restrict movement, causing it to appear nonrandom. On the other hand, the movement of an excised, extrachromosomal ring of yeast chromatin is indistinguishable from a random walk trajectory (Neumann et al., 2012).

Figure 1. MSD Analysis


MSD values are derived from determining the distance moved by a particle over increasing time intervals, $\Delta t$. In other words, $(X_t - X_{t+\Delta t})$, where $X_t$ is the position at time $t$. The top panel depicts a characteristic MSD plot for a random walk, where the slope (m) equals the diffusion coefficient (D) times twice the number of dimensions in which movement is measured (d). The middle panel shows the shape of an MSD graph in cases where the motion is directional. The mobility of a particle moving according to Brownian motion within a confined space will generate a curve that levels off at larger time intervals (bottom panel). In this case, the plateau (p) that the curve reaches is equal to 2/5 times the number of dimensions (d) times the square of the radius of constraint (Rc) (Neumann et al., 2012).
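As a rough sketch of how the two movement parameters might be read off an MSD curve, the snippet below inverts the relationships for the slope and the plateau. It assumes the standard result for a particle uniformly exploring a sphere, plateau = (2/5) × d × Rc²; the function names and the input numbers are hypothetical and are not taken from the cited studies.

```python
import math

def diffusion_coefficient(initial_slope, n_dims):
    """D from the initial MSD slope, using slope = 2 * n_dims * D."""
    return initial_slope / (2 * n_dims)

def radius_of_constraint(plateau, n_dims):
    """Rc from the MSD plateau, assuming plateau = (2/5) * n_dims * Rc**2."""
    return math.sqrt(5 * plateau / (2 * n_dims))

# Hypothetical values for a track measured in a 2D projection.
D = diffusion_coefficient(initial_slope=0.004, n_dims=2)  # slope in um^2/s
Rc = radius_of_constraint(plateau=0.2, n_dims=2)          # plateau in um^2
```

With these made-up inputs, D comes out at about 0.001 µm²/s and Rc at about 0.5 µm, which is simply a consistency check on the two formulas rather than a measured result.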

The Stokes-Einstein equation, $D = k_B T / (6 \pi \eta a)$, in which $k_B$ is the Boltzmann constant, $\eta$ is the viscosity of the liquid, and $a$ is the size of the moving particle, dictates that, if chromatin movement were Brownian, the diffusion coefficient (D) would be directly proportional to the temperature (T) in kelvin (Weber et al., 2012). This has been tested in both yeast and bacteria by determining the diffusion coefficient at different temperatures and checking for a linear relationship. Movement of a bacterial locus has been determined at temperatures ranging from 10°C (283 K) to 30°C (303 K). The expected change in diffusion coefficient should be around 7% for Brownian motion within this temperature range, yet a 2-fold increase is calculated (Weber et al., 2012). This argues that DNA motion is superthermal. Similarly, unexpectedly large changes in mobility are scored for tagged loci in yeast for temperature shifts from 25°C to 37°C (Neumann et al., 2012). This, coupled with the fact that locus movement in yeast is significantly affected by the level of glucose in the media and by the presence of protonophores that collapse mitochondrial and plasma membrane potentials (Gartenberg et al., 2004; Heun et al., 2001b; Marshall et al., 1997), argues that ATP is likely to be involved in chromatin movement. This effect of ATP depletion is also observed in mammalian cells (Chubb et al., 2002).

In addition to the nonlinear effects of temperature, which argue against pure Brownian motion, studies in bacteria identify a drag on moving genomic loci that cannot be explained by the principles of Brownian motion (Weber et al., 2010). Earlier, eukaryotic loci had been observed to undergo spring-like (and thus nonrandom) movements, visualized as large unidirectional steps (>0.5 µm in <10 s) in yeast, that are often followed by similar movement in the opposite direction (Heun et al., 2001b).
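The expected change of around 7% follows directly from the proportionality between D and T. The short calculation below makes the arithmetic explicit; it assumes, as a simplification, that viscosity and particle size do not change with temperature, and the function name is hypothetical.

```python
def brownian_fold_change(temp_low_k, temp_high_k):
    """Fold change in D predicted by Stokes-Einstein when only T changes.

    Assumes viscosity and particle size stay constant, so D is proportional to T.
    """
    return temp_high_k / temp_low_k

fold = brownian_fold_change(283.0, 303.0)
percent_increase = (fold - 1.0) * 100.0
print(f"Predicted increase: {percent_increase:.1f}%")  # Predicted increase: 7.1%
```

A measured 2-fold increase over the same range is therefore far larger than this simple thermal prediction.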
The analysis of chromatin dynamics in Drosophila spermatocytes revealed that tagged loci have a tendency to move in one direction and then return to their previous location (Vazquez et al., 2001). Based on such observations and on computer simulations, it has been proposed that chromosomal movement is best explained by fractional Langevin motion, in which an elastic, semiviscous milieu (i.e., the nucleoplasm) pushes back on the moving particle, possibly accounting for this irregular, spring-like movement (Weber et al., 2010). A further source of drag on chromatin diffusion comes from the contiguity of the chromatin fiber itself. As mentioned above, when a chromosomal domain is excised from a chromosome to form an extrachromosomal ring of chromatin, the diffusion coefficient doubles and the Rc becomes identical to the radius of the nucleus (Gartenberg et al., 2004). It was concluded that the flanking chromosomal DNA and the context of a tagged locus within the linear molecule of chromosomal DNA restrict chromatin movement. A comparison of actual movement with computer simulations shows that the MSD curve of the excised particle fits exactly that of a simulated random walk, with a radius the same as that of the nucleus (0.9 µm), whereas the integrated locus exhibits additional constraint (Neumann et al., 2012). Some constraint likely arises from natural chromosomal anchorage sites, such as centromere tethering to the membrane-associated spindle pole body, the interaction of telomeres with structural proteins of the nuclear envelope, or the association of stress-induced genes with pores (Cabal et al., 2006;


Gartenberg et al., 2004; Hediger et al., 2002; Taddei and Gasser, 2012; Taddei et al., 2006, 2010; Zimmer and Fabre, 2011). The association of mammalian loci with the nucleolus can also constrain locus movement (Chubb et al., 2002; Wiesmeijer et al., 2008), and, in telomerase-deficient ALT (alternative lengthening of telomeres) cells, telomeres appeared to be tethered to promyelocytic leukemia (PML) bodies (Molenaar et al., 2003).

In conclusion, the mobility of a DNA locus in the interphase nucleus can be considered as nondirected motion that fluctuates with ATP levels and depends disproportionately on temperature. The constraint on DNA stems from the chromatin fiber itself, the nature of the nucleoplasm, and protein-protein interactions that anchor loci to nuclear structures.

Cellular Mechanisms that Regulate Chromatin Movement

DNA mobility changes during the cell cycle and during development, which raises the possibility that it may be regulated. For instance, cultured Drosophila spermatocytes display two modes of movement during premeiotic development. Whereas larger changes in position are observed early in differentiation, more constrained motion is detected in mature spermatocytes (Vazquez et al., 2001). For tagged loci in yeast, less movement is observed in S phase than in G1 phase nuclei, a drop that correlates inversely with the number of active replication forks and possibly also with dNTP levels (Heun et al., 2001b). In mammals, results obtained by visualizing chromosomal regions using a photoactivatable histone fusion suggest that no change in mobility occurs between cells in mid- and late G1, S, and G2 (Walter et al., 2003; Wiesmeijer et al., 2008). It appeared, however, that there is significantly more mobility early in G1 than at later stages of the cell cycle.
Indeed, measuring the distance between chromosome territories labeled with dNTP analogs shows that chromosome territories can move over distances ranging between 0.47 and 4.44 µm in early G1, whereas at later cell-cycle stages, the distances observed are only within 0.25 to 2.11 µm (Walter et al., 2003). Taken together, these results argue that the mobility of a chromosomal locus is under the control of biological, as well as physical, parameters.

In some instances, changes in transcriptional activity are correlated with the nuclear position of a locus. For example, yeast telomeres, which silence nearby genes, are found at the nuclear periphery, where they are anchored through an interaction of the silencing machinery with the nuclear envelope (Gartenberg et al., 2004; Gotta et al., 1996; Taddei and Gasser, 2012; Taddei et al., 2004; Zimmer and Fabre, 2011), while active genes can be tethered to nuclear pores (Cabal et al., 2006; Casolari et al., 2004; Egecioglu and Brickner, 2011; Taddei et al., 2006). It was hypothesized that an increase in transcriptional output might enhance the mobility of a locus to facilitate its relocalization to the appropriate nuclear compartment. This agrees with experiments by Chuang et al. (2006), who have shown that the activation of transcription by targeting the viral transactivator VP16 to a heterochromatic transgene array in mammalian cells leads to long-range directional movement perpendicular to the nuclear membrane. This experiment establishes a link between transcriptional activation, decompaction, and the mobility of chromatin and provides a striking

example of non-Brownian motion. Similarly, the targeting of a LexA-VP16 fusion to a nontelomeric locus in yeast increases both transcriptional activity and movement, scored as the radius of constraint and number of large steps (Neumann et al., 2012). Moreover, the targeting of this same transcriptional activator to an otherwise silent telomeric locus shifts it away from the nuclear envelope (Hediger et al., 2006; Taddei et al., 2006).

Although these examples link transcriptional control with movement, there are many examples in which transcription and mobility can be uncoupled. For example, the highly transcribed genes that associate with nuclear pores become anchored and are therefore highly constrained (Cabal et al., 2006; Taddei et al., 2006). In contrast, a transcriptionally silent chromatin ring can diffuse freely throughout the nucleus if the proteins necessary for its anchoring to the nuclear envelope are missing (Gartenberg et al., 2004). Most significantly, the directed binding of a LexA-Gal4 fusion protein to a promoter can increase its transcriptional output without altering chromatin movement (Neumann et al., 2012), and both genetic and chemical inhibitors of transcriptional elongation failed to alter chromatin mobility in yeast (Neumann et al., 2012; A. Taddei, F.R. Neumann, and S.M.G., unpublished data). Pliss et al. (2009) demonstrated that transcription does not correlate with chromatin movement in cultured mammalian cells, and others have shown that chromatin moves similarly whether or not it binds CFP-SUV39H1, an enzyme that methylates histone H3 lysine 9 (H3K9) in heterochromatin (Wiesmeijer et al., 2008). In brief, transcriptional activation and repression are not obligatorily linked to either movement or tethering, even though transcription can correlate with enhanced movement in specific cases. If transcription can be uncoupled from locus mobility, then what drives chromatin movement and why is it sensitive to ATP levels?
One ATP-dependent activity that correlates with transcription in a context-dependent manner is nucleosome remodeling. For example, the activation of the yeast PHO5 promoter coincides with the removal of nucleosomes from the promoter region by two nucleosome remodeling complexes, Swi2/Snf2 and INO80 (Barbaric et al., 2007; Steger et al., 2003). When PHO5 is tracked by the LacI-lacO system during its activation, the open chromatin structure shows an increased diffusion coefficient and a larger Rc (Neumann et al., 2012). In the presence of phosphate, on the other hand, which represses PHO5 transcription by preventing the removal of nucleosomes in the promoter, the diffusion coefficient and Rc were smaller. Furthermore, deleting ARP8, which encodes a subunit of the INO80 nucleosome remodeler, severely reduces transcriptional output, provokes a failure to respond to phosphate levels, and leads to a nucleosomal structure in the promoter that is only partially accessible (Barbaric et al., 2007; Steger et al., 2003). If chromatin structure were responsible for the mobility of this locus, one would predict that the locus would have an intermediate level of motion and would not respond to phosphate levels in an arp8 mutant. This, indeed, was the case (Neumann et al., 2012). These findings argue that nucleosome remodeling at the endogenous PHO5 locus correlates tightly with induced locus mobility. Nucleosome remodelers are characterized by the presence of a large ATPase subunit of the Snf2 family, which typically

associates with numerous accessory subunits and influences virtually all DNA-based transactions. Not surprisingly, recent work has begun to examine the impact of remodelers on DNA mobility in contexts other than transcription (Clapier and Cairns, 2009; Dion et al., 2010; Lans et al., 2012). Specifically, the recruitment of the INO80 remodeler, which helps remodel nucleosomes at double-strand breaks (Morrison et al., 2004; Tsukuda et al., 2005; van Attikum et al., 2004), increases the Rc of an undamaged locus to which it is targeted without increasing transcription (Neumann et al., 2012). The effect is entirely dependent on the Ino80 catalytic subunit, as the targeting of a mutant that cannot bind ATP fails to increase chromatin mobility (Neumann et al., 2012). Moreover, the targeting of another remodeler, the Swi2/Snf2 ATPase complex, did not promote movement in a similar manner. It is unclear why this is the case, but it may reflect differences in biochemical activities of the two enzymes or an absence of cofactors or histone modifications at the loci tested. Given that there are 17 Snf2-type ATPases in yeast and 53 in human (Flaus et al., 2006), it is attractive to imagine that different chromatin remodelers alter chromatin mobility in different ways, regulating long-range chromatin movement while they alter local nucleosomal organization.

Mobility of Damaged DNA

There are several ways of probing for the mobility of damaged DNA. One is to introduce specific patterns of DNA damage, for example, with linear tracks of UV light or ionizing radiation (IR), and to fix the cells at several time points after damage induction (e.g., Aten et al., 2004). Immunofluorescence against specific DNA repair markers can then be applied to see whether the linear track has changed its shape (e.g., Jakob et al., 2011). Alternatively, live-cell imaging can be used after damage induction to watch the diffusion of a repair factor of choice fused to a fluorescent protein.
This assay, when coupled to discrete patterns of DNA damage tracks, tends to be qualitative because discrete particles to track are not present. However, randomly induced damage by IR or DNA-damaging drugs leads to discernible repair foci (Haaf et al., 1995), which can be followed using the same single-particle tracking described above for lacO-tagged chromosomal loci. Finally, for site-specific damage, one can label the genomic site to be damaged with bacterial operators (e.g., Nagai et al., 2008; Soutoglou et al., 2007). This is particularly useful, as the difference in mobility between the same locus when damaged and when undamaged can be addressed. It is, however, limited to double-strand breaks and to a specific protein-DNA adduct in yeast (Nielsen et al., 2009).

Several recent studies have investigated whether repair foci (and, by extension, DNA lesions) show long-range mobility in mammalian cells. These studies have yielded mixed results. For instance, Nelms et al. (1998) showed in human cells that irradiation-induced damage imaged by an incorporation of the thymidine analog bromodeoxyuridine (BrdU) moves very little. Similar results were obtained using live-cell imaging of a single double-strand break induced by the endonuclease I-SceI (Soutoglou et al., 2007) and by tracking laser-damaged regions in photosensitized cells (Kruhlak et al., 2006). Meanwhile, Jakob et al. (2009a, 2009b) have used IR induced by heavy ion sources to show that repair foci have kinetics similar to those of undamaged loci but do not appear to be constrained over several hours, suggesting that damaged DNA could travel large distances given enough time. Finally, damage induced by α-particle irradiation is highly mobile and moves over large distances within minutes (Agarwal et al., 2011; Aten et al., 2004; Krawczyk et al., 2012). This latter situation is reminiscent of results obtained with uncapped telomeres in mouse embryonic cells (Dimitrova et al., 2008). To reconcile this wide range of results, we propose that different types of damage, different cell lines, variable growth conditions, the specific marker protein tracked, and/or the method of visualizing movement all contribute to different results.

Two recent studies in budding yeast, in which various parameters of damage and imaging could be better controlled, showed that a single double-strand break is more mobile than the same undamaged locus (Dion et al., 2012; Miné-Hattab and Rothstein, 2012). In these studies, MSD analyses show that the genomic locus monitored is constrained to an Rc of about 0.4 µm, whereas after DSB induction, the Rc increases to about 0.7 µm in haploid cells and 0.9 µm in diploids. These values are similar even though one group used a haploid strain and the other used a diploid strain. The change in mobility corresponds to an increase from 13% to 47% of the nuclear volume in haploid cells or from 3% to 30% in diploid cells (Dion et al., 2012; Miné-Hattab and Rothstein, 2012).

In yeast, specific genetic factors that affect the movement of broken DNA have been identified. Miné-Hattab and Rothstein (2012) found that deletion of SAE2, which codes for an enzyme important in DSB end resection, has no effect on mobility apart from a longer delay between DSB induction and the increase in movement. These data suggest that resection, which is delayed, but not abolished, in a sae2 mutant, is required for the enhanced mobility of DSBs.
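One way to relate a radius of constraint to a share of nuclear volume is to treat both the explored region and the nucleus as spheres, so that the fraction is the cube of the ratio of the two radii. The sketch below is an illustration under that assumption only; the function name is hypothetical, and the 0.9 µm nuclear radius is the haploid value quoted earlier in this review rather than a number given for this calculation.

```python
def explored_volume_fraction(rc_um, nuclear_radius_um):
    """Fraction of a spherical nucleus covered by a sphere of radius Rc."""
    return (rc_um / nuclear_radius_um) ** 3

# Rc of a DSB in haploid cells (about 0.7 um) and an assumed
# haploid nuclear radius of 0.9 um, both taken from the text above.
fraction = explored_volume_fraction(0.7, 0.9)
print(f"{fraction:.0%}")  # 47%
```

Under these assumptions the result matches the roughly 47% of nuclear volume reported for a DSB in haploid cells.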
Moreover, Rad51 and Rad54, two proteins that work downstream of the resection step, are required for full induction of DSB mobility but have no effect on the mobility of an undamaged locus (Dion et al., 2012; Miné-Hattab and Rothstein, 2012). Rad54 is a Snf2-type ATPase, like Ino80, that functions in assisting strand invasion during homologous recombination (Ceballos and Heyer, 2011). As is the case for INO80 targeting, the role of Rad54 requires its remodeling activity; a point mutant that abolishes the Rad54 ATPase activity has the same effects as a full deletion (Dion et al., 2012).

Impairing DSB repair is not the only way to decrease the movement of damaged chromatin. Indeed, mutating upstream components of the DNA damage response (DDR), Mec1 and Rad9 (ATR and 53BP1 in mammals), abolishes the enhanced movement of DSBs (Dion et al., 2012). In contrast, deletion of the downstream kinase, Rad53 (CHK2 in mammals), does not, suggesting that downstream checkpoint functions do not regulate DSB mobility (Dion et al., 2012). It is possible that Mec1/ATR activation is required to modify another protein that acts directly on chromatin to enhance movement. We note that INO80 components are direct targets of Mec1 (Morrison et al., 2007). Another possibility is that the DDR modifies chromatin itself, for example, by phosphorylating H2A (H2AX in mammals). This would in turn recruit remodelers, such as INO80, and scaffold proteins, including Rad9. In this way, the checkpoint kinase could change the properties of chromatin and enhance its movement by triggering a cascade of events.

The DDR also seems to affect DSB mobility in mammalian cells. Indeed, Dimitrova et al. (2008) have shown that, when telomeres are uncapped and therefore readily confused with DSBs, their movement increases in a 53BP1-dependent manner. ATM, a key DDR regulatory kinase, is also involved in this, as ATM null cells have uncapped telomeres that show lower mobility (Dimitrova et al., 2008), and chemical inhibition of ATM results in a similarly reduced Rc in human cells (Krawczyk et al., 2012). Taken together, these results provide evidence that DSB movement requires DNA damage checkpoint kinases in yeast, mouse, and human.

Consistent with the idea that chromatin remodeling contributes to DNA mobility, the deletion of arp8, which impairs INO80-dependent remodeling, also leads to decreased mobility of a DSB (Neumann et al., 2012). Although the effect was partial, other studies in cultured human cells find that inhibition of either histone deacetylases (HDACs) or histone acetyltransferases (HATs) also reduces Rc values for damaged DNA (Krawczyk et al., 2012). Although the exact enzymes responsible for these effects are not known, the results suggest that chromatin structure may be an important regulator of the mobility of a damaged chromosomal locus, both in mammals and in yeast.

Chromatin Mobility, Homology Search, and DSB Repair

DSBs can be repaired by homologous recombination (HR) or nonhomologous end joining (NHEJ). In yeast, the primary repair pathway is HR, whereas, in mammals, NHEJ predominates. During HR, a DSB needs to search for an identical template for repair (Barzel and Kupiec, 2008; Gehlen et al., 2011). Often, a DSB is repaired by exchange with its identical sister chromatid, which is synthesized during S phase. This search is short because the damaged site and undamaged template are held together by cohesin, and it leads to largely error-free repair.
In diploid cells, the homologous chromosome can also be used as a template, although this is riskier, as the cell can lose heterozygosity upon repair. In the rare cases in which the sister chromatid is not available as a template, a long-range search for a homologous sequence may be needed, for example, when a DSB occurs before replication or if the sister is also broken. A well-studied example of this is the repair of a regulated DSB at the budding yeast MAT locus, which encodes mating type information. Gene conversion of the cleaved MAT locus by one of two templates, HML or HMR, found at the ends of the same chromosome, allows yeast to switch its mating type as often as once per cell cycle (Haber, 2012). Although the HM loci are preferentially used as donors, HR can also occur with a template on another chromosome (Agmon et al., 2009; Ira et al., 2003; Keogh et al., 2006). The search for templates on other chromosomes occurs slowly, but approaches 100% efficiency over extended periods of time (Aylon and Kupiec, 2003; Dion et al., 2012). The question in all cases, however, is how the DSB finds its homologous partner in a vast excess of nonhomologous sequence. Although the homology search has been established as a major rate-limiting step in HR in yeast (Wilson et al., 1994), the process itself remains poorly understood. It seems likely that chromatin movement is involved, given the requirement for cut site and template to meet (Gehlen et al., 2011).

Recent studies show that the kinetics and efficiency of repair by recombination correlate positively with DNA mobility. For instance, targeting INO80 subunits to an ectopic recombination substrate in yeast increases the rates of homologous recombination (Neumann et al., 2012). Conversely, in rad9 mutants, which have more restricted DSB mobility, the appearance of recombination intermediates is delayed (Dion et al., 2012). This effect is not due to the role of Rad9 in arresting the cell cycle (Weinert and Hartwell, 1988) or in repressing resection (Chen et al., 2012; Lazzaro et al., 2008; Ngo and Lydall, 2010). Moreover, the delayed kinetics of MAT recombination in rad9 mutants was seen only when recombination templates were found on an unlinked chromosome (that is, not when repair was effected by recombination with templates in cis), arguing that the long-range search is specifically limited by DNA mobility (Dion et al., 2012). In mouse embryonic stem cells, one of Rad9's orthologs, 53BP1, is required for both telomere fusion (i.e., repair by NHEJ) and the mobility of uncapped telomeres (Dimitrova et al., 2008). Even though these data are largely correlative, we speculate that enhanced mobility facilitates DNA repair in both yeast and higher eukaryotes.

The movement of DNA damage could also be harnessed for other purposes. For example, in yeast cells, persistent DSBs are recruited to the nuclear periphery for processing, whereas DSBs that can be repaired by HR are predominantly found in the center of the nucleus (Bystricky et al., 2009; Nagai et al., 2008). The relocalization of DSBs to different compartments of the nucleus requires that chromatin is mobile, although it is unclear whether mobility is rate limiting for the accumulation of DSBs at the nuclear periphery. Indeed, rad9-deficient cells have little difficulty shifting DSBs to the nuclear periphery (Nagai et al., 2008), even though their mobility is low (Dion et al., 2012).
It remains to be seen whether other factors involved in the peripheral recruitment of DSBs impact their mobility, for instance, the histone variant Htz1 (H2A.Z) (Kalocsay et al., 2009) or the conserved SUN domain protein Mps3 (Oza et al., 2009).

In 2007, a yeast study showed that DSBs that occur within the ribosomal DNA (rDNA) accumulate outside of the nucleolus (Torres-Rosell et al., 2007). The exclusion of DSBs from the nucleolus depends on two cohesin-like factors, Smc5 and Smc6. DNA mobility in this case could facilitate the change in nuclear location. Importantly, a similar study using live-cell imaging in Drosophila cells showed that DSBs induced by ionizing radiation are eventually excluded from large heterochromatic domains (Chiolo et al., 2011). Here again, there is a requirement for Smc5 and Smc6, suggesting that a similar mechanism functions in yeast and flies. Strikingly, many of the factors involved in DSB mobility in yeast are also implicated in the movement of damage away from a heterochromatic domain, including the DDR and the HR machinery (Chiolo et al., 2011; Dion et al., 2012). In cultured human cells, ionizing radiation can be delivered in linear tracks, and mobility can be inferred by fixing the cells at different time points after damage induction and scoring deformities in the track path. By marking the damage path with an antibody against the phosphorylated form of H2AX, it was shown that the track curves around heterochromatin domains, suggesting that the DNA damage occurs within the domain but is then

Figure 2. Chromatin Movement Driven by Nucleosome Remodeling


A model for how remodeling-based nucleosome eviction might drive chromatin movement to impact both transcription and DSB repair, adapted from Neumann et al. (2012). Chromatin can be thought of as a polymer chain of stiff segments interspersed with flexible linkers. The stiffness of the overall fiber is determined by its persistence length, which is defined as the length of the polymer over which there is no apparent change in direction (i.e., no bending). Thus, the larger the persistence length, the stiffer the fiber. We propose that the remodeling that occurs during transcriptional activation or during the processing of DSBs can enhance movement by inserting a flexible linker into a stiff chromatin domain. In other words, the persistence length of the chromatin domain will be smaller due to nucleosome removal in the middle of the domain. The extra flexibility will, in turn, increase the volume in which a locus can move. This can be harnessed either to enhance HR with an ectopic donor sequence or to reach a nuclear compartment conducive to transcription, repair, splicing, or export.
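The definition of persistence length used in this legend can be made concrete with a short numerical sketch. It assumes an idealized worm-like chain in three dimensions, for which the expected cosine of the angle between the fiber direction at two points decays exponentially with their separation along the contour. This is a textbook polymer model, not the authors' analysis, and the persistence lengths and separation below are hypothetical values chosen only to contrast a stiff and a flexible fiber.

```python
import math


def tangent_correlation(contour_distance_nm: float, persistence_length_nm: float) -> float:
    """Expected cosine of the angle between fiber directions at two points
    separated by contour_distance_nm along an idealized worm-like chain."""
    if persistence_length_nm <= 0:
        raise ValueError("persistence length must be positive")
    return math.exp(-contour_distance_nm / persistence_length_nm)


# Hypothetical values for illustration only; they are not measurements.
STIFF_LP_NM = 200.0     # assumed persistence length of a compact fiber
FLEXIBLE_LP_NM = 50.0   # assumed persistence length after nucleosome eviction
SEPARATION_NM = 100.0   # contour distance between the two points compared

stiff = tangent_correlation(SEPARATION_NM, STIFF_LP_NM)
flexible = tangent_correlation(SEPARATION_NM, FLEXIBLE_LP_NM)
print(f"stiff fiber:    <cos theta> = {stiff:.3f}")
print(f"flexible fiber: <cos theta> = {flexible:.3f}")
```

Under these assumptions, the fiber with the shorter persistence length loses its directional memory over a shorter distance, which is the sense in which nucleosome removal would make a domain more flexible.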

excluded from heterochromatin during its repair (Jakob et al., 2011). It should be noted that a haploid yeast nucleus has an average diameter of 1.8 μm (Heun et al., 2001b), which is similar to the size of a single heterochromatic domain in human cells. Thus, chromatin movement on the scale seen in budding yeast may be relevant to the exclusion of damaged DNA from densely packaged heterochromatin, as described in higher eukaryotes.

Chromatin Mobility: A Double-Edged Sword for Genome Stability
Based on the studies summarized here, we propose a model in which chromatin remodeling activities that accompany DSB repair can be harnessed to promote recombination with ectopic sequences and/or to move away from nuclear compartments that are refractory to repair (Figure 2). We propose that this movement derives from chromatin-remodeling enzymes and is regulated by the DNA repair machinery and the DNA damage response. Long-range movement, on the order of 1 μm in yeast, of a troublesome DNA lesion would thus promote its repair by HR and suppress the lethality provoked by an irreparable DSB (Bennett et al., 1993). On the other hand, it could lead to loss of heterozygosity and translocations if not properly regulated. Damage movement is thus a double-edged sword that needs careful regulation to avoid genomic rearrangements. We imagine that this balance is kept by the DDR, which will only modify specific downstream targets when the damage is severe enough to require a genome-wide search for a template. This model predicts that different types of DNA damage could lead to different modes of movement, depending on how the lesion is sensed and repaired. Indeed, in haploid yeast, spontaneous repair centers marked by Rad52 are confined to 6% of the nuclear volume as compared to 15% for a single protein-DNA adduct and nearly 50% for a DSB (Dion et al., 2012).

This model conforms well to the data obtained in yeast, but obvious problems exist in the case of mammalian cells. Clearly, given that an average mammalian nucleus is much larger than a yeast nucleus (on average, 200- to 400-fold in volume), much more movement would be required to explore nuclear space. Nonetheless, some aspects of the model may hold true. Specifically, it was shown in mammalian cells that different damaging agents lead to different degrees of repair center mobility; that is, topoisomerase-II-dependent DNA breaks move within a larger radius of constraint than IR-induced damage (Krawczyk et al., 2012). This lesion-specific character of repair focus mobility may account for the differences seen in studies of mammalian chromatin movement; each study analyzed a different type of DNA damage. Given that many secondary tumors arise from chromosomal translocations in cancers treated with DNA-damaging drugs, it may be valuable to consider, in the design of therapeutic protocols, the diffusion properties of the different types of DNA lesions induced.

Open Questions
As the mechanisms behind chromatin mobility start to be untangled, a number of major questions remain.

Is There a Cause-Effect Relationship between Chromatin Mobility and DSB Repair or Transcription?
The data obtained so far on the relationship between DSB repair and transcription, on one hand, and chromatin mobility, on the other, are largely correlative. One experiment that could establish causation would be to visualize the homology search step live and, at the same time, to target a remodeler that enhances movement to the template site and see whether the pairing itself occurs faster under these conditions. A similar experiment could be done in the case of uncapped telomeres in mammalian systems to ask whether they encounter each other at higher frequencies when there is more movement (for instance, in the absence of 53BP1). In the context of transcription, one can imagine following the mobility of a locus and, at the same time, the transcriptional output. The cell line needed for this experiment is already available (Janicki et al., 2004). In this assay, the locus is tagged with a lacO array, and the RNA is tagged with an MS2 binding consensus, which is bound by the bacteriophage protein MS2 fused to a fluorescent protein. The mobility of the DNA locus can be visualized while obtaining real-time quantitative data on transcriptional output. Such studies, although challenging, will certainly yield interesting results.

If the Induction of Chromatin Movement Is Intrinsic to a Subset of Nucleosome Remodelers, Which of Their Functions Actually Drives Locus Mobility?
Nucleosome remodeling appears to influence the movement of both damaged and undamaged chromatin. However, the changes that drive this movement remain unclear. We proposed that the displacement of nucleosomes by means of a chromatin

remodeler leads to a more flexible chromatin by disrupting the structure of the chromatin fiber. This may lead to a smaller persistence length, given that an open linker would be introduced in the midst of a higher-order structure (Neumann et al., 2012) (Figure 2). Targeting different chromatin remodelers with different biochemical activities and interrogating their effects on nucleosome positioning near the target site may help us decipher the mechanism through which remodelers influence chromatin mobility. The generation or disruption of higher-order chromatin structures may respectively restrain or promote locus mobility. Our model would vary in its details depending on which mode of folding is adopted by the nucleosome fiber (Grigoryev and Woodcock, 2012). There is not enough information at the moment to identify which specific changes to chromatin structure would enhance or restrain mobility. The simplest biophysical parameter that could account for movement of a polymer fiber, however, is the flexibility of the fiber.

What Is the Role of Cohesin in the Movement of Repair Foci?
Cohesin holds sister chromatids together (Nasmyth, 2011) and accumulates at sites of DSBs (Unal et al., 2004). These characteristics make it an appealing candidate to help control the movement of DSBs, especially those that occur spontaneously during replication or when forks encounter protein-DNA adducts. In these cases, the template for HR-mediated repair is readily available in the form of an undamaged sister chromatid held in place by cohesin. Although this may well restrain mobility, cohesin is clearly not the only factor that restricts movement at DSBs. In certain conditions that allow the visualization of damaged DNA in G1 phase cells, the constraint on damage mobility is nearly identical to that observed in S phase cells, even though there is no sister-sister cohesion in G1 (Dion et al., 2012).
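Comparisons of constraint such as the G1 versus S phase one above rest on tracking a tagged locus over time. The sketch below shows one common way to summarize such a track: a time-averaged mean-squared displacement (MSD) curve whose late-lag level reflects how tightly the locus is confined. It is an illustration only. The confined random walk is a toy model, every parameter value is hypothetical, and the factor that converts an MSD plateau into a radius of constraint depends on the convention used, so none is applied here.

```python
import random
from typing import List, Sequence, Tuple

Point = Tuple[float, float, float]


def confined_walk(n_steps: int, step_um: float, radius_um: float, seed: int = 0) -> List[Point]:
    """Toy 3D random walk that rejects any step leaving a sphere of radius_um."""
    rng = random.Random(seed)
    pos: Point = (0.0, 0.0, 0.0)
    track = [pos]
    for _ in range(n_steps):
        trial = tuple(c + rng.gauss(0.0, step_um) for c in pos)
        if sum(c * c for c in trial) <= radius_um ** 2:
            pos = trial       # accept moves that stay inside the sphere
        track.append(pos)     # a rejected move leaves the locus in place
    return track


def mean_squared_displacement(track: Sequence[Sequence[float]], max_lag: int) -> List[float]:
    """Time-averaged MSD for lags 1..max_lag of one evenly sampled track."""
    if max_lag < 1 or max_lag >= len(track):
        raise ValueError("max_lag must be between 1 and len(track) - 1")
    msd = []
    for lag in range(1, max_lag + 1):
        squared = [
            sum((b - a) ** 2 for a, b in zip(track[i], track[i + lag]))
            for i in range(len(track) - lag)
        ]
        msd.append(sum(squared) / len(squared))
    return msd


def plateau(msd: Sequence[float], tail_fraction: float = 0.25) -> float:
    """Mean of the last tail_fraction of the curve, a crude plateau estimate."""
    n_tail = max(1, int(len(msd) * tail_fraction))
    return sum(msd[-n_tail:]) / n_tail


# Hypothetical parameters: 0.05 um steps inside a 0.5 um sphere.
track = confined_walk(n_steps=500, step_um=0.05, radius_um=0.5, seed=1)
curve = mean_squared_displacement(track, max_lag=100)
print(f"MSD plateau estimate: {plateau(curve):.3f} um^2")
```

Two loci with similar plateau levels would be described as similarly constrained, whatever the mechanism that restrains them.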
Determining what effect chromatid-chromatid linkage through cohesin has on chromatin dynamics will go a long way toward elucidating the regulation of chromatin movement and its controlled release.

Is Enhanced Mobility Restricted to Sites of Damage in Yeast? If Not, What Drives Genome-wide Changes in Mobility, and What Is Its Purpose during Repair? Is This Found in Other Organisms?
Miné-Hattab and Rothstein (2012) showed that, upon DSB induction in budding yeast, there is also an increase in chromatin mobility at unrelated, undamaged loci. This is likely to depend on the dosage of damage incurred because a single DSB does not cause a similar increase at an ectopic locus in haploid cells (Dion et al., 2012). Although Miné-Hattab and Rothstein monitored the increase in movement in diploid cells, it is difficult to imagine mechanisms regulating chromatin mobility that are ploidy specific. To reconcile these divergent results, we propose that a threshold of damage is necessary to provoke Mec1/ATR activation. Given that a DNA checkpoint response is necessary for the increase in mobility at the break itself, its propagation to other sites may be dose dependent, requiring activation of the DDR. If, indeed, sufficient damage enhances chromatin movement genome wide, then it is possible that the checkpoint kinase Mec1/ATR and its downstream cascade are directly implicated in this phenomenon. Further work with appropriate mutants is

needed to identify what signals a global increase in chromatin movement in response to DNA damage. In addition to a checkpoint signaling cascade, it is conceivable that ectopic movement might also depend on chromatin remodelers or histone modifications. Finally, whatever the mechanism may be, it will be important to examine whether a global increase in chromatin mobility has functional implications for DSB repair, such as promoting the homology search required for HR (Miné-Hattab and Rothstein, 2012).

What Is the Contribution of DNA Mobility to the Genesis of Translocations?
Two models have been put forth to account for the generation of recurrent carcinogenic translocations in humans, called the "breakage first" and "contact first" models (Savage, 1996). The first model posits that breaks must occur first and then will roam throughout the nucleus until they find each other, leading to translocation between distant sites. The contact first model, on the other hand, predicts that the two breaks needed for a translocation will occur preferentially on juxtaposed chromosomes. Quite naturally, after breakage, these two sites would recombine at higher frequencies. Both of these models are extreme scenarios. Although it is unlikely that DSBs can explore an entire mammalian nucleus, given that its volume is 200 to 400 times larger than that of a typical yeast nucleus, it is also unlikely that mobility has no influence whatsoever on which DNA ends are ligated to each other. Richardson and Jasin (2000) showed unambiguously that two DSBs must occur before a translocation can be generated. It seems obvious that, even if DSBs are extremely mobile, they are still more likely to encounter break sites that occur close to their starting point than those that are further away. Indeed, recent large-scale studies have confirmed this, demonstrating that translocations tend to occur between sites that are spatially juxtaposed in the nucleus (Hakim et al., 2012; Zhang et al., 2012).
Arguing in favor of movement, on the other hand, Spehalski et al. (2012) showed that Myc-Igh translocations in mouse cells occur at the same frequency regardless of where the Igh locus is placed in the genome. Understanding what regulates DSB movement and its impact on specific recombination events is clearly important for understanding oncogenic translocations. We note, however, that there may be other reasons that damaged sites move. It may be important that a break moves far enough to encounter a nuclear compartment that favors repair or to move away from an environment rich in repetitive elements. Moving too far, on the other hand, may generate deleterious recombination events. The mechanisms that regulate chromatin mobility may thus influence genome stability. It is an intriguing thought that one might harness these observations on how chromatin movement impacts chromosomal translocations to design cancer therapies that minimize treatment-induced chromosome exchange.
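The size arguments used in this review can be turned into rough numbers with a few lines of arithmetic. The sketch below treats nuclei and explored regions as spheres, which is an assumption, and takes the 1.8 μm yeast diameter, the 200- to 400-fold volume ratio, and the 6%, 15%, and 50% explored-volume fractions from the text. The search-time factor further assumes plain Brownian scaling (time proportional to distance squared), which chromatin need not follow, so it is best read as an order-of-magnitude guide.

```python
YEAST_DIAMETER_UM = 1.8             # haploid yeast nucleus, as quoted in the text
VOLUME_FOLD_RANGE = (200.0, 400.0)  # mammalian vs. yeast nuclear volume, as quoted


def linear_fold(volume_fold: float) -> float:
    """Factor by which a sphere's diameter grows when its volume grows volume_fold times."""
    return volume_fold ** (1.0 / 3.0)


def radius_fraction(volume_fraction: float) -> float:
    """Radius of a sphere holding volume_fraction of the nucleus,
    expressed as a fraction of the nuclear radius."""
    return volume_fraction ** (1.0 / 3.0)


for fold in VOLUME_FOLD_RANGE:
    scale = linear_fold(fold)
    print(f"{fold:.0f}x volume -> {scale:.1f}x diameter "
          f"(~{YEAST_DIAMETER_UM * scale:.1f} um), "
          f"~{scale ** 2:.0f}x longer for a simple diffusive search")

for label, fraction in [("spontaneous Rad52 focus", 0.06),
                        ("protein-DNA adduct", 0.15),
                        ("DSB", 0.50)]:
    print(f"{label}: {fraction:.0%} of volume -> "
          f"{radius_fraction(fraction):.2f} of the nuclear radius")
```

Because volume grows with the cube of the radius, a 200- to 400-fold difference in volume corresponds to only a six- to seven-fold difference in diameter, yet even that is enough to lengthen a diffusive search by well over an order of magnitude under these assumptions.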

ACKNOWLEDGMENTS
We thank Fisun Hamaratoglu, Michael Hauer, and Andrew Seeber for critical reading of the manuscript. Work in the Gasser laboratory is supported by the Novartis Research Foundation, the Swiss National Science Foundation, and various EU Marie Curie Networks.


REFERENCES
Agarwal, S., van Cappellen, W.A., Guénolé, A., Eppink, B., Linsen, S.E., Meijering, E., Houtsmuller, A., Kanaar, R., and Essers, J. (2011). ATP-dependent and independent functions of Rad54 in genome maintenance. J. Cell Biol. 192, 735–750.
Agmon, N., Pur, S., Liefshitz, B., and Kupiec, M. (2009). Analysis of repair mechanism choice during homologous recombination. Nucleic Acids Res. 37, 5081–5092.
Aten, J.A., Stap, J., Krawczyk, P.M., van Oven, C.H., Hoebe, R.A., Essers, J., and Kanaar, R. (2004). Dynamics of DNA double-strand breaks revealed by clustering of damaged chromosome domains. Science 303, 92–95.
Aylon, Y., and Kupiec, M. (2003). The checkpoint protein Rad24 of Saccharomyces cerevisiae is involved in processing double-strand break ends and in recombination partner choice. Mol. Cell. Biol. 23, 6585–6596.
Barbaric, S., Luckenbach, T., Schmid, A., Blaschke, D., Hörz, W., and Korber, P. (2007). Redundancy of chromatin remodeling pathways for the induction of the yeast PHO5 promoter in vivo. J. Biol. Chem. 282, 27610–27621.
Barzel, A., and Kupiec, M. (2008). Finding a match: how do homologous sequences get together for recombination? Nat. Rev. Genet. 9, 27–37.
Bennett, C.B., Lewis, A.L., Baldwin, K.K., and Resnick, M.A. (1993). Lethality induced by a single site-specific double-strand break in a dispensable yeast plasmid. Proc. Natl. Acad. Sci. USA 90, 5613–5617.
Berg, H.C. (1993). Random Walks in Biology, Expanded Edition (Princeton, New Jersey: Princeton University Press).
Bornfleth, H., Edelmann, P., Zink, D., Cremer, T., and Cremer, C. (1999). Quantitative motion analysis of subchromosomal foci in living cells using four-dimensional microscopy. Biophys. J. 77, 2871–2886.
Bystricky, K., Van Attikum, H., Montiel, M.D., Dion, V., Gehlen, L., and Gasser, S.M. (2009). Regulation of nuclear positioning and dynamics of the silent mating type loci by the yeast Ku70/Ku80 complex. Mol. Cell. Biol. 29, 835–848.
Cabal, G.G., Genovesio, A., Rodriguez-Navarro, S., Zimmer, C., Gadal, O., Lesne, A., Buc, H., Feuerbach-Fournier, F., Olivo-Marin, J.C., Hurt, E.C., and Nehrbass, U. (2006). SAGA interacting factors confine sub-diffusion of transcribed genes to the nuclear envelope. Nature 441, 770–773.
Campos, E.I., and Reinberg, D. (2009). Histones: annotating chromatin. Annu. Rev. Genet. 43, 559–599.
Casolari, J.M., Brown, C.R., Komili, S., West, J., Hieronymus, H., and Silver, P.A. (2004). Genome-wide localization of the nuclear transport machinery couples transcriptional status and nuclear organization. Cell 117, 427–439.
Ceballos, S.J., and Heyer, W.D. (2011). Functions of the Snf2/Swi2 family Rad54 motor protein in homologous recombination. Biochim. Biophys. Acta 1809, 509–523.
Chen, X., Cui, D., Papusha, A., Zhang, X., Chu, C.D., Tang, J., Chen, K., Pan, X., and Ira, G. (2012). The Fun30 nucleosome remodeller promotes resection of DNA double-strand break ends. Nature 489, 576–580.
Chiolo, I., Minoda, A., Colmenares, S.U., Polyzos, A., Costes, S.V., and Karpen, G.H. (2011). Double-strand breaks in heterochromatin move outside of a dynamic HP1a domain to complete recombinational repair. Cell 144, 732–744.
Chuang, C.H., Carpenter, A.E., Fuchsova, B., Johnson, T., de Lanerolle, P., and Belmont, A.S. (2006). Long-range directional movement of an interphase chromosome site. Curr. Biol. 16, 825–831.
Chubb, J.R., Boyle, S., Perry, P., and Bickmore, W.A. (2002). Chromatin motion is constrained by association with nuclear compartments in human cells. Curr. Biol. 12, 439–445.
Clapier, C.R., and Cairns, B.R. (2009). The biology of chromatin remodeling complexes. Annu. Rev. Biochem. 78, 273–304.
Dimitrova, N., Chen, Y.C., Spector, D.L., and de Lange, T. (2008). 53BP1 promotes non-homologous end joining of telomeres by increasing chromatin mobility. Nature 456, 524–528.

Dion, V., Shimada, K., and Gasser, S.M. (2010). Actin-related proteins in the nucleus: life beyond chromatin remodelers. Curr. Opin. Cell Biol. 22, 383–391.
Dion, V., Kalck, V., Horigome, C., Towbin, B.D., and Gasser, S.M. (2012). Increased mobility of double-strand breaks requires Mec1, Rad9 and the homologous recombination machinery. Nat. Cell Biol. 14, 502–509.
Egecioglu, D., and Brickner, J.H. (2011). Gene positioning and expression. Curr. Opin. Cell Biol. 23, 338–345.
Flaus, A., Martin, D.M., Barton, G.J., and Owen-Hughes, T. (2006). Identification of multiple distinct Snf2 subfamilies with conserved structural motifs. Nucleic Acids Res. 34, 2887–2905.
Gartenberg, M.R., Neumann, F.R., Laroche, T., Blaszczyk, M., and Gasser, S.M. (2004). Sir-mediated repression can occur independently of chromosomal and subnuclear contexts. Cell 119, 955–967.
Gehlen, L., Gasser, S.M., and Dion, V. (2011). How broken DNA finds its template for repair: a computational approach. Prog. Theor. Phys. 191(Suppl. 191), 20–29.
Gotta, M., Laroche, T., Formenton, A., Maillet, L., Scherthan, H., and Gasser, S.M. (1996). The clustering of telomeres and colocalization with Rap1, Sir3, and Sir4 proteins in wild-type Saccharomyces cerevisiae. J. Cell Biol. 134, 1349–1363.
Grigoryev, S.A., and Woodcock, C.L. (2012). Chromatin organization - the 30 nm fiber. Exp. Cell Res. 318, 1448–1455.
Haaf, T., Golub, E.I., Reddy, G., Radding, C.M., and Ward, D.C. (1995). Nuclear foci of mammalian Rad51 recombination protein in somatic cells after DNA damage and its localization in synaptonemal complexes. Proc. Natl. Acad. Sci. USA 92, 2298–2302.
Haber, J.E. (2012). Mating-type genes and MAT switching in Saccharomyces cerevisiae. Genetics 191, 33–64.
Hakim, O., Resch, W., Yamane, A., Klein, I., Kieffer-Kwon, K.R., Jankovic, M., Oliveira, T., Bothmer, A., Voss, T.C., Ansarah-Sobrinho, C., et al. (2012). DNA damage defines sites of recurrent chromosomal translocations in B lymphocytes. Nature 484, 69–74.
Hediger, F., Neumann, F.R., Van Houwe, G., Dubrana, K., and Gasser, S.M. (2002). Live imaging of telomeres: yKu and Sir proteins define redundant telomere-anchoring pathways in yeast. Curr. Biol. 12, 2076–2089.
Hediger, F., Berthiau, A.S., van Houwe, G., Gilson, E., and Gasser, S.M. (2006). Subtelomeric factors antagonize telomere anchoring and Tel1-independent telomere length regulation. EMBO J. 25, 857–867.
Heun, P., Laroche, T., Raghuraman, M.K., and Gasser, S.M. (2001a). The positioning and dynamics of origins of replication in the budding yeast nucleus. J. Cell Biol. 152, 385–400.
Heun, P., Laroche, T., Shimada, K., Furrer, P., and Gasser, S.M. (2001b). Chromosome dynamics in the yeast interphase nucleus. Science 294, 2181–2186.
Ira, G., Malkova, A., Liberi, G., Foiani, M., and Haber, J.E. (2003). Srs2 and Sgs1-Top3 suppress crossovers during double-strand break repair in yeast. Cell 115, 401–411.
Jakob, B., Splinter, J., Durante, M., and Taucher-Scholz, G. (2009a). Live cell microscopy analysis of radiation-induced DNA double-strand break motion. Proc. Natl. Acad. Sci. USA 106, 3172–3177.
Jakob, B., Splinter, J., and Taucher-Scholz, G. (2009b). Positional stability of damaged chromatin domains along radiation tracks in mammalian cells. Radiat. Res. 171, 405–418.
Jakob, B., Splinter, J., Conrad, S., Voss, K.O., Zink, D., Durante, M., Löbrich, M., and Taucher-Scholz, G. (2011). DNA double-strand breaks in heterochromatin elicit fast repair protein recruitment, histone H2AX phosphorylation and relocation to euchromatin. Nucleic Acids Res. 39, 6489–6499.
Janicki, S.M., Tsukamoto, T., Salghetti, S.E., Tansey, W.P., Sachidanandam, R., Prasanth, K.V., Ried, T., Shav-Tal, Y., Bertrand, E., Singer, R.H., and Spector, D.L. (2004). From silencing to gene expression: real-time analysis in single cells. Cell 116, 683–698.


Kalocsay, M., Hiller, N.J., and Jentsch, S. (2009). Chromosome-wide Rad51 spreading and SUMO-H2A.Z-dependent chromosome fixation in response to a persistent DNA double-strand break. Mol. Cell 33, 335–343.
Keogh, M.C., Kim, J.A., Downey, M., Fillingham, J., Chowdhury, D., Harrison, J.C., Onishi, M., Datta, N., Galicia, S., Emili, A., et al. (2006). A phosphatase complex that dephosphorylates gammaH2AX regulates DNA damage checkpoint recovery. Nature 439, 497–501.
Krawczyk, P.M., Borovski, T., Stap, J., Cijsouw, T., ten Cate, R., Medema, J.P., Kanaar, R., Franken, N.A., and Aten, J.A. (2012). Chromatin mobility is increased at sites of DNA double-strand breaks. J. Cell Sci. 125, 2127–2133.
Kruhlak, M.J., Celeste, A., Dellaire, G., Fernandez-Capetillo, O., Müller, W.G., McNally, J.G., Bazett-Jones, D.P., and Nussenzweig, A. (2006). Changes in chromatin structure and mobility in living cells at sites of DNA double-strand breaks. J. Cell Biol. 172, 823–834.
Lans, H., Marteijn, J.A., and Vermeulen, W. (2012). ATP-dependent chromatin remodeling in the DNA-damage response. Epigenetics Chromatin 5, 4.
Lazzaro, F., Sapountzi, V., Granata, M., Pellicioli, A., Vaze, M., Haber, J.E., Plevani, P., Lydall, D., and Muzi-Falconi, M. (2008). Histone methyltransferase Dot1 and Rad9 inhibit single-stranded DNA accumulation at DSBs and uncapped telomeres. EMBO J. 27, 1502–1512.
Marshall, W.F., Straight, A., Marko, J.F., Swedlow, J., Dernburg, A., Belmont, A., Murray, A.W., Agard, D.A., and Sedat, J.W. (1997). Interphase chromosomes undergo constrained diffusional motion in living cells. Curr. Biol. 7, 930–939.
Michaelis, C., Ciosk, R., and Nasmyth, K. (1997). Cohesins: chromosomal proteins that prevent premature separation of sister chromatids. Cell 91, 35–45.
Miné-Hattab, J., and Rothstein, R. (2012). Increased chromosome mobility facilitates homology search during recombination. Nat. Cell Biol. 14, 510–517.
Molenaar, C., Wiesmeijer, K., Verwoerd, N.P., Khazen, S., Eils, R., Tanke, H.J., and Dirks, R.W. (2003). Visualizing telomere dynamics in living mammalian cells using PNA probes. EMBO J. 22, 6631–6641.
Morrison, A.J., Highland, J., Krogan, N.J., Arbel-Eden, A., Greenblatt, J.F., Haber, J.E., and Shen, X. (2004). INO80 and gamma-H2AX interaction links ATP-dependent chromatin remodeling to DNA damage repair. Cell 119, 767–775.
Morrison, A.J., Kim, J.A., Person, M.D., Highland, J., Xiao, J., Wehr, T.S., Hensley, S., Bao, Y., Shen, J., Collins, S.R., et al. (2007). Mec1/Tel1 phosphorylation of the INO80 chromatin remodeling complex influences DNA damage checkpoint responses. Cell 130, 499–511.
Nagai, S., Dubrana, K., Tsai-Pflugfelder, M., Davidson, M.B., Roberts, T.M., Brown, G.W., Varela, E., Hediger, F., Gasser, S.M., and Krogan, N.J. (2008). Functional targeting of DNA damage to a nuclear pore-associated SUMO-dependent ubiquitin ligase. Science 322, 597–602.
Nasmyth, K. (2011). Cohesin: a catenase with separate entry and exit gates? Nat. Cell Biol. 13, 1170–1177.
Nelms, B.E., Maser, R.S., MacKay, J.F., Lagally, M.G., and Petrini, J.H. (1998). In situ visualization of DNA double-strand break repair in human fibroblasts. Science 280, 590–592.
Neumann, F.R., Dion, V., Gehlen, L.R., Tsai-Pflugfelder, M., Schmid, R., Taddei, A., and Gasser, S.M. (2012). Targeted INO80 enhances subnuclear chromatin movement and ectopic homologous recombination. Genes Dev. 26, 369–383.
Ngo, H.P., and Lydall, D. (2010). Survival and growth of yeast without telomere capping by Cdc13 in the absence of Sgs1, Exo1, and Rad9. PLoS Genet. 6, e1001072.
Nielsen, I., Bentsen, I.B., Lisby, M., Hansen, S., Mundbjerg, K., Andersen, A.H., and Bjergbaek, L. (2009). A Flp-nick system to study repair of a single protein-bound nick in vivo. Nat. Methods 6, 753–757.
Oza, P., Jaspersen, S.L., Miele, A., Dekker, J., and Peterson, C.L. (2009). Mechanisms that regulate localization of a DNA double-strand break to the nuclear periphery. Genes Dev. 23, 912–927.

Pliss, A., Malyavantham, K., Bhattacharya, S., Zeitz, M., and Berezney, R. (2009). Chromatin dynamics is correlated with replication timing. Chromosoma 118, 459–470.
Richardson, C., and Jasin, M. (2000). Frequent chromosomal translocations induced by DNA double-strand breaks. Nature 405, 697–700.
Robinett, C.C., Straight, A., Li, G., Willhelm, C., Sudlow, G., Murray, A., and Belmont, A.S. (1996). In vivo localization of DNA sequences and visualization of large-scale chromatin organization using lac operator/repressor recognition. J. Cell Biol. 135, 1685–1700.
Savage, J.R. (1996). Insight into sites. Mutat. Res. 366, 81–95.
Segal, E., and Widom, J. (2009). What controls nucleosome positions? Trends Genet. 25, 335–343.
Soutoglou, E., Dorn, J.F., Sengupta, K., Jasin, M., Nussenzweig, A., Ried, T., Danuser, G., and Misteli, T. (2007). Positional stability of single double-strand breaks in mammalian cells. Nat. Cell Biol. 9, 675–682.
Spehalski, E., Kovalchuk, A.L., Collins, J.T., Liang, G., Dubois, W., Morse, H.C., 3rd, Ferguson, D.O., Casellas, R., and Dunnick, W.A. (2012). Oncogenic Myc translocations are independent of chromosomal location and orientation of the immunoglobulin heavy chain locus. Proc. Natl. Acad. Sci. USA 109, 13728–13732.
Steger, D.J., Haswell, E.S., Miller, A.L., Wente, S.R., and O'Shea, E.K. (2003). Regulation of chromatin remodeling by inositol polyphosphates. Science 299, 114–116.
Taddei, A., and Gasser, S.M. (2012). Structure and function in the budding yeast nucleus. Genetics 192, 107–129.
Taddei, A., Hediger, F., Neumann, F.R., Bauer, C., and Gasser, S.M. (2004). Separation of silencing from perinuclear anchoring functions in yeast Ku80, Sir4 and Esc1 proteins. EMBO J. 23, 1301–1312.
Taddei, A., Van Houwe, G., Hediger, F., Kalck, V., Cubizolles, F., Schober, H., and Gasser, S.M. (2006). Nuclear pore association confers optimal expression levels for an inducible yeast gene. Nature 441, 774–778.
Taddei, A., Schober, H., and Gasser, S.M. (2010). The budding yeast nucleus. Cold Spring Harb. Perspect. Biol. 2, a000612.
Torres-Rosell, J., Sunjevaric, I., De Piccoli, G., Sacher, M., Eckert-Boulet, N., Reid, R., Jentsch, S., Rothstein, R., Aragón, L., and Lisby, M. (2007). The Smc5-Smc6 complex and SUMO modification of Rad52 regulates recombinational repair at the ribosomal gene locus. Nat. Cell Biol. 9, 923–931.
Tsukuda, T., Fleming, A.B., Nickoloff, J.A., and Osley, M.A. (2005). Chromatin remodelling at a DNA double-strand break site in Saccharomyces cerevisiae. Nature 438, 379–383.
Unal, E., Arbel-Eden, A., Sattler, U., Shroff, R., Lichten, M., Haber, J.E., and Koshland, D. (2004). DNA damage response pathway uses histone modification to assemble a double-strand break-specific cohesin domain. Mol. Cell 16, 991–1002.
van Attikum, H., Fritsch, O., Hohn, B., and Gasser, S.M. (2004). Recruitment of the INO80 complex by H2A phosphorylation links ATP-dependent chromatin remodeling with DNA double-strand break repair. Cell 119, 777–788.
Vazquez, J., Belmont, A.S., and Sedat, J.W. (2001). Multiple regimes of constrained chromosome motion are regulated in the interphase Drosophila nucleus. Curr. Biol. 11, 1227–1239.
Walter, J., Schermelleh, L., Cremer, M., Tashiro, S., and Cremer, T. (2003). Chromosome order in HeLa cells changes during mitosis and early G1, but is stably maintained during subsequent interphase stages. J. Cell Biol. 160, 685–697.
Weber, S.C., Spakowitz, A.J., and Theriot, J.A. (2010). Bacterial chromosomal loci move subdiffusively through a viscoelastic cytoplasm. Phys. Rev. Lett. 104, 238102.
Weber, S.C., Spakowitz, A.J., and Theriot, J.A. (2012). Nonthermal ATP-dependent fluctuations contribute to the in vivo motion of chromosomal loci. Proc. Natl. Acad. Sci. USA 109, 7338–7343.


Weinert, T.A., and Hartwell, L.H. (1988). The RAD9 gene controls the cell cycle response to DNA damage in Saccharomyces cerevisiae. Science 241, 317–322.
Wiesmeijer, K., Krouwels, I.M., Tanke, H.J., and Dirks, R.W. (2008). Chromatin movement visualized with photoactivable GFP-labeled histone H4. Differentiation 76, 83–90.
Wilson, J.H., Leung, W.Y., Bosco, G., Dieu, D., and Haber, J.E. (1994). The frequency of gene targeting in yeast depends on the number of target copies. Proc. Natl. Acad. Sci. USA 91, 177–181.

Zhang, Y., McCord, R.P., Ho, Y.J., Lajoie, B.R., Hildebrand, D.G., Simon, A.C., Becker, M.S., Alt, F.W., and Dekker, J. (2012). Spatial organization of the mouse genome and its role in recurrent chromosomal translocations. Cell 148, 908–921.
Zimmer, C., and Fabre, E. (2011). Principles of chromosomal organization: lessons from yeast. J. Cell Biol. 192, 723–733.
Zink, D., Cremer, T., Saffrich, R., Fischer, R., Trendelenburg, M.F., Ansorge, W., and Stelzer, E.H. (1998). Structure and dynamics of human interphase chromosome territories in vivo. Hum. Genet. 102, 241–251.


Leading Edge

Review
When Lamins Go Bad: Nuclear Structure and Disease
Katherine H. Schreiber1 and Brian K. Kennedy1,2,*
1Buck Institute for Research on Aging, 8001 Redwood Boulevard, Novato, CA 94945, USA
2Aging Research Institute, Guangdong Medical College, Dongguan 523808, Guangdong, China
*Correspondence: bkennedy@buckinstitute.org
http://dx.doi.org/10.1016/j.cell.2013.02.015

Mutations in nuclear lamins or other proteins of the nuclear envelope are the root cause of a group of phenotypically diverse genetic disorders known as laminopathies, which have symptoms that range from muscular dystrophy to neuropathy to premature aging syndromes. Although precise disease mechanisms remain unclear, there has been substantial progress in our understanding of not only laminopathies, but also the biological roles of nuclear structure. Nuclear envelope dysfunction is associated with altered nuclear activity, impaired structural dynamics, and aberrant cell signaling. Building on these findings, small molecules are being discovered that may become effective therapeutic agents.
Introduction
Since their discovery more than 35 years ago as constituents of the nuclear lamina (Gerace et al., 1978), the nuclear lamins have been the subject of intense speculation regarding their possible roles in almost everything that happens in the nucleus. Early studies focused on biochemistry and cell biology, with the goal of achieving a basic understanding of the principles governing nuclear organization. The nuclear envelope entered the medical realm in the mid-1990s, when mutations in emerin were identified in patients with Emery-Dreifuss muscular dystrophy (EDMD) (Bione et al., 1994). The LMNA gene, encoding all A-type nuclear lamins, was linked to EDMD a few years later (Bonne et al., 1999), and links between nuclear structure and human disease have been studied extensively since then in labs throughout the world. Around 15 diseases, including a range of dystrophic and progeroid syndromes, are now attributed to LMNA mutations, and mutations in genes encoding associated nuclear envelope proteins cause an overlapping set of diseases. As a result, the questions and experimental approaches have evolved. Why do alterations in nuclear envelope proteins confer disease? What are the mechanisms underlying disease pathology? Do A-type lamins have a role in normal aging? Can effective therapies be developed for these debilitating diseases? Though a range of exciting discoveries have been made in the last decade, many unknowns remain. Here, we seek to frame the current questions, propose possible paths toward mechanistic understanding, and briefly evaluate the therapeutic possibilities that are starting to emerge. Given the amount of interest and momentum in the lamin field, it is feasible that therapies to rescue the pathogenic consequences of misbehaved nuclear structural components will be developed in the not-too-distant future.
Nuclear Lamins
The nuclear envelope is composed of two membranes: the outer nuclear membrane, which is continuous with the endoplasmic reticulum, and the inner nuclear membrane, which associates with the nuclear lamina. Nuclear pore complexes perforate the nuclear envelope to allow transport between the cytoplasm and nucleus. The nuclear lamina is primarily composed of nuclear lamins, which were originally identified as lamins A, B, and C (Gerace et al., 1978). These proteins constitute the only class of intermediate filament proteins in the nucleus and form associated filamentous structures that underlie the nuclear envelope and interact with neighboring proteins (Gerace and Huber, 2012). Lamins A and C, as well as two other variants (C2 and AΔ10), are classed as A-type lamins and are encoded by the LMNA gene through alternative splicing. Three different lamin B family members (B-type lamins) are encoded by two genes (lamin B1 by LMNB1 and lamins B2 and B3 by LMNB2). A- and B-type lamins have fundamentally different properties, perhaps most importantly by virtue of their different isoelectric points, which dictate that B-type lamins stay associated with the nuclear envelope during mitosis while A-type lamins become soluble. Expression patterns differ as well, with B-type lamins expressed in most or all cell types and A-type lamins expressed during cell differentiation in many different developmental lineages (Röber et al., 1989). At the cellular level, both classes of proteins have been ascribed structural roles in the nucleus as well as a range of other activities, including coordination of transcription and replication. While specific functions of A-type lamins remain somewhat elusive, a number of recent discoveries point to key interactions between lamins and cell proliferation, differentiation, and stress response pathways.
Both A- and B-type lamins undergo posttranslational processing based on a C-terminal CaaX motif that dictates a series of modifications (Weber et al., 1989); only lamin C avoids this by virtue of alternative splicing of the LMNA transcript that lacks the C terminus. As a first step, the cysteine residue is farnesylated. Next, proteolytic processing leads to cleavage after the cysteine residue, followed by a carboxymethylation of the new C-terminal
Cell 152, March 14, 2013 2013 Elsevier Inc. 1365

residue. Many membrane-associated proteins, including Ras, undergo this processing event. However, in the case of lamin A, isoprenylation is a transient event, as a second proteolytic event mediated by the zinc metalloproteinase Zmpste24 leads to excision of another 15 amino acids. Due to this cleavage, mature lamin A lacks the modified cysteine. This process is clearly important to pathologic states, as laminopathies are linked to altered processing of lamin A, as well as loss-of-function mutations in ZMPSTE24. The reasons for farnesylation of lamin A remain to be elucidated despite extensive efforts. Until recently, the thinking has been that the transient farnesylation event was needed, through association of the hydrophobic farnesyl group with the nuclear envelope, to provide initial recruitment of lamin A to the nuclear periphery (Hennekes and Nigg, 1994). After assembly into filaments, farnesylation of lamin A may no longer be required. Consistent with this hypothesis, the nucleus has been shown to be the site of both lamin A carboxymethylation and proteolytic cleavage by ZMPSTE24. However, several recent studies using mice and/or cells engineered to express mutant forms of lamin A indicate that farnesylation is not required for recruitment (Davies et al., 2011). For instance, when only a nonfarnesylated version of lamin A is expressed, normal localization of the lamin A variant to the nuclear periphery was observed (Davies et al., 2010; Lee et al., 2010), although mice generated in this manner develop cardiomyopathy (see below). In addition, mice expressing only lamin C (not farnesylated) or a mature (preprocessed) lamin A are (surprisingly) normal and have apparently correct localization of the respective protein to the nuclear periphery.
Though these studies do not preclude a more subtle role for lamin A processing in filament assembly or envelope association, they raise serious questions about the importance of these events in the mouse and provide an interesting puzzle to be pieced together by future studies.
Diseases Linked to Mutations in Nuclear Structure Proteins
The number of different diseases linked to mutations in LMNA, at least 15 by now, surpasses that of any other human gene. It is hard to establish absolute numbers because many of the associated syndromes have overlapping pathologies. Nevertheless, the range of tissues and functions that can be adversely affected by mutation in LMNA is striking (Table 1). Diseases include the aforementioned Emery-Dreifuss muscular dystrophy (EDMD2/3) (Bonne et al., 1999) and a second muscular dystrophy (Limb-girdle, LGMD1B) (Muchir et al., 2000) that affects different skeletal muscle groups. Patients with both forms of muscular dystrophy also present with dilated cardiomyopathy, which is often the cause of mortality. Other LMNA mutations lead to dilated cardiomyopathy (CMD1A) without skeletal muscle involvement (Fatkin et al., 1999). Finally, a form of congenital muscular dystrophy has more recently been linked to mutations in LMNA (Quijano-Roy et al., 2008), as well as Heart-hand syndrome, which couples a range of cardiac defects to brachydactyly (Renou et al., 2008). Pathology associated with LMNA mutations is not restricted to striated muscle tissue, as other diseases confer loss of adipose tissue, including Dunnigan-type familial partial lipodystrophy

Table 1. Diseases Caused by Mutations in Genes Encoding Lamins and Lamin-Associated Proteins

Disease                                        Gene Mutated

Striated Muscle Diseases
Emery-Dreifuss muscular dystrophy              LMNA, EDMD, SYNE1, SYNE2, TMEM43, TMPO
Limb-girdle muscular dystrophy                 LMNA
Dilated cardiomyopathy                         LMNA, EDMD, SYNE1, SYNE2, TMEM43, TMPO
Congenital muscular dystrophy                  LMNA
Heart-hand syndrome                            LMNA

Lipodystrophy Syndromes
Dunnigan-type familial partial lipodystrophy   LMNA
Mandibuloacral dysplasia                       LMNA, ZMPSTE24
Lipoatrophy                                    LMNA
Partial lipodystrophy                          LMNB2

Accelerated Aging Disorders
Atypical Werner syndrome                       LMNA
Hutchinson-Gilford progeria syndrome           LMNA
Restrictive dermopathy                         LMNA, ZMPSTE24
Atypical progeria syndrome                     BANF1

Peripheral Nerve Disorders
Charcot-Marie-Tooth disease                    LMNA
Adult-onset leukodystrophy                     LMNB1
Spinocerebellar ataxia type 8                  SYNE1

Bone Diseases
Buschke-Ollendorff syndrome                    LEMD3
Melorheostosis                                 LEMD3
Osteopoikilosis                                LEMD3
Greenberg skeletal dysplasia                   LBR

Other
Pelger-Huet anomaly                            LBR
Arthrogryposis                                 SYNE2

(FPLD2) (Shackleton et al., 2000), Mandibuloacral dysplasia (MAD) (Novelli et al., 2002), generalized lipoatrophy (Caux et al., 2003), restrictive dermopathy (RD) (Navarro et al., 2004), and other overlapping disorders. Highlighting the importance of lamin A processing, mutations resulting in loss of ZMPSTE24 function, which result in partially processed lamin A, lead to both MAD and RD (Agarwal et al., 2003; Navarro et al., 2005). However, links between lamin A processing and pathology extend beyond mutations in ZMPSTE24 and connect with another set of disorders termed progeroid, which give rise to the appearance of premature aging. The most noted of these is Hutchinson-Gilford progeria syndrome (HGPS), a severe disorder for which symptoms, including cachexia, alopecia, and atherosclerosis, become apparent shortly after birth. Death results from heart attack or stroke usually before the patient reaches the age of 20. The most common LMNA mutation leading to HGPS, G608G, does not affect coding sequence but instead creates a cryptic splice site leading to removal of

50 amino acids in the C terminus of lamin A (De Sandre-Giovannoli et al., 2003; Eriksson et al., 2003), resulting in a protein named progerin. A similar splicing mutant has been identified that leads to removal of an extra 40 amino acids (90 total) in a patient diagnosed with RD (Navarro et al., 2004), leading to speculation that RD is a more severe version of HGPS, though the two diseases do not overlap entirely. This splicing event removes the cleavage site for ZMPSTE24, creating a permanently farnesylated protein that likely causes a dominant gain-of-function toxicity. Other mutations in LMNA that do not obviously affect C-terminal splicing lead to HGPS as well as other generally less severe progeroid pathologies (Cao and Hegele, 2003; Chen et al., 2003; Verstraeten et al., 2006). Finally, with regard to LMNA mutations, homozygous loss of lamin A function leads to Charcot-Marie-Tooth syndrome, characterized by loss of peripheral nerve myelination (De Sandre-Giovannoli et al., 2002). Before leaving A-type lamins, it is worth noting the interesting connections that have arisen with cancer progression (Butin-Israeli et al., 2012). Most laminopathies are not associated with cancer, but an increasing range of tumors are characterized by downregulation of A-type lamin expression (Broers et al., 1993; Kaufmann, 1992), though results differ in tumor types. Recalling that this family of lamins is expressed in differentiated cells, but not stem cells, speculation has developed that A-type lamins may act as tumor suppressors, perhaps by blocking dedifferentiation into a more stem-cell-like state. A-type lamins have also been ascribed roles in regulating cell proliferation and the DNA damage response, either of which could be linked to cancer progression (Redwood et al., 2011). Among these activities, A-type lamins are required to stabilize the retinoblastoma tumor suppressor protein (Johnson et al., 2004).
This may be relevant because the one tumor described in HGPS patients (the sample size is quite small) is an early onset osteosarcoma (Shalev et al., 2007), one of the most common tumors linked to homozygous mutation of the Rb locus (Friend et al., 1986). Although progerin can stabilize pRb levels (Nitta et al., 2006), the HGPS patient with osteosarcoma had a rare T623S LMNA mutation that has not been tested with regard to pRb stability (Shalev et al., 2007). Mutations in genes encoding other nuclear envelope proteins are also associated with disease (described in greater detail in Méndez-López and Worman, 2012). In addition to emerin and LMNA, mutations in SYNE1 and SYNE2 (encoding nesprin-1 and nesprin-2), TMEM43 (encoding LUMA), and TMPO (encoding LAP2-α) are all associated with dilated cardiomyopathy and muscular dystrophy (Liang et al., 2011; Taylor et al., 2005; Zhang et al., 2007). These genes encode proteins that all interact as part of the linker of nucleoskeleton and cytoskeleton (LINC) complex, suggesting that altered LINC function may underlie striated muscle pathology (Puckelwartz and McNally, 2011). Unrelated SYNE1 and SYNE2 mutations are also linked to autosomal-recessive spinocerebellar ataxia type 8 and autosomal-recessive arthrogryposis, respectively (Attali et al., 2009; Gros-Louis et al., 2007). LEMD3, encoding MAN1, an LEM-domain-containing protein, is also associated with disease, with mutations linked to a series of disorders associated with increased bone density (Hellemans et al., 2004). In addition, mutations in BANF1, encoding the nuclear envelope protein BAF that binds DNA and is involved in chromatin organization and

nuclear envelope assembly, are associated with Atypical progeria (Puente et al., 2011). Not to be left out, LMNB1 and LMNB2 mutations are both linked to rare diseases. Autosomal-dominant mutations in LMNB1 lead to adult-onset leukodystrophy, which is characterized by central nervous system demyelination (Padiath et al., 2006). In the case of LMNB2, individuals with heterozygous mutations are susceptible to acquired partial lipodystrophy, likely triggered by one of several autoimmune diseases (Hegele et al., 2006). Finally, the lamin B receptor (LBR), which interacts with B-type lamins and may serve to help link them to the nuclear envelope and chromatin, is also a target for mutation in two syndromes: homozygous mutations in LBR cause Greenberg skeletal dysplasia (Waterham et al., 2003), whereas heterozygous mutations are associated with Pelger-Huet anomaly, a benign condition characterized by altered chromatin organization in granulocytes (Best et al., 2003; Hoffmann et al., 2002). Given the rate of new discoveries of disease association with nuclear structural factors, it is fair to speculate that new associations between disease and nuclear proteins will continue to emerge.
Disease Mechanisms: Mouse Models Lead the Way
How could altered function of nuclear structural components lead to such a wide range of diseases? A decade or so ago, there were few connections between lamins and known disease mechanisms; however, lamins were known to be important for a wide range of nuclear functions, including replication and transcription. Many of the initial ideas were based on changes observed at the level of cell biology. For instance, the shape of the nucleus was found to be disrupted in fibroblasts lacking A-type lamins, with enhanced nuclear deformation and sensitivity to mechanical stress (Lammerding et al., 2004).
Emerin-deficient cells have similar properties, and reduced mechanical stress could explain part of the pathology associated with diseases such as dilated cardiomyopathy and muscular dystrophy, where affected tissues are under regular strain (Lammerding et al., 2005). However, cells isolated from human and mouse tissue from the various laminopathies all display abnormal nuclear structure. These phenotypes range from abnormal nuclear shape to nuclear blebbing and even dispersal of DNA into the cytoplasm. While these observations may relate to disease, they do not clearly differentiate one laminopathy from another, and researchers have turned to more detailed assessments of cellular function to generate more recent hypotheses. Theories to explain the pathology associated with nuclear structure defects have emerged largely from two areas: an extensive set of mouse models and, more recently, studies of stem cells expressing a range of mutant forms of A-type lamins. An informative starting point for the former was the generation of mice lacking A-type lamins (Sullivan et al., 1999). In addition to being cachexic, these mice present with a subset of the pathologies associated with LMNA mutation, including muscular dystrophy, dilated cardiomyopathy, and Charcot-Marie-Tooth syndrome, and succumb to the cardiac phenotype at about 6 weeks of age. Lmna+/− heterozygous mice also develop the cardiac pathology, although at a slower rate, and mice expressing two different LMNA alleles associated with striated muscle

disease recapitulate at least some of the human phenotypes (Arimura et al., 2005; Mounkes et al., 2005). One assertion arising from these findings is that the muscle and peripheral myelination diseases result from reduced A-type lamin function. This is not surprising for Charcot-Marie-Tooth syndrome, which is a recessive disorder in humans (De Sandre-Giovannoli et al., 2002). However, both dominant and recessive mutations have been identified in the muscle pathologies, and one possibility is that autosomal-dominant alleles have a dominant-negative effect, interfering with intermediate filament assembly or some other property of A-type lamins. Haploinsufficiency also likely explains the onset of disease in many cases. It should be noted that the Lmna−/− mouse described originally may in fact not be a null allele of the gene (Sullivan et al., 1999). Recent evidence suggests that this mouse expresses a still incompletely characterized, truncated 54 kDa protein derived from a splicing event that bypasses the removed exons (Jahn et al., 2012). While the dust has not settled from this finding, most data suggest that the lamin A variant expressed in this mouse is hypomorphic. Interestingly, another Lmna−/− model has been derived through disruption with a reporter gene, and this mouse presents with defective development of heart, liver, and somites, leading to death before weaning (Kubben et al., 2011). This latter mouse is more consistent with a homozygous LMNA nonsense mutation that resulted in the complete absence of A-type lamins and was associated with the death of a newborn patient (van Engelen et al., 2005). Clearly, these findings call for some re-evaluation of studies performed in the Lmna−/− mouse despite its past and continued value to the field. One recent highly informative mouse model was engineered to homozygously express a nonfarnesylated version of lamin A in the absence of lamin C (Davies et al., 2010).
These mice were expected to resemble the phenotype of mice lacking ZMPSTE24 (see below) but instead present with cardiomyopathy. The investigators sought to determine whether the cardiac pathology was attributable to gain-of-function toxicity or a reduced hypomorphic function of the lamin A variant. To distinguish, they generated a mouse expressing a nonfarnesylated allele over a null, finding that this mouse has a more severe phenotype, consistent with further reduced lamin A function. If the pathology was a result of toxicity, the heterozygous mouse would have had a less severe cardiac phenotype. These findings are consistent with the data that cardiomyopathy derives from reduced lamin A function. While striated muscle pathology represents one cluster of mouse LMNA models, progeria characterizes the other. In this case, the data are generally supportive of a model whereby lamin A variants with processing defects show dominant onset of a subset of features associated with HGPS. These models are covered in greater detail in a recent review (Zhang et al., 2013). Recall that the primary human lesion associated with HGPS is a heterozygous G608G mutation, which creates a splicing defect and leads to permanently farnesylated lamin A. Much debate centers around which of the many different HGPS models are the best to develop mechanistic explanations and therapies for human patients. Most of the models, including Lmna mutants and Zmpste24−/−, present with a subset of phenotypes that are characteristic of progeroid mice, including cachexia,

reduced bone density and rib fractures, loss of subcutaneous fat, kyphosis, alopecia, and premature death. However, a model generated to express human progerin from a BAC clone does not exhibit these phenotypes, instead displaying arterial smooth muscle defects (Varga et al., 2006). While the reasons for the differences are unknown, both types of models may have advantages. For instance, the BAC progerin model mimics atherosclerosis, which by leading to heart attacks and strokes results in mortality in most patients. Therefore, studies in this mouse explore effects on what may be the most important pathology in children with disease. However, the rapid presentation and wider array of phenotypes in the other mice offer clear advantages as well. Of note, some of the progeria model mice display cardiac defects that are more consistent with dilated cardiomyopathy (Davies et al., 2010; Yang et al., 2011). One point worth considering is that a LMNA mutation could lead to gain-of-function toxicity for some phenotypes and loss-of-function for others. In the next two sections, we focus on the two classes of LMNA-associated disease about which we understand the most: striated muscle disease and progeroid disorders. The exciting progress in these two areas has led to possible therapeutic approaches.
Disease Mechanisms and Possible Therapies for LMNA-Associated Striated Muscle Diseases
Interesting findings have emerged on several fronts with respect to LMNA-associated dilated cardiomyopathy with conduction defects and muscular dystrophies. While these findings do not yet come together in a neat package, continued studies may begin to generate such a composite understanding. The fact that LMNA mutants leading to EDMD2/3 so closely resemble X-linked EDMD that is caused by Emerin mutations is a critical consideration for any mechanistic disease model.
Unlike A-type lamins, emerins reside in the inner and outer nuclear membranes, interacting with lamins in the former case and with microtubules in the latter. Lamin A/C binding to emerin is required for its localization to the nuclear envelope (Vaughan et al., 2001). This raises the possibility that emerins might be a conduit by which the nuclear lamina communicates with the cytoskeleton. However, no clear understanding has emerged as to how and why the lamin A/C-emerin interaction is important in skeletal and cardiac muscle. The linker of nucleoskeleton and cytoskeleton (LINC) protein complex, consisting of SUN1 and -2 as well as Nesprin 1 and -2, also connects A-type lamins to the cytoskeleton with Sun proteins directly interacting with lamin A/C at the inner nuclear membrane and Nesprins in the lumen (Méjat and Misteli, 2010). Nesprins cross the outer nuclear membranes and connect to the cytoskeleton in the cytoplasm. In addition to linking the nucleo- and cytoskeleton, LINC complexes have a wide range of cellular functions, including in cell division, in centrosome-nucleus association, in nuclear migration, and in positioning. Disruption of any of these activities could contribute to disease progression. A recent study has implicated SUN1 in disease progression, albeit through an unexpected mechanism. In Lmna−/− mice, SUN1 is dramatically overexpressed and directed to the Golgi, presumably after nuclear occupancy sites are saturated (Chen et al., 2012a). RNAi-mediated knockdown of

SUN1 rescued nuclear defects in cell culture, and knockout of SUN1 significantly extended the survival of Lmna−/− mice. While many questions remain unresolved, this report suggests that one significant problem associated with reduced A-type lamin function is SUN1-mediated toxicity in the Golgi. Another intermediate filament factor, desmin, serves as a linking factor between lamins and many cytoplasmic structures in striated muscle cells. Desmin mutations can result in desmin-related myopathies (DRM), which are characterized by cardiac and skeletal muscle weakness with highly variable presentation. Inherent in DRM at the cellular level is disruption of desmin filaments and accumulation of desmin-containing protein aggregates. Interestingly, cardiomyocytes from Lmna−/− mice display disrupted desmin networks and elevated protein levels (Nikolova et al., 2004). This may not be the case for skeletal muscle, as electron micrographs of muscle biopsies from human patients failed to detect abnormal desmin localization (Frock et al., 2012; Piercy et al., 2007). The authors of this study also looked at murine embryonic stem cells transfected with a human EDMD mutation and differentiated into cardiomyocytes, finding no defects in desmin localization. These latter findings appear to differ from the in vivo studies described above and may suggest that knockout of A-type lamins, as opposed to expression of an EDMD missense mutation, is required to induce abnormal desmin localization. Alternatively, the cell culture model may not recapitulate events regarding desmin. Myoblasts generated from Lmna−/− mice are reported to have differentiation defects, suggesting that reduced regenerative potential of adult stem cells may combine with increased damage to myofibers from mechanical stress sensitivity to explain the rapid onset of dystrophic pathology (Frock et al., 2006).
Interestingly, a small percentage of Lmna−/− myoblasts responds normally to differentiation signals, whereas a majority fails to induce the differentiation program. The majority of proliferating Lmna−/− myoblasts also display reduced levels of both MyoD and desmin. Stable transfection of desmin rescues the differentiation defects of these cells, implying that reduced desmin levels during the proliferation phase may, in part, be responsible for the inability of cells to respond to differentiation cues. Stable expression of MyoD also rescues differentiation defects. With respect to EDMD mutations, MyoD-transformed human patient fibroblasts were reported to differentiate normally (Piercy et al., 2007). Again, the differences may be attributable to the relative severity of the LMNA mutation, or they may have been suppressed in the latter case due to artificially high MyoD levels (Frock et al., 2006; Piercy et al., 2007). In recent years, it has become apparent that Lmna mutations can lead to altered activation of major signal transduction pathways in the cell (Figure 1). While the mechanisms connecting the nuclear envelope to cell signaling have not been fully elucidated, the findings are important because (1) altered signaling can be linked to pathological progression, and (2) in some cases, small molecules are available as therapeutic options to correct signaling defects. In cardiac tissue, three different branches of the MAP-kinase-signaling pathways have been found to be aberrantly activated in a mouse model homozygously expressing the human LMNA H222P mutant associated with dilated cardiomyopathy (Muchir et al., 2007b, 2012). One of these, the

Figure 1. Signaling Pathways Disrupted by LMNA Mutations


Recent years have seen several discoveries of signal transduction pathways that are altered in LMNA mutant backgrounds associated with gain-of-function toxicity (those involved in Progeria), loss-of-function toxicity (i.e., hypomorphic), or both. A list of pathways is provided, as described in detail in the text.

extracellular signal-regulated kinase 1/2 (ERK1/2) pathway, was also upregulated in Emerin-deficient mice, while the Jun N-terminal kinase (JNK) pathway was not elevated and the p38α pathway remains to be tested (Muchir et al., 2007a). Elevated ERK1/2 phosphorylation has also been detected in human cancer cell lines in which A-type lamin or Emerin expression was inhibited by an siRNA approach and in cardiac tissue from Lmna−/− mice (Frock et al., 2012; Muchir et al., 2009b). In Lmna−/− hearts, aberrant phosphorylation could be corrected by restoration of lamin A expression specifically in cardiomyocytes, indicating that the defects are cell autonomous (Frock et al., 2012). Finally, elevated p38α phosphorylation has been detected in heart tissue from human dilated cardiomyopathy patients (Muchir et al., 2012). A variety of MAP kinase inhibitors have been generated, and many have been tested in the clinic for other disease indications. Worman and colleagues have tested several of these in LmnaH222P/H222P mice, finding that inhibition of each branch of the MAP kinase pathway delays either onset or progression of cardiac symptoms (Muchir et al., 2009a, 2012; Wu et al., 2010). Given that some of these inhibitors appear to be relatively well tolerated in humans, these findings lead to a potential therapeutic route for dilated cardiomyopathies associated with LMNA mutation. Potential benefits for muscular dystrophy have not been assessed. How does LMNA mutation lead to activation of the MAP kinase pathways? While the answer to this question remains to be determined, ideas have emerged. For instance, MAP kinases are known to be activated by mechanical stress, and reduced A-type lamin function is associated with impaired activation of mechanosensitive genes in cardiomyocytes. A second more direct model has potentially emerged that involves direct

interaction between ERK1/2 and A-type lamins in the nucleus. ERK1/2 is reported to interact with lamin A and the retinoblastoma protein (pRb) at the nuclear periphery. Stabilization of pRb by A-type lamins is important to maintain normal cell-cycle control (Nitta et al., 2006). Upon serum stimulation of quiescent cells, ERK1/2 phosphorylates c-Fos, releasing it to stimulate AP-1 activation, and also dislodges pRb from A-type lamins, leading to pRb phosphorylation and E2F activation (González et al., 2008; Ivorra et al., 2006; Rodríguez et al., 2010). It is unclear presently how disruption of the ERK1/2-A-type lamin interaction by LMNA mutation affects ERK1/2 activation, but this question needs to be investigated. Equally unclear are the pathways downstream of MAP kinases that mediate cardiac pathology. Two possibilities have emerged. The first involves an observation that connexins are mislocalized in mice expressing a different mutant associated with DCM (N195K). Here, connexin 43 was found to be mislocalized and not associated with gap junctions, a finding that could explain conduction defects associated with altered A-type lamin function (Mounkes et al., 2005). Expression of another DCM mutant (E82K) was found to lead to downregulation and mislocalization of connexin 43 in neonatal myocytes (Sun et al., 2010). Finally, a recent study has demonstrated mislocalization of connexin 43 in heart cardiomyocytes of Lmna−/− mice (Frock et al., 2012). Re-expression of lamin A rescued aberrant ERK1/2 phosphorylation and restored connexin 43 localization. Given that connexins are known substrates of ERK1/2, the possibility exists that aberrant activity of this pathway disrupts normal connexin 43 localization and interferes with cardiac conduction (Chen et al., 2012b). Two recent studies point to the involvement of another major signal transduction pathway in LMNA-related cardiac and skeletal muscle disease.
In Lmna−/− mice, the mTORC1 pathway was found to be upregulated in cardiac and skeletal muscle, leading at least in the heart to impaired autophagy (Ramos et al., 2012). Reduced mTORC1 signaling by the specific kinase inhibitor rapamycin led to enhanced cardiac function and survival, with indications of improved skeletal muscle function; the latter possibility needs to be more fully explored. A similar study conducted in the LmnaH222P/H222P mouse led to highly overlapping findings, suggesting that aberrant mTORC1 signaling may be a common feature of this class of laminopathies (Choi et al., 2012). Among several upstream activators of mTORC1 are ERK1/2 MAP kinases, and one possibility is that increased mTORC1 signaling occurs by this mechanism. However, there are numerous upstream activators of mTORC1 that need to be more fully explored. The possibility of testing rapamycin as a treatment for LMNA-associated DCM is intriguing because the drug has been tested in a wide range of clinical trials and is approved for multiple disease indications. However, there are side effects such as dyslipidemia and impaired insulin signaling that, while generally manageable, must be considered for treatment of cardiac disease. Elevated mTORC1 signaling, which is classically associated with increased protein translation and cell growth, is already linked to forms of cardiac hypertrophy. However, general levels of translation do not appear to be elevated in the Lmna−/− heart (Ramos et al., 2012), suggesting that other pathways are offsetting the translational effects of

Figure 2. Potential Therapeutic Approaches to Laminopathies


Several small molecules have been proposed as treatments for laminopathies. The major ones are listed with arrows indicating the diseases for which they may have efficacy. Question marks indicate that animal data have yet to be presented. Notably, FTIs have been tested in human children with HGPS, with promising initial results (Gordon et al., 2012).

mTORC1 in this scenario. Interestingly, rapamycin has been reported to improve autophagic flux and suppress nuclear blebbing in fibroblasts expressing progerin, indicating that suppression of the mTOR pathway may be efficacious in LMNA-associated progeria models as well (Cao et al., 2011). Given the remarkable progress in this cluster of LMNA-associated diseases, it has been possible to move from identification of LMNA mutations in EDMD and DCM to possible therapeutic approaches in less than two decades (Figure 2). Whether the current drugs will prove efficacious in humans remains to be seen. Even if this is not the case, new candidate therapeutic approaches will surely continue to emerge.
Disease Mechanisms and Possible Therapies for LMNA-Associated Progerias
Although very rare, progeria syndromes have long been of great interest, based in part on the hypothesis that, by learning the mechanisms underlying their pathology, insights will be made into the normal aging process. This assumption is yet to be validated, and researchers in the aging field have a wide range of viewpoints. One thing is clear: the studies into LMNA-associated progerias have yielded major biological insights and have provided hope that therapeutic approaches can be developed to slow the impact of these very severe syndromes. In this section, the latest findings in progeria and lamin A processing will be discussed. A large body of work suggests that HGPS mutants in LMNA at least in part confer toxicity by virtue of being permanently farnesylated. Several deformations of the nucleus were found in cells expressing progerin or other nonfarnesylated versions of lamin A, and several studies indicated that these phenotypes could be rescued by a class of drugs that inhibit farnesyltransferases (Young et al., 2006). These drugs were initially generated based on their ability to block Ras farnesylation and the promise that this would inhibit tumor progression.
Though cancer studies continue, their development has been fortuitous for the study of HGPS. Not only do they rescue cellular defects, but they also have beneficial properties when delivered to HGPS mouse models, extending survival and improving other physiological readouts, including bone and cardiovascular defects (Capell et al., 2008; Yang et al., 2008b). These findings, together with the fact that FTIs have good safety profiles in the clinic, were cause for great optimism, leading to the first clinical trial in human patients with HGPS. Initial findings were recently reported showing variable rates of improvement in vascular function, enhanced bone rigidity, and improved sensorineural hearing in 25 patients treated with lonafarnib for at least 2 years (Gordon et al., 2012).

One reason FTIs may have limited potency is that lamin A variants can become geranylgeranylated, especially when farnesyltransferase activity is blocked (Varela et al., 2008). This has led to the assumption that blocking the HMG-CoA reductase pathway upstream, in a manner that inhibits both lamin A modifications, might have enhanced efficacy. Consistent with this, combined treatment of Zmpste24-/- mice with two such agents, statins and aminobisphosphonates, enhances survival and improves several pathologies.

Another potential approach has emerged in a mouse that is genetically engineered to have the exact G608G mutation (G609G in mice) (Osorio et al., 2011). As in the human case, alternative splicing leads to progerin production and progeroid phenotypes. Interestingly, treatment of the mice with a morpholino-based therapy that prevents pathogenic splicing delays pathology and extends survival, suggesting an alternative therapeutic approach.

Genetic studies support the toxicity of farnesylated lamin A in progerias. For instance, mice lacking Zmpste24 develop progeroid features linked to the toxicity of unprocessed lamin A, as deletion of one copy of Lmna in this background improves the range of phenotypes (Fong et al., 2004).
Extensive studies by Young and colleagues have further elucidated the role of farnesylation in vivo. Mice engineered to express a nonfarnesylated version of progerin still develop progeroid features, albeit at a slower rate (Yang et al., 2008a). However, mice expressing a nonfarnesylated version of prelamin A do not develop progeroid features, as described earlier (Davies et al., 2010). One possible interpretation of these studies is that farnesylation may be required for toxicity in the case of prelamin A but that the 50 amino acid deletion in progerin also contributes to disease progression.

Several lines of evidence implicate enhanced DNA damage and/or an impaired DNA damage response pathway in the etiology of HGPS. HGPS cells have higher levels of reactive oxygen species (ROS) and greater rates of basal DNA damage (Viteri et al., 2010). These findings are likely connected, as a reduction in ROS by exposure to N-acetylcysteine reduces double-strand break formation. These alterations lead, in part, to enhanced activation of DNA damage response pathways, including enhanced ATM and RAD3-associated foci, which may adversely affect cell proliferation. An interesting and unusual feature of HGPS cells is the persistence of basal levels of γH2AX foci marking double-strand breaks that also stain positive for xeroderma pigmentosum group A protein (XPA) (Liu et al., 2008), a component of nucleotide excision repair. No other related factors are upregulated, suggesting that the foci have an abnormal set of repair proteins and that the type of DNA damage in HGPS cells may have unique features. Cells from mice lacking Zmpste24 also exhibit a significant delay in recruitment of 53BP1 to sites of DNA repair after induction of double-strand breaks (Liu et al., 2005). p53 targets such as GADD45, p21, and ATF3 were also elevated, and deletion of p53 was sufficient to rescue some of the progeroid phenotypes of the Zmpste24-/- mouse (Varela et al., 2005). Though p53 targets were not elevated in HGPS fibroblasts, inactivation of the transcription factor was sufficient to suppress premature senescence (Kudlow et al., 2008).

More recent data indicate that ATM and NEMO pathways become activated and promote NF-κB-dependent inflammation in both Zmpste24-/- and LmnaG609G/G609G mice (Osorio et al., 2012). Genetic and pharmacological inhibition of these pathways slows progeroid pathology and enhances survival. These findings are particularly interesting because they suggest that (1) NF-κB inhibitors may be effective therapeutic agents and (2) enhanced inflammation may be a major driver of normal aging processes. Furthermore, the tissues affected by altered lamin A processing have remained unresolved. Progeria involves systemic pathology, and one possibility is that defects in every tissue cause cell-autonomous phenotypes. More likely, defects in a smaller set of tissues lead to systemic responses that impact the whole organism. Enhanced NF-κB signaling could mediate such a systemic effect.

A more straightforward approach to understanding the role of A-type lamins in DNA damage responses may involve loss-of-function studies. In contrast to progeroid models, loss of A-type lamins leads to 53BP1 degradation by the proteasome (Gonzalez-Suarez et al., 2009). In its absence, repair of double-strand breaks proceeds more slowly, hindering effective nonhomologous end joining (Redwood et al., 2011).
Homologous recombination is also compromised through a transcriptional mechanism by which enhanced proteasome-dependent degradation of pRb and p107 leads to repression of RAD51 and BRCA1 (Redwood et al., 2011). It remains unclear why enhanced protein turnover of pRb and 53BP1 occurs in the absence of A-type lamins, but the hypothesis has been put forward that A-type lamins may have a general role in promoting the stability of several nuclear regulatory factors by keeping proteasome-dependent degradation in check (Parnaik et al., 2011). It should also be noted that many of these properties may explain why loss of A-type lamin expression could have tumor-promoting properties.

In addition to impaired DNA damage response pathways, telomere dysregulation may also contribute to progeroid pathology. In culture, HGPS fibroblasts experience faster telomere shortening, and normal fibroblasts expressing progerin recapitulate this phenotype and also show enhanced formation of signal-free ends (Decker et al., 2009). Enhanced telomere attrition may contribute to proliferation defects and early senescence, as telomerase expression restores both properties in fibroblasts (Benson et al., 2010; Kudlow et al., 2008). One role of telomerase may be to enhance resolution of DNA damage foci, which were found to localize in regions near telomeres (Benson et al., 2010). The mechanisms by which this might occur and the extent to which altered telomere dynamics promote progeroid pathology remain to be determined.
Cell 152, March 14, 2013 2013 Elsevier Inc. 1371

Given that HGPS (and other laminopathies) primarily affect tissues of mesenchymal origin, mesenchymal stem cells may be a major site of progerin-induced dysfunction. Gene expression profiling in fibroblasts expressing progerin provides support for this assertion, as Notch signaling was found to be highly enhanced (Scaffidi and Misteli, 2008). Elevated Notch activity associated with progerin expression was found to promote expression of a range of differentiation markers in human mesenchymal stem cells. As a possible mechanism, progerin was found to disrupt nuclear matrix association of SKIP, a coactivator of Notch target genes, leading to its release into the nucleoplasm and activation of targets. Reduced mesenchymal stem cell function could promote a subset of progeroid phenotypes in vivo, but this remains to be tested.

The Wnt/β-catenin pathway is altered in a variety of laminopathies as well. In both Zmpste24-/- and HGPS mice, reduced β-catenin levels were detected, and cell proliferation defects could be rescued by inhibition of Gsk-3β, leading to β-catenin stabilization (Espada et al., 2008; Hernandez et al., 2010). Of note, the Wnt pathway may also be disrupted in mice lacking emerin (Markiewicz et al., 2006; Tilgner et al., 2009). Given that the Wnt pathway may have critical roles in maintaining adult stem cell function with age, the role of this pathway in laminopathies requires further investigation.

Adult stem cells may also be impaired in progeroid laminopathies due to impaired SIRT1 function. A recent study has demonstrated that A-type lamins interact with this protein deacetylase and that unprocessed lamin A disrupts the association in Zmpste24-/- cells, leading to reduced deacetylase activity and rapid in vivo stem cell depletion (Liu et al., 2012).
Treatment of mice with resveratrol restores SIRT1 activity, reduces the pathology, and extends survival, indicating that enhancing SIRT1 activity may be another therapeutic approach in progeroid disorders associated with LMNA mutation.

Conclusions

Since the identification, in 1999, of diseases caused by mutations in genes encoding nuclear lamina proteins, research has been dedicated to understanding the molecular mechanisms leading to these specific phenotypes. Understanding how the nuclear lamina interacts with structural proteins, chromatin, transcription factors, and other signaling partners will likely reveal mechanistic links to disease. At this moment, the puzzle is starting to come together, but the overall picture of how lamins regulate all of these pathways and how this regulation leads to disease is still developing. Understanding the mechanisms by which mutations in lamins cause these rare diseases will provide molecular insight into other common conditions that laminopathies model, such as muscle diseases and cardiomyopathy. Additionally, because mutations in the nuclear lamina result in rapid aging-like disease, defining the role of the nuclear lamina in regulating normal human longevity will be of great importance.
ACKNOWLEDGMENTS

The authors would like to apologize to those scientists whose studies were not referenced due to space limitations and also acknowledge the editorial contributions of Juniper Pennypacker. Lamin-related research in the lab of B.K.K. is supported by a grant from the National Institute on Aging (R01 AG024287). K.H.S. is supported by a Ruth L. Kirschstein NRSA Postdoctoral Fellowship. B.K.K. is an Ellison Medical Foundation Senior Scholar in Aging.

REFERENCES Agarwal, A.K., Fryns, J.P., Auchus, R.J., and Garg, A. (2003). Zinc metalloproteinase, ZMPSTE24, is mutated in mandibuloacral dysplasia. Hum. Mol. Genet. 12, 19952001. ` ne, E., Arimura, T., Helbling-Leclerc, A., Massart, C., Varnous, S., Niel, F., Lace Fromes, Y., Toussaint, M., Mura, A.M., Keller, D.I., et al. (2005). Mouse model carrying H222P-Lmna mutation develops muscular dystrophy and dilated cardiomyopathy similar to human striated muscle laminopathies. Hum. Mol. Genet. 14, 155169. Attali, R., Warwar, N., Israel, A., Gurt, I., McNally, E., Puckelwartz, M., Glick, B., Nevo, Y., Ben-Neriah, Z., and Melki, J. (2009). Mutation of SYNE-1, encoding an essential component of the nuclear lamina, is responsible for autosomal recessive arthrogryposis. Hum. Mol. Genet. 18, 34623469. Benson, E.K., Lee, S.W., and Aaronson, S.A. (2010). Role of progerin-induced telomere dysfunction in HGPS premature cellular senescence. J. Cell Sci. 123, 26052612. Best, S., Salvati, F., Kallo, J., Garner, C., Height, S., Thein, S.L., and Rees, D.C. t anomaly. Br. J. Haematol. (2003). Lamin B-receptor mutations in Pelger-Hue 123, 542544. Bione, S., Maestrini, E., Rivella, S., Mancini, M., Regis, S., Romeo, G., and Toniolo, D. (1994). Identication of a novel X-linked gene responsible for Emery-Dreifuss muscular dystrophy. Nat. Genet. 8, 323327. cane, H.M., Hammouda, E.H., Bonne, G., Di Barletta, M.R., Varnous, S., Be Merlini, L., Muntoni, F., Greenberg, C.R., Gary, F., Urtizberea, J.A., et al. (1999). Mutations in the gene encoding lamin A/C cause autosomal dominant Emery-Dreifuss muscular dystrophy. Nat. Genet. 21, 285288. Broers, J.L., Raymond, Y., Rot, M.K., Kuijpers, H., Wagenaar, S.S., and Ramaekers, F.C. (1993). Nuclear A-type lamins are differentially expressed in human lung cancer subtypes. Am. J. Pathol. 143, 211220. Butin-Israeli, V., Adam, S.A., Goldman, A.E., and Goldman, R.D. (2012). Nuclear lamin functions and disease. Trends Genet. 28, 464471. 
Cao, H., and Hegele, R.A. (2003). LMNA is mutated in Hutchinson-Gilford progeria (MIM 176670) but not in Wiedemann-Rautenstrauch progeroid syndrome (MIM 264090). J. Hum. Genet. 48, 271274. Cao, K., Graziotto, J.J., Blair, C.D., Mazzulli, J.R., Erdos, M.R., Krainc, D., and Collins, F.S. (2011). Rapamycin reverses cellular phenotypes and enhances mutant protein clearance in Hutchinson-Gilford progeria syndrome cells. Sci. Transl. Med. 3, 89ra58. Capell, B.C., Olive, M., Erdos, M.R., Cao, K., Faddah, D.A., Tavarez, U.L., Conneely, K.N., Qu, X., San, H., Ganesh, S.K., et al. (2008). A farnesyltransferase inhibitor prevents both the onset and late progression of cardiovascular disease in a progeria mouse model. Proc. Natl. Acad. Sci. USA 105, 15902 15907. ` res, O., Cohen, Caux, F., Dubosclard, E., Lascols, O., Buendia, B., Chazouille A., Courvalin, J.-C., Laroche, L., Capeau, J., Vigouroux, C., and ChristinMaitre, S. (2003). A new clinical condition linked to a novel mutation in lamins A and C with generalized lipoatrophy, insulin-resistant diabetes, disseminated leukomelanodermic papules, liver steatosis, and cardiomyopathy. J. Clin. Endocrinol. Metab. 88, 10061013. Chen, L., Lee, L., Kudlow, B.A., Dos Santos, H.G., Sletvold, O., Shafeghati, Y., Botha, E.G., Garg, A., Hanson, N.B., Martin, G.M., et al. (2003). LMNA mutations in atypical Werners syndrome. Lancet 362, 440445. Chen, C.Y., Chi, Y.H., Mutalif, R.A., Starost, M.F., Myers, T.G., Anderson, S.A., Stewart, C.L., and Jeang, K.T. (2012a). Accumulation of the inner nuclear envelope protein Sun1 is pathogenic in progeric and dystrophic laminopathies. Cell 149, 565577.


Chen, S.C., Kennedy, B.K., and Lampe, P.D. (2012b). Phosphorylation of connexin43 on S279/282 may contribute to laminopathy-associated conduction defects. Exp. Cell Res. Published online December 21, 2012. http://dx.doi. org/10.1016/j.yexcr.2012.12.014. Choi, J.C., Muchir, A., Wu, W., Iwata, S., Homma, S., Morrow, J.P., and Worman, H.J. (2012). Temsirolimus activates autophagy and ameliorates cardiomyopathy caused by lamin A/C gene mutation. Sci. Transl. Med. 4, 144ra102. Davies, B.S., Barnes, R.H., 2nd, Tu, Y., Ren, S., Andres, D.A., Spielmann, H.P., Lammerding, J., Wang, Y., Young, S.G., and Fong, L.G. (2010). An accumulation of non-farnesylated prelamin A causes cardiomyopathy but not progeria. Hum. Mol. Genet. 19, 26822694. Davies, B.S., Cofnier, C., Yang, S.H., Barnes, R.H., 2nd, Jung, H.J., Young, S.G., and Fong, L.G. (2011). Investigating the purpose of prelamin A processing. Nucleus 2, 49. De Sandre-Giovannoli, A., Chaouch, M., Kozlov, S., Vallat, J.M., Tazir, M., Kassouri, N., Szepetowski, P., Hammadouche, T., Vandenberghe, A., Stewart, C.L., et al. (2002). Homozygous defects in LMNA, encoding lamin A/C nuclearenvelope proteins, cause autosomal recessive axonal neuropathy in human (Charcot-Marie-Tooth disorder type 2) and mouse. Am. J. Hum. Genet. 70, 726736. De Sandre-Giovannoli, A., Bernard, R., Cau, P., Navarro, C., Amiel, J., Boccaccio, I., Lyonnet, S., Stewart, C.L., Munnich, A., Le Merrer, M., and vy, N. (2003). Lamin a truncation in Hutchinson-Gilford progeria. Science Le 300, 2055. Decker, M.L., Chavez, E., Vulto, I., and Lansdorp, P.M. (2009). Telomere length in Hutchinson-Gilford progeria syndrome. Mech. Ageing Dev. 130, 377383. Eriksson, M., Brown, W.T., Gordon, L.B., Glynn, M.W., Singer, J., Scott, L., Erdos, M.R., Robbins, C.M., Moses, T.Y., Berglund, P., et al. (2003). Recurrent de novo point mutations in lamin A cause Hutchinson-Gilford progeria syndrome. Nature 423, 293298. 
s, A.M., anos, J., Penda Espada, J., Varela, I., Flores, I., Ugalde, A.P., Cadin pez-Ot n, C. Stewart, C.L., Tryggvason, K., Blasco, M.A., Freije, J.M., and Lo (2008). Nuclear envelope defects cause stem cell dysfunction in prematureaging mice. J. Cell Biol. 181, 2735. Fatkin, D., MacRae, C., Sasaki, T., Wolff, M.R., Porcu, M., Frenneaux, M., Atherton, J., Vidaillet, H.J.J., Jr., Spudich, S., De Girolami, U., et al. (1999). Missense mutations in the rod domain of the lamin A/C gene as causes of dilated cardiomyopathy and conduction-system disease. N. Engl. J. Med. 341, 17151724. , N., Yang, S.H., Stewart, C.L., Sullivan, T., Fong, L.G., Ng, J.K., Meta, M., Cote Burghardt, A., Majumdar, S., Reue, K., et al. (2004). Heterozygosity for Lmna deciency eliminates the progeria-like phenotypes in Zmpste24-decient mice. Proc. Natl. Acad. Sci. USA 101, 1811118116. Friend, S.H., Bernards, R., Rogelj, S., Weinberg, R.A., Rapaport, J.M., Albert, D.M., and Dryja, T.P. (1986). A human DNA segment with properties of the gene that predisposes to retinoblastoma and osteosarcoma. Nature 323, 643646. Frock, R.L., Kudlow, B.A., Evans, A.M., Jameson, S.A., Hauschka, S.D., and Kennedy, B.K. (2006). Lamin A/C and emerin are critical for skeletal muscle satellite cell differentiation. Genes Dev. 20, 486500. Frock, R.L., Chen, S.C., Da, D.F., Frett, E., Lau, C., Brown, C., Pak, D.N., Wang, Y., Muchir, A., Worman, H.J., et al. (2012). Cardiomyocyte-specic expression of lamin a improves cardiac function in Lmna-/- mice. PLoS ONE 7, e42918. Gerace, L., and Huber, M.D. (2012). Nuclear lamina at the crossroads of the cytoplasm and nucleus. J. Struct. Biol. 177, 2431. Gerace, L., Blum, A., and Blobel, G. (1978). Immunocytochemical localization of the major polypeptides of the nuclear pore complex-lamina fraction. Interphase and mitotic distribution. J. Cell Biol. 79, 546566. lez, J.M., Navarro-Puche, A., Casar, B., Crespo, P., and Andre s, V. Gonza (2008). 
Fast regulation of AP-1 activity through interaction of lamin A/C, ERK1/2, and c-Fos at the nuclear envelope. J. Cell Biol. 183, 653–666.

Gonzalez-Suarez, I., Redwood, A.B., Perkins, S.M., Vermolen, B., Lichtensztejin, D., Grotsky, D.A., Morgado-Palacin, L., Gapud, E.J., Sleckman, B.P., Sullivan, T., et al. (2009). Novel roles for A-type lamins in telomere biology and the DNA damage response pathway. EMBO J. 28, 24142427. Gordon, L.B., Kleinman, M.E., Miller, D.T., Neuberg, D.S., Giobbie-Hurder, A., Gerhard-Herman, M., Smoot, L.B., Gordon, C.M., Cleveland, R., Snyder, B.D., et al. (2012). Clinical trial of a farnesyltransferase inhibitor in children with Hutchinson-Gilford progeria syndrome. Proc. Natl. Acad. Sci. USA 109, 1666616671. , N., Dion, P., Fox, M.A., Laurent, S., Verreault, S., Sanes, Gros-Louis, F., Dupre J.R., Bouchard, J.P., and Rouleau, G.A. (2007). Mutations in SYNE1 lead to a newly discovered form of autosomal recessive cerebellar ataxia. Nat. Genet. 39, 8085. Hegele, R.A., Cao, H., Liu, D.M., Costain, G.A., Charlton-Menys, V., Rodger, N.W., and Durrington, P.N. (2006). Sequencing of the reannotated LMNB2 gene reveals novel mutations in patients with acquired partial lipodystrophy. Am. J. Hum. Genet. 79, 383389. Hellemans, J., Preobrazhenska, O., Willaert, A., Debeer, P., Verdonk, P.C., Costa, T., Janssens, K., Menten, B., Van Roy, N., Vermeulen, S.J., et al. (2004). Loss-of-function mutations in LEMD3 result in osteopoikilosis, Buschke-Ollendorff syndrome and melorheostosis. Nat. Genet. 36, 1213 1218. Hennekes, H., and Nigg, E.A. (1994). The role of isoprenylation in membrane attachment of nuclear lamins. A single point mutation prevents proteolytic cleavage of the lamin A precursor and confers membrane binding properties. J. Cell Sci. 107, 10191029. Hernandez, L., Roux, K.J., Wong, E.S., Mounkes, L.C., Mutalif, R., Navasankari, R., Rai, B., Cool, S., Jeong, J.W., Wang, H., et al. (2010). Functional coupling between the extracellular matrix and nuclear lamina by Wnt signaling in progeria. Dev. Cell 19, 413425. 
Hoffmann, K., Dreger, C.K., Olins, A.L., Olins, D.E., Shultz, L.D., Lucke, B., , A., et al. (2002). Mutations in the gene ller, D., Vaya Karl, H., Kaps, R., Mu encoding the lamin B receptor produce an altered nuclear morphology in t anomaly). Nat. Genet. 31, 410414. granulocytes (Pelger-Hue lez, J.M., Sanz-Gonza lez, S.M., AlvarezIvorra, C., Kubicek, M., Gonza s, V. (2006). A mechaBarrientos, A., OConnor, J.E., Burke, B., and Andre nism of AP-1 suppression through interaction of c-Fos with lamin A/C. Genes Dev. 20, 307320. lzer, M., Heilmann, C.J., de Koster, C.G., Jahn, D., Schramm, S., Schno tz, W., Benavente, R., and Alsheimer, M. (2012). A truncated lamin A in Schu the Lmna -/- mouse line: implications for the understanding of laminopathies. Nucleus 3, 463474. Johnson, B.R., Nitta, R.T., Frock, R.L., Mounkes, L., Barbie, D.A., Stewart, C.L., Harlow, E., and Kennedy, B.K. (2004). A-type lamins regulate retinoblastoma protein function by promoting subnuclear localization and preventing proteasomal degradation. Proc. Natl. Acad. Sci. USA 101, 96779682. Kaufmann, S.H. (1992). Expression of nuclear envelope lamins A and C in human myeloid leukemias. Cancer Res. 52, 28472853. Kubben, N., Voncken, J.W., Konings, G., van Weeghel, M., van den Hoogenhof, M.M., Gijbels, M., van Erk, A., Schoonderwoerd, K., van den Bosch, B., Dahlmans, V., et al. (2011). Post-natal myogenic and adipogenic developmental: defects and metabolic impairment upon loss of A-type lamins. Nucleus 2, 195207. Kudlow, B.A., Stanfel, M.N., Burtner, C.R., Johnston, E.D., and Kennedy, B.K. (2008). Suppression of proliferative defects associated with processingdefective lamin A mutants by hTERT or inactivation of p53. Mol. Biol. Cell 19, 52385248. Lammerding, J., Schulze, P.C., Takahashi, T., Kozlov, S., Sullivan, T., Kamm, R.D., Stewart, C.L., and Lee, R.T. (2004). Lamin A/C deciency causes defective nuclear mechanics and mechanotransduction. J. Clin. Invest. 113, 370378.


Lammerding, J., Hsiao, J., Schulze, P.C., Kozlov, S., Stewart, C.L., and Lee, R.T. (2005). Abnormal nuclear shape and impaired mechanotransduction in emerin-decient cells. J. Cell Biol. 170, 781791. Lee, R., Chang, S.Y., Trinh, H., Tu, Y., White, A.C., Davies, B.S., Bergo, M.O., Fong, L.G., Lowry, W.E., and Young, S.G. (2010). Genetic studies on the functional relevance of the protein prenyltransferases in skin keratinocytes. Hum. Mol. Genet. 19, 16031617. Liang, W.C., Mitsuhashi, H., Keduka, E., Nonaka, I., Noguchi, S., Nishino, I., and Hayashi, Y.K. (2011). TMEM43 mutations in Emery-Dreifuss muscular dystrophy-related myopathy. Ann. Neurol. 69, 10051013. Liu, B., Wang, J., Chan, K.M., Tjia, W.M., Deng, W., Guan, X., Huang, J.D., Li, K.M., Chau, P.Y., Chen, D.J., et al. (2005). Genomic instability in laminopathybased premature aging. Nat. Med. 11, 780785. Liu, Y., Wang, Y., Rusinol, A.E., Sinensky, M.S., Liu, J., Shell, S.M., and Zou, Y. (2008). Involvement of xeroderma pigmentosum group A (XPA) in progeria arising from defective maturation of prelamin A. FASEB J. 22, 603611. Liu, B., Ghosh, S., Yang, X., Zheng, H., Liu, X., Wang, Z., Jin, G., Zheng, B., Kennedy, B.K., Suh, Y., et al. (2012). Resveratrol rescues SIRT1-dependent adult stem cell decline and alleviates progeroid features in laminopathy-based progeria. Cell Metab. 16, 738750. Markiewicz, E., Tilgner, K., Barker, N., van de Wetering, M., Clevers, H., Dorobek, M., Hausmanowa-Petrusewicz, I., Ramaekers, F.C., Broers, J.L., Blankesteijn, W.M., et al. (2006). The inner nuclear membrane protein emerin regulates beta-catenin activity by restricting its accumulation in the nucleus. EMBO J. 25, 32753285. jat, A., and Misteli, T. (2010). LINC complexes in health and disease. Me Nucleus 1, 4052. ndez-Lo pez, I., and Worman, H.J. (2012). Inner nuclear membrane proteins: Me impact on human disease. Chromosoma 121, 153167. Mounkes, L.C., Kozlov, S.V., Rottman, J.N., and Stewart, C.L. (2005). 
Expression of an LMNA-N195K variant of A-type lamins results in cardiac conduction defects and death in mice. Hum. Mol. Genet. 14, 21672180. Muchir, A., Bonne, G., van der Kooi, A.J., van Meegen, M., Baas, F., Bolhuis, P.A., de Visser, M., and Schwartz, K. (2000). Identication of mutations in the gene encoding lamins A/C in autosomal dominant limb girdle muscular dystrophy with atrioventricular conduction disturbances (LGMD1B). Hum. Mol. Genet. 9, 14531459. Muchir, A., Pavlidis, P., Bonne, G., Hayashi, Y.K., and Worman, H.J. (2007a). Activation of MAPK in hearts of EMD null mice: similarities between mouse models of X-linked and autosomal dominant Emery Dreifuss muscular dystrophy. Hum. Mol. Genet. 16, 18841895. Muchir, A., Pavlidis, P., Decostre, V., Herron, A.J., Arimura, T., Bonne, G., and Worman, H.J. (2007b). Activation of MAPK pathways links LMNA mutations to cardiomyopathy in Emery-Dreifuss muscular dystrophy. J. Clin. Invest. 117, 12821293. Muchir, A., Shan, J., Bonne, G., Lehnart, S.E., and Worman, H.J. (2009a). Inhibition of extracellular signal-regulated kinase signaling to prevent cardiomyopathy caused by mutation in the gene encoding A-type lamins. Hum. Mol. Genet. 18, 241247. Muchir, A., Wu, W., and Worman, H.J. (2009b). Reduced expression of A-type lamins and emerin activates extracellular signal-regulated kinase in cultured cells. Biochim. Biophys. Acta 1792, 7581. Muchir, A., Wu, W., Choi, J.C., Iwata, S., Morrow, J., Homma, S., and Worman, H.J. (2012). Abnormal p38a mitogen-activated protein kinase signaling in dilated cardiomyopathy caused by lamin A/C gene mutation. Hum. Mol. Genet. 21, 43254333. Navarro, C.L., De Sandre-Giovannoli, A., Bernard, R., Boccaccio, I., Boyer, A., ` ve, D., Hadj-Rabia, S., Gaudy-Marqueste, C., Smitt, H.S., Vabres, P., Genevie et al. (2004). Lamin A and ZMPSTE24 (FACE-1) defects cause nuclear disorganization and identify restrictive dermopathy as a lethal neonatal laminopathy. Hum. Mol. Genet. 13, 24932503. 
Navarro, C.L., Cadiñanos, J., De Sandre-Giovannoli, A., Bernard, R., Courrier, S., Boccaccio, I., Boyer, A., Kleijer, W.J., Wagner, A., Giuliano, F., et al. (2005).

Loss of ZMPSTE24 (FACE-1) causes autosomal recessive restrictive dermopathy and accumulation of Lamin A precursors. Hum. Mol. Genet. 14, 1503 1513. Nikolova, V., Leimena, C., McMahon, A.C., Tan, J.C., Chandar, S., Jogia, D., Kesteven, S.H., Michalicek, J., Otway, R., Verheyen, F., et al. (2004). Defects in nuclear structure and function promote dilated cardiomyopathy in lamin A/C-decient mice. J. Clin. Invest. 113, 357369. Nitta, R.T., Jameson, S.A., Kudlow, B.A., Conlan, L.A., and Kennedy, B.K. (2006). Stabilization of the retinoblastoma protein by A-type nuclear lamins is required for INK4A-mediated cell cycle arrest. Mol. Cell. Biol. 26, 53605372. Novelli, G., Muchir, A., Sangiuolo, F., Helbling-Leclerc, A., DApice, M.R., Massart, C., Capon, F., Sbraccia, P., Federici, M., Lauro, R., et al. (2002). Mandibuloacral dysplasia is caused by a mutation in LMNA-encoding lamin A/C. Am. J. Hum. Genet. 71, 426431. Osorio, F.G., Navarro, C.L., Cadinanos, J., Lopez-Mejia, I.C., Quiros, P.M., Bartoli, C., Rivera, J., Tazi, J., Guzman, G., Varela, I., et al. (2011). Splicingdirected therapy in a new mouse model of human accelerated aging. Sci. Transl. Med. 3, 106ra107. rcena, C., Soria-Valles, C., Ramsay, A.J., de Carlos, F., Cobo, Osorio, F.G., Ba pez-Ot n, C. (2012). Nuclear lamina defects J., Fueyo, A., Freije, J.M., and Lo cause ATM-dependent NF-kB activation and link accelerated aging to a systemic inammatory response. Genes Dev. 26, 23112324. Padiath, Q.S., Saigoh, K., Schiffmann, R., Asahara, H., Yamada, T., Koeppen, cek, L.J., and Fu, Y.H. (2006). Lamin B1 duplications cause A., Hogan, K., Pta autosomal dominant leukodystrophy. Nat. Genet. 38, 11141123. Parnaik, V.K., Chaturvedi, P., and Muralikrishna, B. (2011). Lamins, laminopathies and disease mechanisms: possible role for proteasomal degradation of key regulatory proteins. J. Biosci. 36, 471479. Piercy, R.J., Zhou, H., Feng, L., Pombo, A., Muntoni, F., and Brown, S.C. (2007). 
Desmin immunolocalisation in autosomal dominant Emery-Dreifuss muscular dystrophy. Neuromuscul. Disord. 17, 297305. Puckelwartz, M., and McNally, E.M. (2011). Emery-Dreifuss muscular dystrophy. Handb. Clin. Neurol. 101, 155166. anos, J., Fraile, Puente, X.S., Quesada, V., Osorio, F.G., Cabanillas, R., Cadin n rrez-Ferna ndez, A., Fanjul-Ferna ndez, ez, G.R., Puente, D.A., Gutie J.M., Ordo M., et al. (2011). Exome sequencing and functional analysis identies BANF1 mutation as the cause of a hereditary progeroid syndrome. Am. J. Hum. Genet. 88, 650656. nnemann, C.G., Jeannet, P.Y., Colomer, J., Quijano-Roy, S., Mbieleu, B., Bo Clarke, N.F., Cuisset, J.M., Roper, H., De Meirleir, L., DAmico, A., et al. (2008). De novo LMNA mutations cause a new form of congenital muscular dystrophy. Ann. Neurol. 64, 177186. Ramos, F.J., Chen, S.C., Garelick, M.G., Dai, D.F., Liao, C.Y., Schreiber, K.H., Mackay, V.L., An, E.H., Strong, R., Ladiges, W.C., et al. (2012). Rapamycin reverses elevated mTORC1 signaling in lamin A/C-decient mice, rescues cardiac and skeletal muscle function, and extends survival. Sci. Transl. Med. 4, 144ra103. Redwood, A.B., Perkins, S.M., Vanderwaal, R.P., Feng, Z., Biehl, K.J., Gonzalez-Suarez, I., Morgado-Palacin, L., Shi, W., Sage, J., Roti-Roti, J.L., et al. (2011). A dual role for A-type lamins in DNA double-strand break repair. Cell Cycle 10, 25492560. Renou, L., Stora, S., Yaou, R.B., Volk, M., Sinkovec, M., Demay, L., Richard, P., Peterlin, B., and Bonne, G. (2008). Heart-hand syndrome of Slovenian type: a new kind of laminopathy. J. Med. Genet. 45, 666671. ber, R.A., Weber, K., and Osborn, M. (1989). Differential timing of nuclear Ro lamin A/C expression in the various organs of the mouse embryo and the young animal: a developmental study. Development 105, 365378. lez, J.M., Casar, B., Andre s, V., and Crespo, P. guez, J., Calvo, F., Gonza Rodr (2010). 
ERK1/2 MAP kinases promote cell cycle entry by rapid, kinase-independent disruption of retinoblastoma-lamin A complexes. J. Cell Biol. 191, 967–979. Scaffidi, P., and Misteli, T. (2008). Lamin A-dependent misregulation of adult stem cells associated with accelerated ageing. Nat. Cell Biol. 10, 452–459.


Shackleton, S., Lloyd, D.J., Jackson, S.N., Evans, R., Niermeijer, M.F., Singh, B.M., Schmidt, H., Brabant, G., Kumar, S., Durrington, P.N., et al. (2000). LMNA, encoding lamin A/C, is mutated in partial lipodystrophy. Nat. Genet. 24, 153–156.

Shalev, S.A., De Sandre-Giovannoli, A., Shani, A.A., and Levy, N. (2007). An association of Hutchinson-Gilford progeria and malignancy. Am. J. Med. Genet. A. 143A, 1821–1826.

Sullivan, T., Escalante-Alcalde, D., Bhatt, H., Anver, M., Bhat, N., Nagashima, K., Stewart, C.L., and Burke, B. (1999). Loss of A-type lamin expression compromises nuclear envelope integrity leading to muscular dystrophy. J. Cell Biol. 147, 913–920.

Sun, L.P., Wang, L., Wang, H., Zhang, Y.H., and Pu, J.L. (2010). Connexin 43 remodeling induced by LMNA gene mutation Glu82Lys in familial dilated cardiomyopathy with atrial ventricular block. Chin. Med. J. (Engl.) 123, 1058–1062.

Taylor, M.R., Slavov, D., Gajewski, A., Vlcek, S., Ku, L., Fain, P.R., Carniel, E., Di Lenarda, A., Sinagra, G., Boucek, M.M., et al.; Familial Cardiomyopathy Registry Research Group. (2005). Thymopoietin (lamina-associated polypeptide 2) gene mutation associated with dilated cardiomyopathy. Hum. Mutat. 26, 566–574.

Tilgner, K., Wojciechowicz, K., Jahoda, C., Hutchison, C., and Markiewicz, E. (2009). Dynamic complexes of A-type lamins and emerin influence adipogenic capacity of the cell via nucleocytoplasmic distribution of beta-catenin. J. Cell Sci. 122, 401–413.

van Engelen, B.G., Muchir, A., Hutchison, C.J., van der Kooi, A.J., Bonne, G., and Lammens, M. (2005). The lethal phenotype of a homozygous nonsense mutation in the lamin A/C gene. Neurology 64, 374–376.

Varela, I., Cadiñanos, J., Pendás, A.M., Gutiérrez-Fernández, A., Folgueras, A.R., Sánchez, L.M., Zhou, Z., Rodríguez, F.J., Stewart, C.L., Vega, J.A., et al. (2005). Accelerated ageing in mice deficient in Zmpste24 protease is linked to p53 signalling activation. Nature 437, 564–568.

Varela, I., Pereira, S., Ugalde, A.P., Navarro, C.L., Suárez, M.F., Cau, P., Cadiñanos, J., Osorio, F.G., Foray, N., Cobo, J., et al. (2008). Combined treatment with statins and aminobisphosphonates extends longevity in a mouse model of human premature aging. Nat. Med. 14, 767–772.

Varga, R., Eriksson, M., Erdos, M.R., Olive, M., Harten, I., Kolodgie, F., Capell, B.C., Cheng, J., Faddah, D., Perkins, S., et al. (2006). Progressive vascular smooth muscle cell defects in a mouse model of Hutchinson-Gilford progeria syndrome. Proc. Natl. Acad. Sci. USA 103, 3250–3255.

Vaughan, A., Alvarez-Reyes, M., Bridger, J.M., Broers, J.L., Ramaekers, F.C., Wehnert, M., Morris, G.E., Hutchison, C.J., and Hutchison, C.J.; Whitfield WGF. (2001). Both emerin and lamin C depend on lamin A for localization at the nuclear envelope. J. Cell Sci. 114, 2577–2590.

Verstraeten, V.L., Broers, J.L., van Steensel, M.A., Zinn-Justin, S., Ramaekers, F.C., Steijlen, P.M., Kamps, M., Kuijpers, H.J., Merckx, D., Smeets, H.J., et al. (2006). Compound heterozygosity for mutations in LMNA causes a progeria syndrome without prelamin A accumulation. Hum. Mol. Genet. 15, 2509–2522.

Viteri, G., Chung, Y.W., and Stadtman, E.R. (2010). Effect of progerin on the accumulation of oxidized proteins in fibroblasts from Hutchinson Gilford progeria patients. Mech. Ageing Dev. 131, 2–8.

Waterham, H.R., Koster, J., Mooyer, P., Noort Gv, Gv., Kelley, R.I., Wilcox, W.R., Wanders, R.J., Hennekam, R.C., and Oosterwijk, J.C. (2003). Autosomal recessive HEM/Greenberg skeletal dysplasia is caused by 3 beta-hydroxysterol delta 14-reductase deficiency due to mutations in the lamin B receptor gene. Am. J. Hum. Genet. 72, 1013–1017.

Weber, K., Plessmann, U., and Traub, P. (1989). Maturation of nuclear lamin A involves a specific carboxy-terminal trimming, which removes the polyisoprenylation site from the precursor; implications for the structure of the nuclear lamina. FEBS Lett. 257, 411–414.

Wu, W., Shan, J., Bonne, G., Worman, H.J., and Muchir, A. (2010). Pharmacological inhibition of c-Jun N-terminal kinase signaling prevents cardiomyopathy caused by mutation in LMNA gene. Biochim. Biophys. Acta 1802, 632–638.

Yang, S.H., Andres, D.A., Spielmann, H.P., Young, S.G., and Fong, L.G. (2008a). Progerin elicits disease phenotypes of progeria in mice whether or not it is farnesylated. J. Clin. Invest. 118, 3291–3300.

Yang, S.H., Qiao, X., Fong, L.G., and Young, S.G. (2008b). Treatment with a farnesyltransferase inhibitor improves survival in mice with a Hutchinson-Gilford progeria syndrome mutation. Biochim. Biophys. Acta 1781, 36–39.

Yang, S.H., Chang, S.Y., Ren, S., Wang, Y., Andres, D.A., Spielmann, H.P., Fong, L.G., and Young, S.G. (2011). Absence of progeria-like disease phenotypes in knock-in mice expressing a non-farnesylated version of progerin. Hum. Mol. Genet. 20, 436–444.

Young, S.G., Meta, M., Yang, S.H., and Fong, L.G. (2006). Prelamin A farnesylation and progeroid syndromes. J. Biol. Chem. 281, 39741–39745.

Zhang, Q., Bethmann, C., Worth, N.F., Davies, J.D., Wasner, C., Feuer, A., Ragnauth, C.D., Yi, Q., Mellad, J.A., Warren, D.T., et al. (2007). Nesprin-1 and -2 are involved in the pathogenesis of Emery Dreifuss muscular dystrophy and are critical for nuclear envelope integrity. Hum. Mol. Genet. 16, 2816–2833.

Zhang, H., Kieckhaefer, J.E., and Cao, K. (2013). Mouse models of laminopathies. Aging Cell 12, 2–10. Published online November 26, 2012. http://dx.doi.org/10.1111/acel.12021.


Review
Nuclear Positioning
Gregg G. Gundersen1,* and Howard J. Worman1,2,*
1Department of Pathology and Cell Biology
2Department of Medicine
College of Physicians and Surgeons, Columbia University, 630 West 168th Street, New York, NY 10032, USA
*Correspondence: ggg1@columbia.edu (G.G.G.), hjw14@columbia.edu (H.J.W.)
http://dx.doi.org/10.1016/j.cell.2013.02.031

Leading Edge

The nucleus is the largest organelle and is commonly depicted in the center of the cell. Yet during cell division, migration, and differentiation, it frequently moves to an asymmetric position aligned with cell function. We consider the toolbox of proteins that move and anchor the nucleus within the cell and how forces generated by the cytoskeleton are coupled to the nucleus to move it. The significance of proper nuclear positioning is underscored by numerous diseases resulting from genetic alterations in the toolbox proteins. Finally, we discuss how nuclear position may influence cellular organization and signaling pathways.

Introduction

Diagrams in biology textbooks usually depict the nucleus as a spheroid in the center of the cell. However, the position of nuclei varies dramatically from this simple view. Nuclei are frequently positioned asymmetrically depending on cell type, stage of the cell cycle, migratory state, and differentiation status. For example, during cell division in budding yeast, nuclei are moved into the bud neck so that each daughter cell receives one (Figure 1A). Nuclei are actively positioned in the middle of the fission yeast S. pombe, ensuring that the division plane produces two equal daughter cells. In fertilized mammalian and invertebrate eggs, male and female pronuclei move toward each other and fuse near the middle of the zygote, ensuring that the ensuing cell division creates two equal daughter blastomeres. Asymmetric divisions, typical of early embryos and stem cells, frequently reflect a prepositioning of the nucleus.

Though nuclear positioning to affect the cell division plane makes intuitive sense, asymmetric positioning also occurs in nondividing cells, where the purpose is less obvious. For example, in the developing optic epithelium in Drosophila, nuclei move basally and then apically to establish the characteristic arrangement of cells in the ommatidium (Figure 1A). An analogous movement of nuclei occurs over the cell cycle in the developing vertebrate neuroepithelium. In most migrating cells, the nucleus is positioned in the rear, well removed from the protruding front (Figure 1B). Nuclei in numerous differentiated animal tissues, such as skeletal muscle, many epithelia, and neurons, are also asymmetrically positioned (Figure 1C and Table 1). These examples suggest that nuclei are positioned for specialized cellular functions and that abnormal positioning could lead to dysfunction and disease.

The position of nuclei can be modified secondarily to changes in cytoplasmic organization. For example, when macrovesicular fat accumulates in hepatocytes in alcoholic or nonalcoholic steatosis, nuclei are forced to the cell's periphery. Similar changes in nuclear position may occur in cells with abundant secretory granules. However, recent research has discovered regulated,

cytoplasmic mechanical systems that function primarily to exert forces on the nucleus via connections to the nuclear envelope. These systems maintain the position of the nucleus or move it during processes such as cell migration and differentiation. Though their role in homeostatic nuclear positioning is poorly understood, mechanistic details are being deciphered in cases where nuclei move.

We review systems in which progress is being made in understanding nuclear movement and positioning, and we identify the molecular toolbox that cells use for these processes. This toolbox includes specific nuclear envelope connections to cytoskeletal force-generating systems. We then evaluate how this toolbox is employed and identify conserved mechanisms that use microtubules (MTs) and actin filaments as force generators. Genes encoding toolbox proteins are targets of mutations that cause disease, raising the possibility that inappropriate nuclear positioning contributes to pathogenesis. As active nuclear movement suggests that its relative position may influence other cellular systems, we consider the significance of nuclear positioning for cytoskeletal organization, signaling, and transcriptional control.

The Nuclear Positioning Toolbox

The molecular toolbox for nuclear positioning contains: (1) elements of the cytoskeleton and (2) protein complexes of the nuclear envelope. The cytoskeletal elements generate forces to move the nucleus. The protein complexes spanning the nuclear membranes mediate attachment of cytoskeletal elements to the nucleoskeleton (Figure 2).

Cytoskeletal Elements

Actin filaments, MTs, and associated motor proteins are the principal cytoskeletal elements of the nuclear positioning toolbox. Cytoplasmic intermediate filaments may also play a role, but this is currently poorly defined. In some cases, a single cytoskeletal element drives nuclear movement, as in MT-dependent movement of male and female pronuclei after fertilization and actin-dependent rearward movement of nuclei in fibroblasts polarizing for migration. In other cases, MTs and actin filaments collaborate to move nuclei, as in migrating neuronal cells. The role of these cytoplasmic elements in different systems is discussed in detail below.

Protein Complexes in the Nuclear Envelope

An exciting advance in the past few years has been the identification of the linker of nucleoskeleton and cytoskeleton (LINC) complex in the nuclear envelope that mediates connections to both MTs and actin filaments (Crisp et al., 2006). LINC complexes are composed of outer nuclear membrane KASH (klarsicht, Anc1, and Syne homology) proteins and inner nuclear membrane SUN (Sad1 and Unc-84) proteins, both of which are type II membrane proteins with a single transmembrane segment (Starr and Fridolfsson, 2010) (Figure 2A). KASH and SUN proteins have been described in metazoans, fungi, and recently plants (Razafsky and Hodzic, 2009; Zhou et al., 2012a). KASH proteins are characterized by a conserved 60-residue KASH domain at their C terminus, which includes a transmembrane segment and up to 30 residues that project into the perinuclear space between inner and outer nuclear membranes. KASH domains in fungi and plants are less conserved than those in metazoans. SUN proteins contain a conserved SUN domain located within the perinuclear space. Five genes encode SUN proteins in mammals, although only two of these (SUN1 and SUN2) are widely expressed; lower eukaryotes have one or two SUN proteins (Starr and Fridolfsson, 2010).

The crystal structure of SUN2 reveals an interesting mushroom-like trimer with a cap composed of SUN domains and a triple coiled-coil stalk, which is required for trimer formation (Figure 2B) (Sosa et al., 2012; Zhou et al., 2012b). Predictions of the length of this stalk suggest that the SUN protein could span the nearly 50 nm between inner and outer nuclear membranes (Sosa et al., 2012). Each SUN protein binds three KASH peptides in deep grooves between adjacent SUN domains in the trimer (Figure 2B).
A KASH-SUN disulfide bond may further stabilize the complex. The trimeric SUN-KASH structure raises intriguing questions about higher-ordered KASH-SUN protein assemblies, particularly if KASH proteins are indeed dimeric molecules as predicted. The binding pocket between SUN2 subunits suggests that it will accommodate related KASH domains and that SUN1 and SUN2 bind KASH proteins promiscuously (Starr and Fridolfsson, 2010). Yet there is an example in cells in which a specific KASH-SUN pair assembles to move the nucleus (Luxton et al., 2011). The apparent tight packing within the SUN-KASH complex also raises questions about its assembly and regulation. KASH and SUN proteins have diffusional mobilities similar to other nuclear membrane proteins, indicating that they are likely in dynamic complexes (Östlund et al., 2009). TorsinA is a potential regulator of the LINC complex, as it localizes to the ER lumen and perinuclear space and shows affinity for KASH domains (Nery et al., 2008; Tanabe et al., 2009). TorsinA's homology to AAA ATPases suggests that it may chaperone assembly or disassembly of LINC complexes (Tanabe et al., 2009).

Specificity of LINC complexes is determined by the N termini of KASH proteins, which are variable in size and ability to bind cytoskeletal elements (Figure 2C). In mammals, KASH proteins (termed nesprins) are encoded by five genes, some of which generate multiple isoforms by alternative RNA splicing. The giant isoforms nesprin-1G and nesprin-2G (>800 kDa) encoded by SYNE1 and SYNE2, respectively, bind actin through calponin homology (CH) domains near their N termini (Luxton et al., 2011; Zhang et al., 2001). Much of their large cytoplasmic region is predicted to be composed of spectrin repeats, suggesting a structure reminiscent of dystrophin with an extended but flexible core and the potential for dimerization. Nesprin-1 and nesprin-2 isoforms also interact with the MT motors kinesin-1 and dynein, although whether binding is direct is unknown (Yu et al., 2011; Zhang et al., 2009). In C. elegans, the KASH protein Unc-83 interacts directly with kinesin-1, dynein, and dynein regulators, including BicaudalD and NudE homologs (Fridolfsson et al., 2010; Fridolfsson and Starr, 2010). Nesprin-3α, an isoform encoded by SYNE3, binds the crosslinking protein plectin, which binds cytoplasmic intermediate filaments (Wilhelmsen et al., 2005). Nesprin-4 encoded by SYNE4 has a short N terminus that associates with MTs through kinesin-1 and is restricted in expression to highly secretory cells and hair cells of the cochlea (Horn et al., 2013; Roux et al., 2009). Aside from spectrin repeats, there are no other recognizable domains in nesprins 1–4. A meiosis-specific nesprin termed KASH5 binds the dynein regulator dynactin (Morimoto et al., 2012). Lower eukaryotes express actin- and MT motor-binding KASH proteins, although there is less genetic complexity in these organisms. For example, there are two KASH proteins in Drosophila and four in C. elegans (Figure 2C) (Starr and Fridolfsson, 2010).

At the intranuclear side of the LINC complex, SUN proteins bind to nuclear lamins (Crisp et al., 2006; Haque et al., 2006). Lamins are intermediate filament proteins that polymerize to form the nuclear lamina, a meshwork underlying the inner nuclear membrane. Lamins A and C (A-type lamins), which are alternative splice isoforms of the same gene, and lamins B1 and B2 are the predominant lamins expressed in differentiated mammalian somatic cells. N termini of SUN1 and SUN2 bind to lamin A, mediating their interaction with the lamina. Hence, the LINC complex, via KASH protein interactions with cytoskeletal proteins and SUN protein interactions with lamins, connects the nucleoskeleton to the cytoskeleton.

In mammalian cells lacking A-type lamins, SUN proteins still localize to the nucleus (Crisp et al., 2006; Haque et al., 2006), although they and their nesprin partners have increased membrane diffusional mobility (Östlund et al., 2009). This suggests that other factors contribute to LINC complex anchoring. Indeed, yeast lack lamins but still employ KASH and SUN proteins to attach the nucleus to the cytoskeleton. In S. pombe, the heterochromatin-binding protein Ima1 anchors the SUN protein Sad1, a component of the spindle pole body (King et al., 2008). SAMP1, the mammalian Ima1 ortholog, localizes to LINC complex assemblies that attach actin to the nucleus (Borrego-Pinto et al., 2012). Emerin, which is an integral protein predominantly localized to the inner nuclear membrane, binds to lamins and nesprins (Mislow et al., 2002; Zhang et al., 2005). Additionally, SUN1 associates with nuclear pore complexes (Liu et al., 2007).

LINC complex components constitute the major tools for connecting the nucleus to the cytoskeleton, yet they may not be the only ones. Dynein interacts with BicaudalD2, which in turn binds to

Figure 1. Diversity of Nuclear Positioning


(A) Schematics of nuclear positioning in dividing cells and developing epithelium. Arrows indicate movements of nuclei (blue). The nucleus is positioned relative to the plane of division in yeast and fertilized eggs. The diagram of insect optic epithelium (adapted from Patterson et al., 2004; Tomlinson and Ready, 1986) represents a longitudinal section of a larval eye disc; two nuclei are shown. Nuclei that are anterior (A) to the morphogenetic furrow (mf), which moves anteriorly, move basally. Nuclei that are posterior (P) to the furrow move apically as cells are recruited into clusters comprising the ommatidium (white cells, cones; gray cells, R cells). The diagram of vertebrate neuroepithelium represents a longitudinal section of the developing cerebral cortex. Nuclei move basally during G1 and apically during G2. Mitosis (M) occurs near the apical surface. Adapted from Buchman and Tsai (2008) with permission.
(B) Rearward nuclear position is typical of migrating cells. (Left) Schematic of a migrating cell with protruding leading edge and contracting tail. (Red) Actin filaments. (Right) Montage of migrating cells with front-back dimensions normalized. Dotted line represents the midpoint between the front and back. Nuclei are positioned along the front-back axis but always rearward of the cell center. Images reproduced with permission from: fibroblast (Gomes

(legend continued on next page)


Table 1. Nuclear Positions in Mammalian Cells and Tissues

Cell/Tissue | Nuclear Position | Axis Alignment(a) | Comments
Proliferating Cells
Somatic cells | central | NA |
Stem cells | usually asymmetric | various; niche related |
Germ cells, oocytes | asymmetric | NA | moves centrally after fertilization
Migrating Cells
1D (cultured fibroblast) | asymmetric | front-rear |
2D (cultured; many types) | asymmetric | front-rear | see Figure 1
3D (cultured fibroblast) | asymmetric | front-rear |
3D (dermal sarcoma cells) | asymmetric | front-rear |
3D (neurons in cortex) | asymmetric | front-rear |
Macrophages, neutrophils | asymmetric | front-rear |
Tissues
Muscle, skeletal | asymmetric, complex | peripheral-central | clustered at neuromuscular junction
Muscle, cardiac | central | NA |
Muscle, smooth | central | NA |
Epithelia, squamous | central | NA |
Epithelia, cuboidal | central | NA |
Epithelia, columnar | asymmetric | apical-basal |
Epithelia, pseudostratified | asymmetric | apical-basal | cell-cycle dependent
Epithelia, secretory | asymmetric | apical-basal | aligned with secretory axis
Neurons | asymmetric | proximal-distal |
Astrocytes/oligodendrocytes | central | NA |
Connective Tissue
Osteoblasts/osteocytes | central | NA |
Osteocytes, actively secreting | asymmetric | front-rear | relative to secretory axis
Osteoclasts | asymmetric | front-rear |
Chondroblasts/chondrocytes | central | NA |
Chondrocytes, actively secreting | asymmetric | front-rear | relative to secretory axis
Fibrocytes, resting | central | NA |
Adipocytes | asymmetric | NA |
Hematopoietic
Macrophages | asymmetric | front-rear |
T cells, migrating or contacting target cell | asymmetric | front-rear |
B cells, plasma cells | asymmetric | front-rear |

(a) Position in italics.

RANBP2 at the cytoplasmic face of the nuclear pore complex (Splinter et al., 2010). This association targets dynein to the nucleus during G2 and may contribute to nuclear envelope breakdown. However, it could be an alternative means to target dynein for nuclear movement. Certain muscle-specific nuclear membrane proteins accumulate along MTs, suggesting that the nuclear positioning toolbox may also contain tissue-specific tools (Wilkie et al., 2011).

Initiation of Nuclear Movement

Specific sets of tools become activated to move the nucleus in response to stimuli. In pronuclear migrations in fertilized eggs, formation of MTs by the sperm centrosome initiates movement of both male and female pronuclei. Activation of the Rho GTPase Cdc42 by the serum factor lysophosphatidic acid (LPA) initiates nuclear movement in migrating fibroblasts by activating actin retrograde flow (Gomes et al., 2005; Palazzo et al., 2001).

et al., 2005), breast carcinoma (McNiven, 2013), keratocyte (Barnhart et al., 2010), endothelial cell (Tsai and Meyer, 2012), astrocyte (Osmani et al., 2006), and neuron (Godin et al., 2012).
(C) Nuclear positioning in mammalian tissues. Cross-sections of kidney cortex and skeletal muscle stained with hematoxylin and eosin. Nuclei are positioned centrally in the distal (D) convoluted tubules and basally in proximal (P) convoluted tubules. Nuclei are positioned at the periphery of normal skeletal muscle fibers but are found centrally in dystrophic tissue.


Cdc42 is also essential for nuclear movements in neuronal migration (Solecki et al., 2004) and neuronal precursors in the neuroepithelium (Cappello et al., 2006). Nuclear movement in the neuroepithelium is under cell-cycle control, and interference with cell-cycle progression prevents it (Taverna and Huttner, 2010). These examples indicate that initiating nuclear movements involves the de novo assembly of cytoskeletal components of the toolbox. However, this is a fledgling area of inquiry, and other processes such as activation of motors or relaxation of nuclear anchoring may contribute to initiating nuclear movement. Almost nothing is known about factors terminating nuclear movement.

Characteristics of Nuclear Movements

Nuclear movements occur in different cellular contexts and are powered by different cytoskeletal elements. It is therefore not surprising that they have different characteristics (Table 2). Velocities vary between 0.1 and 1.0 µm/min, although peak rates can be >10 µm/min for sperm pronuclei in Xenopus eggs. Distances traversed during single episodes are generally one nuclear diameter (5–10 µm) or less, although they are longer in fertilized eggs and in the neuroepithelium. Nuclear movements are usually continuous and unidirectional. However, high-temporal-resolution imaging of nuclei in C. elegans hypodermal cells revealed short pauses and bidirectional movements, suggesting additional complexity (Fridolfsson and Starr, 2010). During basal movement in the rat neuroepithelium, nuclei pause for hour-long intervals before continuing in the same direction, suggesting complex regulation. This diversity of nuclear movements provided an early clue that there is more than one mechanism responsible.

MT-Mediated Nuclear Movement

Pioneering studies on invertebrate and vertebrate eggs revealed that there are distinct mechanisms by which MTs connect to the nucleus to move it (reviewed in Reinsch and Gönczy, 1998). The male pronucleus, which forms after sperm entry into the egg, nucleates MTs from its centrosome and moves toward the middle of the cell. The female pronucleus laterally engages MTs emanating from the male pronuclear-centrosome complex and moves along them to join the male nucleus near the cell center. Male pronuclear movement is generated by MT growth and pushing along cortical sites and/or sites within the cytoplasm (Reinsch and Gönczy, 1998). Force is transmitted to the nucleus through its intimate association with the centrosome and centrosomal MTs. Female pronuclear movement is generated by attached cytoplasmic dynein motors that move it toward MT minus ends at the sperm centrosome. Research on nuclear
Figure 2. Molecular Toolbox for Nuclear Movement/Positioning
(A) Schematic of an idealized LINC complex in the nuclear envelope. The inner nuclear membrane (INM) SUNs bind within the perinuclear space to outer nuclear membrane (ONM) KASH proteins. KASH proteins bind directly or indirectly to cytoskeletal filaments, including MTs, actin microfilaments, and cytoplasmic intermediate filaments. In metazoans, SUNs bind to the nuclear lamina; in yeast and plants, other intranuclear proteins bind SUNs. A nuclear pore complex (NPC) is shown for reference.
(B) Side view of the structure of the SUN2-nesprin2 KASH complex. Trimeric SUN2 domains are represented by different shades of blue, and the KASH peptide is in orange. The structure illustrates the orientation of the KASH peptide between adjacent SUN domains. Modified from Sosa et al. (2012) with permission.
(C) Schematic diagrams of KASH proteins from representative organisms and the cytoskeletal filaments to which they bind. Binding to actin filaments is mediated by CH domains and binding to cytoplasmic intermediate filaments by plectin. Binding to MTs is mediated by dynein and kinesins; direct binding to MTs has not been reported. The specific splice variants of nesprin-1 and nesprin-2 that interact with MT motors are unknown; for simplicity, a short variant of each is depicted. H.s., Homo sapiens; M.m., Mus musculus; D.m., Drosophila melanogaster; C.e., Caenorhabditis elegans; S.p., Schizosaccharomyces pombe.


Table 2. Physical Characteristics of Typical Nuclear Movements

System | Rate (µm/min) | Distance (µm) | Mode | Dependence | References
Fertilized Egg
Male pronucleus, Xenopus | 16 | 100–300 | ? | MT polymerization | Reinsch and Gönczy, 1998
Female pronucleus, Xenopus | 0.2–1.5 | 100–300 | ? | dynein | Reinsch and Gönczy, 1998
Migrating Neurons
Cortical brain slice | 0.33 | 15 | saltatory | MT and myosin II | Tsai et al., 2007
SVZ explants, matrigel | 1.25 | 25 | saltatory | MT and myosin II | Schaar and McConnell, 2005
Granular neurons on radial glia | 1.0 | 1.3 | saltatory | | Solecki et al., 2004; Solecki et al., 2009
Radial Glia INM, Cortical Brain Slice
Basal directed | 0.14 | 30–50 | intermittent with long pauses | kinesin-3 | Tsai et al., 2010
Apical directed | 0.06 | 30–50 | continuous | dynein | Tsai et al., 2007
Other Systems
Fibroblasts polarizing for migration | 0.28–0.35 | 5–10 | continuous | actomyosin flow | Gomes et al., 2005; Luxton et al., 2010
Astrocytes polarizing for migration | 0.05 | 10 | continuous | actomyosin flow | Dupin et al., 2011
Oocyte (D.m.) | 0.07 | 5–10 | continuous | MT polymerization | Zhao et al., 2012
Hypodermal cell (C.e.) | 0.23 | 3.3 | continuous | kinesin-1 | Fridolfsson and Starr, 2010
Budding yeast | 1.18 | 1–2 | continuous | dynein (and MT depolymerization) | Adames and Cooper, 2000

D.m., Drosophila melanogaster; C.e., Caenorhabditis elegans.
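
The typical rates and distances quoted in the text can be turned into a rough duration for a single episode of nuclear movement. The sketch below is only a worked estimate: it assumes a constant velocity (the text notes that some movements pause or reverse), it uses the general ranges stated in the text (0.1 to 1.0 µm/min and one nuclear diameter of 5 to 10 µm), and the function and constant names are hypothetical, chosen for illustration.

```python
def travel_time_min(distance_um, rate_um_per_min):
    """Minutes needed to cover distance_um at a constant rate (assumed, no pauses)."""
    if rate_um_per_min <= 0:
        raise ValueError("rate must be positive")
    return distance_um / rate_um_per_min


# Ranges quoted in the text: typical velocities and one nuclear diameter.
RATES_UM_PER_MIN = (0.1, 1.0)
DISTANCES_UM = (5.0, 10.0)


def duration_range_min(distances_um, rates_um_per_min):
    """Return (shortest, longest) episode duration in minutes."""
    shortest = travel_time_min(min(distances_um), max(rates_um_per_min))
    longest = travel_time_min(max(distances_um), min(rates_um_per_min))
    return shortest, longest


if __name__ == "__main__":
    shortest, longest = duration_range_min(DISTANCES_UM, RATES_UM_PER_MIN)
    print(f"One-diameter episode: about {shortest:.0f} to {longest:.0f} min")
```

Under these assumptions, a one-diameter movement takes from a few minutes to more than an hour, which is consistent with the need for time-lapse imaging to resolve it.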

movement has progressed from fertilized eggs to more molecularly tractable systems, yet the idea that distinct MT-dependent processes move the nucleus has persisted and has been strengthened by newer studies.

Nuclear Movement by MT Pushing and Pulling Forces

In the male pronuclear form of nuclear movement, an MT organizing center (MTOC) connects the nucleus to MTs, and MT dynamics power movement (Figure 3A). This form of nuclear movement occurs before cell division in the budding yeast S. cerevisiae (Adames and Cooper, 2000), the fission yeast S. pombe (Tran et al., 2001), early C. elegans embryos (Gönczy et al., 1999), Drosophila oocytes (Zhao et al., 2012), and cultured mammalian cells (Levy and Holzbaur, 2008). The MTOC is either embedded in the nuclear envelope (yeast spindle pole body) or is tightly associated with it (other systems). In C. elegans, the centrosome connects to the nuclear envelope through the LINC complex proteins Zyg-12, a KASH protein, and SUN1 (Malone et al., 2003). Outer nuclear membrane Zyg-12 binds to dynein, moving the centrosome close to the nucleus and promoting association between Zyg-12 and a centrosomal splice variant lacking the transmembrane domain. Zyg-12 is not conserved, so whether a similar mechanism is present in other organisms is unclear. Defects in A-type lamins and emerin increase spacing between the nucleus and centrosome in mammalian cells (Lee et al., 2007; Salpingidou et al., 2007); however, it is not clear that these proteins directly link them.

For the male pronuclear type of nuclear movement, forces are generated by MTs interacting with cortical or cytoplasmic sites (Figure 3A). The interaction can be simply physical or mediated by anchored dynein. In S. pombe, interaction of growing MTs with the periphery generates pushing forces that maintain the nucleus in the middle of the cell (Tran et al., 2001). Pushing forces are restricted to systems in which relatively short distances (10 µm) are involved because longer MTs cannot withstand compressive forces. Thus, in larger cells, MT pulling forces contribute to centrosome movements. In most cases, pulling forces are generated by cortically anchored dynein (Grill et al., 2003; Schmoranzer et al., 2009), as originally described in budding yeast, where dynein immobilized in the bud pulls on spindle-pole-body-associated MTs, moving the nucleus toward the bud neck (Adames and Cooper, 2000).

In syncytial cells with multiple nuclei, a more complex MT pulling mechanism exists. In the filamentous fungus Aspergillus, in which genetic screens revealed roles for dynein and its regulators in nuclear positioning (Morris et al., 1998), MT anchoring at cortical sites appears to evenly space nuclei in the syncytial hyphae (Gladfelter and Berman, 2009). In differentiating insect and mammalian muscle cells, which lack active centrosomes, MT minus ends associate directly with the nuclear envelope through uncharacterized factors. These cells also use dynein pulling and MT sliding by kinesin-1 and MT-associated proteins to cluster nuclei near the center of syncytial myotubes (Folker et al., 2012; Metzger et al., 2012).

Nuclear Movement by Attached MT Motor Forces

In the female pronuclear form of nuclear movement, nuclei associate laterally with MTs and move along them, powered by nuclear-envelope-associated motors (Figure 3B). This is typical of nuclear movements that occur during developmental events. Genetic screens that identified KASH (Unc83) and SUN

Figure 3. Mechanisms of Nuclear Movement


(A) Schematic of male pronuclear-type nuclear movement mediated by MTs (green). Forces (arrows) can be generated by MT polymerization, depolymerization, or dynein motors (red) anchored in the cortex or cytoplasmic sites.
(B) Schematic of female pronuclear-type nuclear movement mediated by MT dynein (red) and kinesin (orange) motors. Forces (arrows) are generated by motors that laterally connect nuclei to MTs.
(C) Schematic of actomyosin-type nuclear movement. Force (arrows) is generated by the actomyosin-dependent flow of dorsal actin cables (red).
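
The statement that long MTs cannot withstand compressive forces can be illustrated with the classical Euler buckling load for a slender rod with pinned ends, F = π²·EI/L². The sketch below is not from the review: the flexural rigidity is an assumed order-of-magnitude value for a single MT, the pinned-end rod is an idealization that ignores lateral support from the surrounding cytoplasm, and all names are hypothetical.

```python
import math

# Assumption (not from the review): order-of-magnitude flexural rigidity
# of a single microtubule, in N*m^2.
ASSUMED_EI_N_M2 = 2e-23


def euler_buckling_force_pn(length_um, flexural_rigidity=ASSUMED_EI_N_M2):
    """Critical compressive load, in pN, of an idealized rod with pinned ends."""
    if length_um <= 0:
        raise ValueError("length must be positive")
    length_m = length_um * 1e-6
    force_n = math.pi ** 2 * flexural_rigidity / length_m ** 2
    return force_n * 1e12  # convert N to pN


if __name__ == "__main__":
    for length_um in (5, 10, 20, 50):
        force_pn = euler_buckling_force_pn(length_um)
        print(f"L = {length_um:>2} um: buckling load ~ {force_pn:.3f} pN")
```

Because the critical load falls with the square of the length, doubling the MT length quarters the force it can transmit by pushing. Under the assumed rigidity, a 10 µm MT buckles at roughly 2 pN, which is one way to read the text's point that pushing is limited to short distances while larger cells rely on pulling.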

(Unc84) proteins in C. elegans revealed that these proteins were required for nuclear movement in various cell types (Starr and Fridolfsson, 2010). Unc83 recruits both dynein and kinesin-1 motors to the nuclear envelope, where kinesin-1 is responsible for moving the nucleus while dynein contributes to directionality (Fridolfsson and Starr, 2010).

Female pronuclear-type nuclear movements are pronounced in the developing nervous system. Early genetic screens in Drosophila identified Klarsicht, or Klar, as a founding member of the KASH protein family, and it is required for apical movements of nuclei that establish the proper arrangement of cells in the ommatidium (Mosley-Bishop et al., 1999). Klar function has been linked to kinesin and dynein (Welte, 2004), suggesting that it may recruit these MT motors to the nuclear envelope.

Mutants in the dynein regulators dynactin and DLis1 have similar nuclear migration defects as Klar mutants (Fan and Ready, 1997; Swan et al., 1999). Mutants in lamin Dm(0) and the SUN protein klaroid disrupt Klar localization to the nuclear envelope and apical movement of the nucleus, generating the same Klar phenotype (Kracklauer et al., 2007; Patterson et al., 2004). This result was the rst to suggest that the nuclear lamina anchored the LINC complex. Female pronuclear-type nuclear movements occur during two stages of vertebrate central nervous system development. In neuroepithelial radial glial cells, which serve as neuronal precursors, nuclear movement occurs along the apical-basal axis in a cell-cycle-dependent fashion. This has been termed interkinetic nuclear migration (INM). During INM, the nucleus moves basally during G1 and returns during G2 to an apical location where mitosis occurs (Taverna and Huttner, 2010). As the centrosome remains apical, basal and apical movements occur in MT plus and minus end directions, respectively. MT motors have been implicated in these movements. The kinesin-3 family member, Kif1a, has been implicated for plus-end-directed movement and dynein for minus-end-directed movement (Tsai et al., 2007; Tsai et al., 2010). Nesprin-1 and nesprin-2 may serve as recruitment factors for MT motors in vertebrate INM. Knockout of their genes in mice and zebrash leads to defective INM in the neocortex and retina, and mouse nesprin-2 coimmunoprecipitates with dynein and kinesin-1 (Tsujikawa et al., 2007; Yu et al., 2011; Zhang et al., 2009). Interfering with the dynactin and Lis1 gives similar phenotypes (Tsai et al., 2005; Tsujikawa et al., 2007). Nuclear movements in vertebrate INM may be more complex than in the Drosophila eye, as myosin II and actomyosin contractility may also play a role (Norden et al., 2009; Schenk et al., 2009). 
The second stage of vertebrate central nervous system development involving female pronuclear-type movements is neuron migration. After their birth in the neuroepithelium, neurons migrate significant distances to their final locations. Most migrating neurons exhibit a characteristic two-stroke form of migration in which the narrow leading process extends; the centrosome then moves forward into a swelling in the leading process followed by the nucleus and the rest of the soma (Tsai and Gleeson, 2005). Nuclear movement toward MT minus ends at the centrosome is dependent on dynein and its regulators Lis1 and NudE (Shu et al., 2004; Tsai et al., 2007). The centrosome also moves in a dynein- and Lis1-dependent fashion. Lis1 binds to a specific nucleotide state of dynein and enhances force generation, which may be necessary for moving the nucleus (McKenney et al., 2010). Nesprin-2 and SUN1/SUN2, which are also required for the forward movement of the nucleus, may recruit dynein to the nucleus (Zhang et al., 2009). Doublecortin, a MT-associated protein, is also required for nuclear movement during neuron migration (Koizumi et al., 2006). Importantly, nuclear movement during neuronal migration is also dependent on actomyosin contraction (see below), so this is not a pure form of female pronuclear-type movement. The two-stroke mode of migration with a large separation (5–18 μm) between the centrosome and nucleus is thought to be a particular feature of neurons and is not typically observed in other migrating cells. Nonetheless, the same anterior

Table 3. Genes Encoding Proteins Functioning in Nuclear Positioning Linked to Human Disease

Human Gene | Protein | Function | Human Disease(s) | Disease Phenotypes
DCX | doublecortin | stabilizes microtubules | lissencephaly | mislocalization of cortical neurons, smooth brain
LIS1 | Lis1 | dynein regulation | lissencephaly | mislocalization of cortical neurons, smooth brain
TUBA3 | α-tubulin | MT component | lissencephaly | mislocalization of cortical neurons, smooth brain
LMNB1 | lamin B1 | lamina component | adult-onset leukodystrophy (results from gene duplication) | demyelination
LMNB2 | lamin B2 | lamina component | susceptibility to acquired partial lipodystrophy | regional fat loss
SUN1 | Sun1 | LINC complex | none to date |
SUN2 | Sun2 | LINC complex | none to date |
SYNE1 | nesprin-1 | LINC complex | (1) cerebellar ataxia; (2) myopathies; (3) arthrogryposis | (1) coordination defects; (2) cardiomyopathy and muscular dystrophy; (3) congenital joint contractures and muscle weakness
SYNE2 | nesprin-2 | LINC complex | myopathies | cardiomyopathy and skeletal muscular dystrophy
SYNE4 | nesprin-4 | LINC complex | high-frequency hearing loss | progressive high-frequency hearing loss
LMNA | A-type lamins | lamina components | (1) myopathy; (2) partial lipodystrophy; (3) peripheral neuropathy; (4) progeria | (1) cardiomyopathy with variable skeletal muscular dystrophy; (2) fat loss from extremities; (3) peripheral nerve defects; (4) accelerated aging phenotypes

localization of the centrosome relative to the nucleus, albeit in closer proximity, occurs in many migrating cell types, and dynein has been implicated in nuclear movements in migrating nonneuronal cells (Luxton and Gundersen, 2011).

Actin-Mediated Nuclear Movement

A groundbreaking study in C. elegans identified an outer nuclear membrane protein, termed Anc-1, which bound to actin and was essential for anchoring nuclei in the syncytial hypodermal and intestinal cells (Starr and Han, 2002). Anc-1 is one of the founding members of the KASH protein family and requires the SUN protein Unc84 for its outer nuclear membrane localization. While the discovery of Anc-1 showed that nuclear connections to the actin cytoskeleton anchor nuclei, we now know that nuclei are also actively moved through actin-dependent processes, typically in cells polarizing for migration.

Nuclear Movement by Tethering to Moving Actin Cables

The rearward positioning of the nucleus in migrating cells (Figure 1B) may result, at least in part, from an extension of the leading edge. Yet, studies in a number of cultured cell types have revealed that rearward nuclear positioning is an active process independent of cell protrusion (Desai et al., 2009; Dupin et al., 2011; Gomes et al., 2005; Luxton et al., 2010). A direct mechanism for moving the nucleus has been established in experiments utilizing wounded monolayers of serum-starved fibroblasts treated with LPA, which stimulates cell polarization, but not protrusion or migration (Luxton et al., 2010). In this system, rearward-moving dorsal actin cables induced by Cdc42 provide force to move the nucleus (Figure 3C). Movement of dorsal actin cables is likely powered by myosin II, as its inhibition prevents actin flow and nuclear movement (Gomes et al., 2005).

These cables are directly coupled to the nucleus by nesprin-2G and SUN2, which accumulate along them to form linear assemblies termed transmembrane actin-associated nuclear (TAN) lines. Actin-binding CH domains of nesprin-2G are required for TAN line formation and nuclear movement. A-type lamins anchor TAN lines to the nucleoskeleton, and in their absence, TAN lines slip over an immobile nucleus (Folker et al., 2011). This anchorage is presumably mediated through SUN2 binding to A-type lamins. Additional anchorage may be mediated by SUN2 binding to SAMP1, which also localizes to TAN lines and is necessary for nuclear movement (Borrego-Pinto et al., 2012).

Nuclear Movement by Actomyosin Contraction

Nuclear movement appears to be rate limiting for cells migrating through narrow extracellular spaces in which nuclei become deformed (Friedl et al., 2011). In at least some of these cases, passage through a narrow opening specifically requires myosin II (Beadle et al., 2008), suggesting that actomyosin-mediated nuclear movement is necessary. Myosin II is also necessary for the forward movement of the nucleus in migrating neurons (Solecki et al., 2009; Tsai et al., 2007), localizing behind it where it may provide contractile forces that help to move it into the leading process. This may reflect the difficulty of moving the nucleus into the narrow leading process, which requires nuclei to become elongated.

Nuclear Positioning and Disease

We have provided several examples of nuclear positioning events that are required for specific cellular processes. Given this requirement, one could imagine that defects in the molecular toolbox for nuclear positioning could lead to cellular dysfunction. Indeed, results from human subjects with inherited diseases and

mouse models have shown that alterations in proteins involved in nuclear positioning are associated with pathology. Mutations in genes encoding proteins involved in MT function, LINC complex components, and the nuclear lamina all cause human diseases (Table 3).

Lissencephaly is characterized by mislocalization of cortical neurons, resulting in decreased cortical complexity and a smooth brain surface. Affected children have severe psychomotor retardation, seizures, muscle spasticity, and failure to thrive. At the cellular level, neuronal migration required for brain development is blocked. Most cases of classic lissencephaly are caused by deletion or truncating mutations in LIS1 (Reiner et al., 1993). The Lis1 protein is required for INM and nuclear and centrosomal movement during two-stroke neuronal migration (Shu et al., 2004; Tsai et al., 2007). Similarly, mutations in DCX encoding doublecortin cause X-linked lissencephaly and defective nuclear movement in neurons (Gleeson et al., 1998; Koizumi et al., 2006). De novo mutations in TUBA3 encoding α-1 tubulin also cause lissencephaly and defective nuclear movement in neurons (Keays et al., 2007).

Intriguingly, depletion of lamin B1, lamin B2, or both in mice causes lissencephaly-like phenotypes (Coffinier et al., 2010, 2011). These phenotypes result from neuronal migration defects, which likely have accompanying abnormalities in nuclear movement, although this has not been assessed directly. Nuclei spin in mouse fibroblasts lacking lamin B1, suggesting that B-type lamins function in nuclear anchoring (Ji et al., 2007). B-type lamins may therefore anchor LINC complexes. Mutations in genes encoding B-type lamins have not yet been linked to human developmental brain disorders, but duplications in LMNB1 cause overexpression of lamin B1 and an adult-onset demyelinating disease (Padiath et al., 2006).
Experiments in knockout mice implicate SUN1, SUN2, nesprin-1, and nesprin-2 in nuclear migration during neurogenesis and neuronal migration (Zhang et al., 2009). However, mutations in genes encoding nesprins have been linked to diseases other than lissencephaly. Mutations in SYNE1 encoding nesprin-1 cause adult-onset autosomal-recessive cerebellar ataxia characterized by diffuse cerebellar atrophy and impaired walking, dysarthria, and poor coordination (Gros-Louis et al., 2007). This could potentially result from neuronal migration defects in a specific region of the brain. Mutations in SYNE1 have also been reported to cause an autosomal-recessive form of arthrogryposis multiplex congenita characterized by congenital joint contractures, muscle weakness, and progressive motor decline (Attali et al., 2009). Mutations in SYNE1 and SYNE2 have further been reported to cause Emery-Dreifuss muscular dystrophy (EDMD)-like phenotypes (Zhang et al., 2007a). Mutations in the gene encoding the LINC-complex-associated protein emerin were first reported to cause X-linked EDMD (Bione et al., 1994), and mutations in LMNA encoding A-type lamins are responsible for most autosomally inherited cases (Bonne et al., 1999). This suggests an association between LINC complex function and EDMD-like phenotypes, which generally share a dilated cardiomyopathy with variable skeletal muscle involvement. More recently, mutation in SYNE4 encoding nesprin-4 has been shown to cause autosomal-recessive, progressive high-frequency hearing loss (Horn et al., 2013).

Nuclear positioning defects caused by SYNE1 and SYNE2 mutations have been described. One patient with a SYNE1 mutation and cerebellar ataxia was reported to have fewer muscle nuclei under neuromuscular junctions (Gros-Louis et al., 2007). Similarly, deletion of the KASH domain from nesprin-1 in mice abolishes clustering of synaptic nuclei and disrupts spacing of nonsynaptic nuclei in skeletal muscle; deletion of the nesprin-2 KASH domain has no effect on its own but exacerbates the defect in mice lacking nesprin-1 (Zhang et al., 2007b). Nesprin-2 deletion in mice disrupts nuclear movement in cells of the neocortex and retina, causing reduced thickness of the cortex and the outer nuclear layer into which newly formed photoreceptor cells migrate (Yu et al., 2011; Zhang et al., 2009). Mice lacking nesprin-4 suffer from deafness, mimicking the human mutation phenotype, and have abnormal positioning of nuclei in cochlear outer hair cells (Horn et al., 2013).

Although no disease-causing mutations in SUN1 or SUN2 have been described in humans, depletion of both proteins from mice causes nuclear positioning defects in muscle, retina, and developing brain, similar to those in mice lacking nesprin-1 and nesprin-2 (Lei et al., 2009; Yu et al., 2011; Zhang et al., 2009). Mice without SUN1 also have hearing loss and abnormal nuclear positioning in cochlear outer hair cells (Horn et al., 2013). The tissue-selective human diseases and pathology in mice that occur in response to alterations in different SUNs and nesprins may arise because only certain isoforms are necessary in different tissues. Data from mice demonstrate tissue-selective differences in the expression of nesprins and SUNs, yet there is no comprehensive analysis of the expression patterns and tissue-type functionality of all of the different nesprins and SUNs. Results from knockout mice also suggest redundancy in the function of SUN1 and SUN2 and different tissue effects of nesprin-1 and nesprin-2.
Mutations in LMNA encoding the A-type lamins cause a broad range of human diseases often referred to as laminopathies (Dauer and Worman, 2009). LMNA mutations that cause EDMD and related myopathies are mostly missense or small in-frame deletions, which lead to expression of variant proteins, splice site truncations, or promoter mutations. Depletion of A-type lamins from mice leads primarily to cardiac and skeletal muscle phenotypes, suggesting that LMNA mutations, even dominant ones leading to variant protein expression, somehow cause loss of function (Sullivan et al., 1999). Skeletal muscles from humans with autosomal dominant EDMD and Lmna null mice both have nuclei in the center of myofibers rather than at their normal peripheral localization. However, this also occurs in other myopathies not associated with defects in proteins directly implicated in nuclear positioning. For more on laminopathies, please see the Review by Schreiber and Kennedy on page 1365 of this issue (Schreiber and Kennedy, 2013). In migrating fibroblasts depleted of A-type lamins or expressing variants associated with myopathy, actin-dependent rearward nuclear movement fails to occur (Folker et al., 2011). In these cells, nesprin-2G assembles into TAN lines that slip over the nucleus rather than moving with it, indicating an anchorage defect. Amino acid substitutions within an immunoglobulin-like

motif in the tail of A-type lamins cause partial lipodystrophy, which is characterized by fat loss from the extremities. In contrast to those causing myopathy, expression of lamin A variants that cause lipodystrophy inhibits MT-dependent centrosome positioning, but not actin-dependent nuclear movement in migrating fibroblasts (Folker et al., 2011).

Except for cases in which nuclear positioning defects associate with abnormal neuronal migration, the relationship of the positioning defects observed in model systems to pathogenic mechanisms remains uncertain. It is not known why alterations in the nuclear positioning proteins affect only cells in certain tissues when the proteins are widely expressed. In some instances, observed nuclear positioning defects may not directly connect to the disease, such as mispositioning of nuclei at the neuromuscular junction in cerebellar ataxia. Overall, alterations in the nuclear positioning toolbox most often affect tissues, such as the nervous system and striated muscle, in which cell migration plays an important role in organ development or homeostasis. Abnormal force transmission between the nucleus and cytoplasm may also render cells more susceptible to damage by mechanical stress, leading to activation of stress response or apoptotic pathways, resulting, respectively, in cell dysfunction or death.

Cellular Significance of Nuclear Positioning: Hypotheses and Perspectives

Our understanding of why cells move and position their nuclei is still rudimentary. Yet, interfering with proteins involved in nuclear movement inhibits many cell functions. Defects in the nuclear positioning toolbox also cause disease. Thus, nuclear positioning itself may influence other cellular activities. Here, we put into perspective evidence supporting the hypotheses that nuclear positioning influences the organization and mechanical properties of the surrounding cytoplasm, cytoplasmic signaling, and accessibility of the nucleus to signaling pathways.
The Nuclear Envelope as a Cytoskeletal Integrator

Identification of the LINC complex and other proteins mediating nucleocytoskeletal connections raises the possibility that the nucleus not only attaches to the cytoskeleton, but also organizes it. Even before the identification of specific nucleocytoskeletal connectors, a classical experiment by Ingber and colleagues revealed that the nucleus was physically connected to integrins in the plasma membrane (Maniotis et al., 1997). These investigators showed that applying force to fibronectin beads attached to integrins moved the nucleus tens of microns away. Although the nature of the connection was not identified, this observation clearly reflects linkages that exist between the nucleus and the plasma membrane.

The nucleus influences the MT cytoskeleton through its association with MTOCs, which determine where MT minus ends are anchored. A more direct influence of the nucleus on MT distribution occurs in cells with noncentrosomal MTs. In multinucleated myotubes, which lack functional centrosomes, MT minus ends are attached to nuclei by unidentified linkers, contributing to an overall bipolar array of MTs with mixed polarity (Tassin et al., 1985). The nucleus may also affect organization of the actin cytoskeleton. CH-domain-containing

nesprins tether the nucleus to actin filaments, but whether they organize actin arrays around it is less certain. In fibroblasts polarizing for migration, depleting nesprin-2G or A-type lamins does not alter the overall distribution of actin filaments or the formation and movement of dorsal actin cables (Folker et al., 2011; Luxton et al., 2010). However, alterations in actin filaments and focal adhesions have been reported when LINC complex components are perturbed (Hale et al., 2008; Khatau et al., 2009). This may reflect lack of direct connection of the actin arrays to the nuclear envelope or indirect effects. These findings suggest that, at least under some circumstances, the nucleus actively participates in organizing certain actin structures.

Additional evidence that the nucleus organizes the cytoplasm comes from biophysical measurements. Cytoplasmic stiffness adjacent and distal to the nucleus is altered in cells depleted of A-type lamins (Broers et al., 2004; Lammerding et al., 2004). Whether this result solely reflects direct physical links between the nucleus and cytoskeleton or indirect effects of signaling pathways that are also modified by alterations in the nuclear envelope (see below) is presently unclear.

The Nuclear Envelope as a Regulator of Signaling Pathways

As the largest and most compression-resistant membrane-bound organelle in the cell, the nucleus has been likened to a molecular shock absorber (Dahl et al., 2004). Theoretically, movement of such a large, non-deformable organelle through the cytoplasm will result in tensile and/or compressive forces. Mediated by nuclear connections to the cytoskeleton, these forces could be transmitted to distal sites that are mechanical transducers, such as integrin-based focal adhesions or cadherin-based cell-cell adhesions (Leckband et al., 2011; Parsons et al., 2010). In a sense, the nucleus would act like the bead in Ingber's experiment, except that force would originate inside rather than outside of the cell.
Given that adhesions respond to mechanical stimuli by regulating Rho GTPase and mitogen-activated protein (MAP) kinase signaling, the prediction is that nuclear movement may affect the activity of these pathways. The idea that nuclear movement may regulate cellular signaling pathways has not been directly tested. Yet there is evidence that alterations in the nuclear movement toolbox alter signaling pathways. Lamin A variants that cause myopathies increase MAP kinase signaling, as does knockdown of A-type lamins or emerin (Muchir et al., 2007, 2009). Similar results have been obtained for Rho signaling (Hale et al., 2008). Given that alterations in A-type lamins interfere with actin-dependent nuclear movement (Folker et al., 2011), it is possible that changes in signaling result from altered nuclear positioning. A-type lamins may also affect signaling by interacting with proteins in the pathway, for example, by binding the MAP kinase ERK1/2 (González et al., 2008). KASH proteins may recruit signaling molecules to the nuclear envelope and regulate their activities, as nesprin-2 binds active ERK1/2, and its knockdown results in prolonged ERK1/2 activity (Warren et al., 2010). As other actin-dependent membrane structures such as focal adhesions regulate signaling, TAN lines assembled on the surface of the nuclear envelope may do so as well.

Nuclear Position as a Response Regulator of Signaling Pathways

The position of the nucleus may also alter its responsiveness to pathways that regulate transcription and mRNA transport and localization. It is generally assumed that latent cytoplasmic transcription factors and second messengers activated by plasma membrane receptors reach the nucleus in an unabated fashion. However, the distance that they travel may depend on encounters with costimulatory and inhibitory factors in the cytoplasm (Calvo et al., 2010). Thus, the nucleus's position relative to the origin of an external signal may modulate its response. This could be particularly important for asymmetrically encountered signals, for example, on the apical or basal aspects of epithelia or in gradients of external factors during development. The spatial relationship between the nucleus and the primary cilium changes in many developing epithelia, such as the neuroepithelium, and may affect the output of signaling pathways, such as the Sonic hedgehog pathway that requires the cilium (Goetz and Anderson, 2010). Signaling from intracellular sites, such as the signaling endosome, may enhance responsiveness by bringing the signal in close proximity to the nucleus.

Only one study has directly examined the relationship between nuclear position and asymmetrical signaling (Del Bene et al., 2008). A gradient of Notch signaling, highest at the apical surface, exists in the retinal neuroepithelium, as in other epithelia (Murciano et al., 2002). INM moves the nucleus basally during G1, exposing it to lower Notch activity. A mutation in the zebrafish mok gene encoding the dynactin p150Glued subunit causes longer and faster basal nuclear excursions, resulting in increased basal mitoses and the formation of early differentiating neurons at the expense of later ones (Del Bene et al., 2008).
Notch overexpression rescues the mok phenotype, showing that it results from inadequate exposure of the nucleus to Notch due to defective nuclear movement. Alterations in Syne-2 lead to similar changes in INM and cell fate in zebrafish retina (Tsujikawa et al., 2007). Deficiencies in Cep120 and TACC, proteins that affect the centrosome-MT connection, or in nesprin-2 or SUN1/2 also affect INM in developing mouse cerebral cortex and lead to early depletion of neural progenitors (Xie et al., 2007; Zhang et al., 2009). Although altered cell fate has not yet been demonstrated in these studies, they are consistent with altered response to Notch or other apical signals.

Conclusions

Nuclear positioning is not a passive or random phenomenon; active mechanisms exist to position nuclei in cells. We have reviewed the molecular tools and mechanisms that move and position nuclei, most of which are conserved among eukaryotes. Human diseases result from genetic abnormalities in nuclear movement toolbox proteins and, in some cases, are linked to altered nuclear movement. We have highlighted potential mechanisms by which nuclear position may influence cellular processes and disease pathogenesis. Additional investigation is needed to understand how the nucleus affects these processes and to separate direct from indirect effects of its positioning. Future basic research on nuclear positioning and how it affects cellular processes is likely to significantly impact public health.

ACKNOWLEDGMENTS

We thank Susumu Antoku, Wakam Chang, Edgar Gomes, Gant Luxton, and Alex Palazzo for their comments and Wakam Chang for Figures 1B and 2A. The authors are supported by NIH grants R01GM099481, R01NS059352, R01HD070713, and R01AR048997.

REFERENCES

Adames, N.R., and Cooper, J.A. (2000). Microtubule interactions with the cell cortex causing nuclear movements in Saccharomyces cerevisiae. J. Cell Biol. 149, 863–874.
Attali, R., Warwar, N., Israel, A., Gurt, I., McNally, E., Puckelwartz, M., Glick, B., Nevo, Y., Ben-Neriah, Z., and Melki, J. (2009). Mutation of SYNE-1, encoding an essential component of the nuclear lamina, is responsible for autosomal recessive arthrogryposis. Hum. Mol. Genet. 18, 3462–3469.
Barnhart, E.L., Allen, G.M., Jülicher, F., and Theriot, J.A. (2010). Bipedal locomotion in crawling cells. Biophys. J. 98, 933–942.
Beadle, C., Assanah, M.C., Monzo, P., Vallee, R., Rosenfeld, S.S., and Canoll, P. (2008). The role of myosin II in glioma invasion of the brain. Mol. Biol. Cell 19, 3357–3368.
Bione, S., Maestrini, E., Rivella, S., Mancini, M., Regis, S., Romeo, G., and Toniolo, D. (1994). Identification of a novel X-linked gene responsible for Emery-Dreifuss muscular dystrophy. Nat. Genet. 8, 323–327.
Bonne, G., Di Barletta, M.R., Varnous, S., Bécane, H.M., Hammouda, E.H., Merlini, L., Muntoni, F., Greenberg, C.R., Gary, F., Urtizberea, J.A., et al. (1999). Mutations in the gene encoding lamin A/C cause autosomal dominant Emery-Dreifuss muscular dystrophy. Nat. Genet. 21, 285–288.
Borrego-Pinto, J., Jegou, T., Osorio, D.S., Auradé, F., Gorjánácz, M., Koch, B., Mattaj, I.W., and Gomes, E.R. (2012). Samp1 is a component of TAN lines and is required for nuclear movement. J. Cell Sci. 125, 1099–1105.
Broers, J.L., Peeters, E.A., Kuijpers, H.J., Endert, J., Bouten, C.V., Oomens, C.W., Baaijens, F.P., and Ramaekers, F.C. (2004). Decreased mechanical stiffness in LMNA-/- cells is caused by defective nucleo-cytoskeletal integrity: implications for the development of laminopathies. Hum. Mol. Genet. 13, 2567–2580.
Buchman, J.J., and Tsai, L.H. (2008). Putting a notch in our understanding of nuclear migration. Cell 134, 912–914.
Calvo, F., Agudo-Ibáñez, L., and Crespo, P. (2010). The Ras-ERK pathway: understanding site-specific signaling provides hope of new anti-tumor therapies. Bioessays 32, 412–421.
Cappello, S., Attardo, A., Wu, X., Iwasato, T., Itohara, S., Wilsch-Bräuninger, M., Eilken, H.M., Rieger, M.A., Schroeder, T.T., Huttner, W.B., et al. (2006). The Rho-GTPase cdc42 regulates neural progenitor fate at the apical surface. Nat. Neurosci. 9, 1099–1107.
Coffinier, C., Chang, S.Y., Nobumori, C., Tu, Y., Farber, E.A., Toth, J.I., Fong, L.G., and Young, S.G. (2010). Abnormal development of the cerebral cortex and cerebellum in the setting of lamin B2 deficiency. Proc. Natl. Acad. Sci. USA 107, 5076–5081.
Coffinier, C., Jung, H.J., Nobumori, C., Chang, S., Tu, Y., Barnes, R.H., 2nd, Yoshinaga, Y., de Jong, P.J., Vergnes, L., Reue, K., et al. (2011). Deficiencies in lamin B1 and lamin B2 cause neurodevelopmental defects and distinct nuclear shape abnormalities in neurons. Mol. Biol. Cell 22, 4683–4693.
Crisp, M., Liu, Q., Roux, K., Rattner, J.B., Shanahan, C., Burke, B., Stahl, P.D., and Hodzic, D. (2006). Coupling of the nucleus and cytoplasm: role of the LINC complex. J. Cell Biol. 172, 41–53.
Dahl, K.N., Kahn, S.M., Wilson, K.L., and Discher, D.E. (2004). The nuclear envelope lamina network has elasticity and a compressibility limit suggestive of a molecular shock absorber. J. Cell Sci. 117, 4779–4786.
Dauer, W.T., and Worman, H.J. (2009). The nuclear envelope as a signaling node in development and disease. Dev. Cell 17, 626–638.
Del Bene, F., Wehman, A.M., Link, B.A., and Baier, H. (2008). Regulation of neurogenesis by interkinetic nuclear migration through an apical-basal notch gradient. Cell 134, 1055–1065.
Desai, R.A., Gao, L., Raghavan, S., Liu, W.F., and Chen, C.S. (2009). Cell polarity triggered by cell-cell adhesion via E-cadherin. J. Cell Sci. 122, 905–911.
Dupin, I., Sakamoto, Y., and Etienne-Manneville, S. (2011). Cytoplasmic intermediate filaments mediate actin-driven positioning of the nucleus. J. Cell Sci. 124, 865–872.
Fan, S.S., and Ready, D.F. (1997). Glued participates in distinct microtubule-based activities in Drosophila eye development. Development 124, 1497–1507.
Folker, E.S., Östlund, C., Luxton, G.W., Worman, H.J., and Gundersen, G.G. (2011). Lamin A variants that cause striated muscle disease are defective in anchoring transmembrane actin-associated nuclear lines for nuclear movement. Proc. Natl. Acad. Sci. USA 108, 131–136.
Folker, E.S., Schulman, V.K., and Baylies, M.K. (2012). Muscle length and myonuclear position are independently regulated by distinct Dynein pathways. Development 139, 3827–3837.
Fridolfsson, H.N., and Starr, D.A. (2010). Kinesin-1 and dynein at the nuclear envelope mediate the bidirectional migrations of nuclei. J. Cell Biol. 191, 115–128.
Fridolfsson, H.N., Ly, N., Meyerzon, M., and Starr, D.A. (2010). UNC-83 coordinates kinesin-1 and dynein activities at the nuclear envelope during nuclear migration. Dev. Biol. 338, 237–250.
Friedl, P., Wolf, K., and Lammerding, J. (2011). Nuclear mechanics during cell migration. Curr. Opin. Cell Biol. 23, 55–64.
Gladfelter, A., and Berman, J. (2009). Dancing genomes: fungal nuclear positioning. Nat. Rev. Microbiol. 7, 875–886.
Gleeson, J.G., Allen, K.M., Fox, J.W., Lamperti, E.D., Berkovic, S., Scheffer, I., Cooper, E.C., Dobyns, W.B., Minnerath, S.R., Ross, M.E., and Walsh, C.A. (1998). Doublecortin, a brain-specific gene mutated in human X-linked lissencephaly and double cortex syndrome, encodes a putative signaling protein. Cell 92, 63–72.
Godin, J.D., Thomas, N., Laguesse, S., Malinouskaya, L., Close, P., Malaise, O., Purnelle, A., Raineteau, O., Campbell, K., Fero, M., et al. (2012). p27(Kip1) is a microtubule-associated protein that promotes microtubule polymerization during neuron migration. Dev. Cell 23, 729–744.
Goetz, S.C., and Anderson, K.V. (2010). The primary cilium: a signalling centre during vertebrate development. Nat. Rev. Genet. 11, 331–344.
Gomes, E.R., Jani, S., and Gundersen, G.G. (2005). Nuclear movement regulated by Cdc42, MRCK, myosin, and actin flow establishes MTOC polarization in migrating cells. Cell 121, 451–463.
Gönczy, P., Pichler, S., Kirkham, M., and Hyman, A.A. (1999). Cytoplasmic dynein is required for distinct aspects of MTOC positioning, including centrosome separation, in the one cell stage Caenorhabditis elegans embryo. J. Cell Biol. 147, 135–150.
González, J.M., Navarro-Puche, A., Casar, B., Crespo, P., and Andrés, V. (2008). Fast regulation of AP-1 activity through interaction of lamin A/C, ERK1/2, and c-Fos at the nuclear envelope. J. Cell Biol. 183, 653–666.
Grill, S.W., Howard, J., Schäffer, E., Stelzer, E.H., and Hyman, A.A. (2003). The distribution of active force generators controls mitotic spindle position. Science 301, 518–521.
Gros-Louis, F., Dupré, N., Dion, P., Fox, M.A., Laurent, S., Verreault, S., Sanes, J.R., Bouchard, J.P., and Rouleau, G.A. (2007). Mutations in SYNE1 lead to a newly discovered form of autosomal recessive cerebellar ataxia. Nat. Genet. 39, 80–85.
Hale, C.M., Shrestha, A.L., Khatau, S.B., Stewart-Hutchinson, P.J., Hernandez, L., Stewart, C.L., Hodzic, D., and Wirtz, D. (2008). Dysfunctional connections between the nucleus and the actin and microtubule networks in laminopathic models. Biophys. J. 95, 5462–5475.
Haque, F., Lloyd, D.J., Smallwood, D.T., Dent, C.L., Shanahan, C.M., Fry, A.M., Trembath, R.C., and Shackleton, S. (2006). SUN1 interacts with nuclear lamin A and cytoplasmic nesprins to provide a physical connection between the nuclear lamina and the cytoskeleton. Mol. Cell. Biol. 26, 3738–3751.
Horn, H.F., Brownstein, Z., Lenz, D.R., Shivatzki, S., Dror, A.A., Dagan-Rosenfeld, O., Friedman, L.M., Roux, K.J., Kozlov, S., Jeang, K.-T., et al. (2013). The LINC complex is essential for hearing. J. Clin. Invest. 123, 740–750.
Ji, J.Y., Lee, R.T., Vergnes, L., Fong, L.G., Stewart, C.L., Reue, K., Young, S.G., Zhang, Q., Shanahan, C.M., and Lammerding, J. (2007). Cell nuclei spin in the absence of lamin B1. J. Biol. Chem. 282, 20015–20026.
Keays, D.A., Tian, G., Poirier, K., Huang, G.J., Siebold, C., Cleak, J., Oliver, P.L., Fray, M., Harvey, R.J., Molnár, Z., et al. (2007). Mutations in alpha-tubulin cause abnormal neuronal migration in mice and lissencephaly in humans. Cell 128, 45–57.
Khatau, S.B., Hale, C.M., Stewart-Hutchinson, P.J., Patel, M.S., Stewart, C.L., Searson, P.C., Hodzic, D., and Wirtz, D. (2009). A perinuclear actin cap regulates nuclear shape. Proc. Natl. Acad. Sci. USA 106, 19017–19022.
King, M.C., Drivas, T.G., and Blobel, G. (2008). A network of nuclear envelope membrane proteins linking centromeres to microtubules. Cell 134, 427–438.
Koizumi, H., Higginbotham, H., Poon, T., Tanaka, T., Brinkman, B.C., and Gleeson, J.G. (2006). Doublecortin maintains bipolar shape and nuclear translocation during migration in the adult forebrain. Nat. Neurosci. 9, 779–786.
Kracklauer, M.P., Banks, S.M., Xie, X., Wu, Y., and Fischer, J.A. (2007). Drosophila klaroid encodes a SUN domain protein required for Klarsicht localization to the nuclear envelope and nuclear migration in the eye. Fly (Austin) 1, 75–85.
Lammerding, J., Schulze, P.C., Takahashi, T., Kozlov, S., Sullivan, T., Kamm, R.D., Stewart, C.L., and Lee, R.T. (2004). Lamin A/C deficiency causes defective nuclear mechanics and mechanotransduction. J. Clin. Invest. 113, 370–378.
Leckband, D.E., le Duc, Q., Wang, N., and de Rooij, J. (2011). Mechanotransduction at cadherin-mediated adhesions. Curr. Opin. Cell Biol. 23, 523–530.
Lee, J.S., Hale, C.M., Panorchan, P., Khatau, S.B., George, J.P., Tseng, Y., Stewart, C.L., Hodzic, D., and Wirtz, D. (2007). Nuclear lamin A/C deficiency induces defects in cell mechanics, polarization, and migration. Biophys. J. 93, 2542–2552.
Lei, K., Zhang, X., Ding, X., Guo, X., Chen, M., Zhu, B., Xu, T., Zhuang, Y., Xu, R., and Han, M. (2009). SUN1 and SUN2 play critical but partially redundant roles in anchoring nuclei in skeletal muscle cells in mice. Proc. Natl. Acad. Sci. USA 106, 10207–10212.
Levy, J.R., and Holzbaur, E.L. (2008). Dynein drives nuclear rotation during forward progression of motile fibroblasts. J. Cell Sci. 121, 3187–3195.
Liu, Q., Pante, N., Misteli, T., Elsagga, M., Crisp, M., Hodzic, D., Burke, B., and Roux, K.J. (2007). Functional association of Sun1 with nuclear pore complexes. J. Cell Biol. 178, 785–798.
Luxton, G.W., and Gundersen, G.G. (2011). Orientation and function of the nuclear-centrosomal axis during cell migration. Curr. Opin. Cell Biol. 23, 579–588.
Luxton, G.W., Gomes, E.R., Folker, E.S., Vintinner, E., and Gundersen, G.G. (2010). Linear arrays of nuclear envelope proteins harness retrograde actin flow for nuclear movement. Science 329, 956–959.
Luxton, G.W., Gomes, E.R., Folker, E.S., Worman, H.J., and Gundersen, G.G. (2011). TAN lines: a novel nuclear envelope structure involved in nuclear positioning. Nucleus 2, 173–181.
Malone, C.J., Misner, L., Le Bot, N., Tsai, M.C., Campbell, J.M., Ahringer, J., and White, J.G. (2003). The C. elegans hook protein, ZYG-12, mediates the essential attachment between the centrosome and nucleus. Cell 115, 825–836.
Maniotis, A.J., Chen, C.S., and Ingber, D.E. (1997). Demonstration of mechanical connections between integrins, cytoskeletal filaments, and nucleoplasm that stabilize nuclear structure. Proc. Natl. Acad. Sci. USA 94, 849–854.
McKenney, R.J., Vershinin, M., Kunwar, A., Vallee, R.B., and Gross, S.P. (2010). LIS1 and NudE induce a persistent dynein force-producing state. Cell 141, 304–314.

Cell 152, March 14, 2013 2013 Elsevier Inc. 1387

McNiven, M.A. (2013). Breaking away: matrix remodeling from the leading edge. Trends Cell Biol. 23, 1621. Metzger, T., Gache, V., Xu, M., Cadot, B., Folker, E.S., Richardson, B.E., Gomes, E.R., and Baylies, M.K. (2012). MAP and kinesin-dependent nuclear positioning is required for skeletal muscle function. Nature 484, 120124. Mislow, J.M., Holaska, J.M., Kim, M.S., Lee, K.K., Segura-Totten, M., Wilson, K.L., and McNally, E.M. (2002). Nesprin-1alpha self-associates and binds directly to emerin and lamin A in vitro. FEBS Lett. 525, 135140. Morimoto, A., Shibuya, H., Zhu, X., Kim, J., Ishiguro, K., Han, M., and Watanabe, Y. (2012). A conserved KASH domain protein associates with telomeres, SUN1, and dynactin during mammalian meiosis. J. Cell Biol. 198, 165172. Morris, N.R., Emov, V.P., and Xiang, X. (1998). Nuclear migration, nucleokinesis and lissencephaly. Trends Cell Biol. 8, 467470. Mosley-Bishop, K.L., Li, Q., Patterson, L., and Fischer, J.A. (1999). Molecular analysis of the klarsicht gene and its role in nuclear migration within differentiating cells of the Drosophila eye. Curr. Biol. 9, 12111220. Muchir, A., Pavlidis, P., Decostre, V., Herron, A.J., Arimura, T., Bonne, G., and Worman, H.J. (2007). Activation of MAPK pathways links LMNA mutations to cardiomyopathy in Emery-Dreifuss muscular dystrophy. J. Clin. Invest. 117, 12821293. Muchir, A., Wu, W., and Worman, H.J. (2009). Reduced expression of A-type lamins and emerin activates extracellular signal-regulated kinase in cultured cells. Biochim. Biophys. Acta 1792, 7581. pez-Sa nchez, J., and Frade, J.M. (2002). InterkiMurciano, A., Zamora, J., Lo netic nuclear movement may provide spatial clues to the regulation of neurogenesis. Mol. Cell. Neurosci. 21, 285300. Nery, F.C., Zeng, J., Niland, B.P., Hewett, J., Farley, J., Irimia, D., Li, Y., Wiche, G., Sonnenberg, A., and Breakeeld, X.O. (2008). TorsinA binds the KASH domain of nesprins and participates in linkage between nuclear envelope and cytoskeleton. J. 
Cell Sci. 121, 34763486. Norden, C., Young, S., Link, B.A., and Harris, W.A. (2009). Actomyosin is the main driver of interkinetic nuclear migration in the retina. Cell 138, 11951208. Osmani, N., Vitale, N., Borg, J.P., and Etienne-Manneville, S. (2006). Scrib controls Cdc42 localization and activity to promote cell polarization during astrocyte migration. Curr. Biol. 16, 23952405. stlund, C., Folker, E.S., Choi, J.C., Gomes, E.R., Gundersen, G.G., and WorO man, H.J. (2009). Dynamics and molecular interactions of linker of nucleoskeleton and cytoskeleton (LINC) complex proteins. J. Cell Sci. 122, 40994108. Padiath, Q.S., Saigoh, K., Schiffmann, R., Asahara, H., Yamada, T., Koeppen, cek, L.J., and Fu, Y.H. (2006). Lamin B1 duplications cause A., Hogan, K., Pta autosomal dominant leukodystrophy. Nat. Genet. 38, 11141123. Palazzo, A.F., Joseph, H.L., Chen, Y.J., Dujardin, D.L., Alberts, A.S., Pster, K.K., Vallee, R.B., and Gundersen, G.G. (2001). Cdc42, dynein, and dynactin regulate MTOC reorientation independent of Rho-regulated microtubule stabilization. Curr. Biol. 11, 15361541. Parsons, J.T., Horwitz, A.R., and Schwartz, M.A. (2010). Cell adhesion: integrating cytoskeletal dynamics and cellular tension. Nat. Rev. Mol. Cell Biol. 11, 633643. Patterson, K., Molofsky, A.B., Robinson, C., Acosta, S., Cater, C., and Fischer, J.A. (2004). The functions of Klarsicht and nuclear lamin in developmentally regulated nuclear migrations of photoreceptor cells in the Drosophila eye. Mol. Biol. Cell 15, 600610. Razafsky, D., and Hodzic, D. (2009). Bringing KASH under the SUN: the many faces of nucleo-cytoskeletal connections. J. Cell Biol. 186, 461472. Reiner, O., Carrozzo, R., Shen, Y., Wehnert, M., Faustinella, F., Dobyns, W.B., Caskey, C.T., and Ledbetter, D.H. (1993). Isolation of a Miller-Dieker lissencephaly gene containing G protein beta-subunit-like repeats. Nature 364, 717721. nczy, P. (1998). Mechanisms of nuclear positioning. J. Cell Reinsch, S., and Go Sci. 
111, 22832295.

Roux, K.J., Crisp, M.L., Liu, Q., Kim, D., Kozlov, S., Stewart, C.L., and Burke, B. (2009). Nesprin 4 is an outer nuclear membrane protein that can induce kinesin-mediated cell polarization. Proc. Natl. Acad. Sci. USA 106, 21942199. Salpingidou, G., Smertenko, A., Hausmanowa-Petrucewicz, I., Hussey, P.J., and Hutchison, C.J. (2007). A novel role for the nuclear membrane protein emerin in association of the centrosome to the outer nuclear membrane. J. Cell Biol. 178, 897904. Schaar, B.T., and McConnell, S.K. (2005). Cytoskeletal coordination during neuronal migration. Proc. Natl. Acad. Sci. USA 102, 1365213657. uninger, M., Calegari, F., and Huttner, W.B. (2009). Schenk, J., Wilsch-Bra Myosin II is required for interkinetic nuclear migration of neural progenitors. Proc. Natl. Acad. Sci. USA 106, 1648716492. Schmoranzer, J., Fawcett, J.P., Segura, M., Tan, S., Vallee, R.B., Pawson, T., and Gundersen, G.G. (2009). Par3 and dynein associate to regulate local microtubule dynamics and centrosome orientation during migration. Curr. Biol. 19, 10651074. Schreiber, K.H., and Kennedy, B.K. (2013). When lamins go bad: Nucleur structure and disease. Cell 152, this issue, 13651375. Shu, T., Ayala, R., Nguyen, M.D., Xie, Z., Gleeson, J.G., and Tsai, L.H. (2004). Ndel1 operates in a common pathway with LIS1 and cytoplasmic dynein to regulate cortical neuronal positioning. Neuron 44, 263277. Solecki, D.J., Model, L., Gaetz, J., Kapoor, T.M., and Hatten, M.E. (2004). Par6alpha signaling controls glial-guided neuronal migration. Nat. Neurosci. 7, 11951203. Solecki, D.J., Trivedi, N., Govek, E.E., Kerekes, R.A., Gleason, S.S., and Hatten, M.E. (2009). Myosin II motors and F-actin dynamics drive the coordinated movement of the centrosome and soma during CNS glial-guided neuronal migration. Neuron 63, 6380. Sosa, B.A., Rothballer, A., Kutay, U., and Schwartz, T.U. (2012). LINC complexes form by binding of three KASH peptides to domain interfaces of trimeric SUN proteins. 
Cell 149, 10351047. Splinter, D., Tanenbaum, M.E., Lindqvist, A., Jaarsma, D., Flotho, A., Yu, K.L., Grigoriev, I., Engelsma, D., Haasdijk, E.D., Keijzer, N., et al. (2010). Bicaudal D2, dynein, and kinesin-1 associate with nuclear pore complexes and regulate centrosome and nuclear positioning during mitotic entry. PLoS Biol. 8, e1000350. Starr, D.A., and Han, M. (2002). Role of ANC-1 in tethering nuclei to the actin cytoskeleton. Science 298, 406409. Starr, D.A., and Fridolfsson, H.N. (2010). Interactions between nuclei and the cytoskeleton are mediated by SUN-KASH nuclear-envelope bridges. Annu. Rev. Cell Dev. Biol. 26, 421444. Sullivan, T., Escalante-Alcalde, D., Bhatt, H., Anver, M., Bhat, N., Nagashima, K., Stewart, C.L., and Burke, B. (1999). Loss of A-type lamin expression compromises nuclear envelope integrity leading to muscular dystrophy. J. Cell Biol. 147, 913920. Swan, A., Nguyen, T., and Suter, B. (1999). Drosophila Lissencephaly-1 functions with Bic-D and dynein in oocyte determination and nuclear positioning. Nat. Cell Biol. 1, 444449. Tanabe, L.M., Kim, C.E., Alagem, N., and Dauer, W.T. (2009). Primary dystonia: molecules and mechanisms. Nat. Rev. Neurol. 5, 598609. Tassin, A.M., Maro, B., and Bornens, M. (1985). Fate of microtubule-organizing centers during myogenesis in vitro. J. Cell Biol. 100, 3546. Taverna, E., and Huttner, W.B. (2010). Neural progenitor nuclei IN motion. Neuron 67, 906914. Tomlinson, A., and Ready, D.F. (1986). Sevenless: a cell-specic homeotic mutation of the Drosophila eye. Science 231, 400402. , S., and Chang, F. (2001). A mechanism Tran, P.T., Marsh, L., Doye, V., Inoue for nuclear positioning in ssion yeast based on microtubule pushing. J. Cell Biol. 153, 397411. Tsai, L.H., and Gleeson, J.G. (2005). Nucleokinesis in neuronal migration. Neuron 46, 383388.

1388 Cell 152, March 14, 2013 2013 Elsevier Inc.

Tsai, F.C., and Meyer, T. (2012). Ca2+ pulses control local cycles of lamellipodia retraction and adhesion along the front of migrating cells. Curr. Biol. 22, 837842. Tsai, J.W., Chen, Y., Kriegstein, A.R., and Vallee, R.B. (2005). LIS1 RNA interference blocks neural stem cell division, morphogenesis, and motility at multiple stages. J. Cell Biol. 170, 935945. Tsai, J.W., Bremner, K.H., and Vallee, R.B. (2007). Dual subcellular roles for LIS1 and dynein in radial neuronal migration in live brain tissue. Nat. Neurosci. 10, 970979. Tsai, J.W., Lian, W.N., Kemal, S., Kriegstein, A.R., and Vallee, R.B. (2010). Kinesin 3 and cytoplasmic dynein mediate interkinetic nuclear migration in neural stem cells. Nat. Neurosci. 13, 14631471. Tsujikawa, M., Omori, Y., Biyanwila, J., and Malicki, J. (2007). Mechanism of positioning the cell nucleus in vertebrate photoreceptors. Proc. Natl. Acad. Sci. USA 104, 1481914824. Warren, D.T., Tajsic, T., Mellad, J.A., Searles, R., Zhang, Q., and Shanahan, C.M. (2010). Novel nuclear nesprin-2 variants tether active extracellular signal-regulated MAPK1 and MAPK2 at promyelocytic leukemia protein nuclear bodies and act to regulate smooth muscle cell proliferation. J. Biol. Chem. 285, 13111320. Welte, M.A. (2004). Bidirectional transport along microtubules. Curr. Biol. 14, R525R537. Wilhelmsen, K., Litjens, S.H., Kuikman, I., Tshimbalanga, N., Janssen, H., van den Bout, I., Raymond, K., and Sonnenberg, A. (2005). Nesprin-3, a novel outer nuclear membrane protein, associates with the cytoskeletal linker protein plectin. J. Cell Biol. 171, 799810. Wilkie, G.S., Korfali, N., Swanson, S.K., Malik, P., Srsen, V., Batrakou, D.G., de las Heras, J., Zuleger, N., Kerr, A.R., Florens, L., et al. (2011). Several novel nuclear envelope transmembrane proteins identied in skeletal muscle have cytoskeletal associations. Mol. Cell. Proteomics, 10, M110.003129. Xie, Z., Moy, L.Y., Sanada, K., Zhou, Y., Buchman, J.J., and Tsai, L.H. (2007). 
Cep120 and TACCs control interkinetic nuclear migration and the neural progenitor pool. Neuron 56, 7993.

Yu, J., Lei, K., Zhou, M., Craft, C.M., Xu, G., Xu, T., Zhuang, Y., Xu, R., and Han, M. (2011). KASH protein Syne-2/Nesprin-2 and SUN proteins SUN1/2 mediate nuclear migration during mammalian retinal development. Hum. Mol. Genet. 20, 10611073. Zhang, Q., Skepper, J.N., Yang, F., Davies, J.D., Hegyi, L., Roberts, R.G., Weissberg, P.L., Ellis, J.A., and Shanahan, C.M. (2001). Nesprins: a novel family of spectrin-repeat-containing proteins that localize to the nuclear membrane in multiple tissues. J. Cell Sci. 114, 44854498. Zhang, Q., Ragnauth, C.D., Skepper, J.N., Worth, N.F., Warren, D.T., Roberts, R.G., Weissberg, P.L., Ellis, J.A., and Shanahan, C.M. (2005). Nesprin-2 is a multi-isomeric protein that binds lamin and emerin at the nuclear envelope and forms a subcellular network in skeletal muscle. J. Cell Sci. 118, 673687. Zhang, Q., Bethmann, C., Worth, N.F., Davies, J.D., Wasner, C., Feuer, A., Ragnauth, C.D., Yi, Q., Mellad, J.A., Warren, D.T., et al. (2007a). Nesprin-1 and -2 are involved in the pathogenesis of Emery Dreifuss muscular dystrophy and are critical for nuclear envelope integrity. Hum. Mol. Genet. 16, 2816 2833. Zhang, X., Xu, R., Zhu, B., Yang, X., Ding, X., Duan, S., Xu, T., Zhuang, Y., and Han, M. (2007b). Syne-1 and Syne-2 play crucial roles in myonuclear anchorage and motor neuron innervation. Development 134, 901908. Zhang, X., Lei, K., Yuan, X., Wu, X., Zhuang, Y., Xu, T., Xu, R., and Han, M. (2009). SUN1/2 and Syne/Nesprin-1/2 complexes connect centrosome to the nucleus during neurogenesis and neuronal migration in mice. Neuron 64, 173187. Zhao, T., Graham, O.S., Raposo, A., and St Johnston, D. (2012). Growing microtubules push the oocyte nucleus to polarize the Drosophila dorsalventral axis. Science 336, 9991003. Zhou, X., Graumann, K., Evans, D.E., and Meier, I. (2012a). Novel plant SUNKASH bridges are involved in RanGAP anchoring and nuclear shape determination. J. Cell Biol. 196, 203211. 
Zhou, Z., Du, X., Cai, Z., Song, X., Zhang, H., Mizuno, T., Suzuki, E., Yee, M.R., Berezov, A., Murali, R., et al. (2012b). Structure of Sad1-UNC84 homology (SUN) domain denes features of molecular bridge in nuclear envelope. J. Biol. Chem. 287, 53175326.

Cell 152, March 14, 2013 2013 Elsevier Inc. 1389

SnapShot: Replication Timing


Benjamin D. Pope,1 Oscar M. Aparicio,2 and David M. Gilbert1
1 Department of Biological Science, Florida State University, Tallahassee, FL 32306, USA
2 Molecular and Computational Biology Program, University of Southern California, Los Angeles, CA 90089, USA

Figure panel, REPLICATION FOCI: images of replication foci in yeast and in mammals (scale bars: 5 µm, 1 µm).

Figure panel, REPLICATION DOMAINS: early-replicating domains lie in the nuclear interior and late-replicating domains at the nuclear lamina. Constitutively early domains: GC rich, gene rich, high nuclease sensitivity. Developmentally regulated domains: intermediate sequence composition, low nuclease sensitivity, dynamic chromatin marks, correlated with transcription. Constitutively late domains: AT rich, gene poor, low nuclease sensitivity. Key: prereplicative complex, replisomes, unreplicated DNA, replicated DNA, interdomain interaction, cis boundary.

Species                        Yeast         Mammals
Genome size                    12-14 Mb      2-4 Gb
S-phase length                 <1 hr         8-10 hr
Replication fork rate          1-2 kb/min    1-2 kb/min
Number of potential origins    500-1,000     >250,000
Number of replicons per S      100-200       25,000-50,000
Number of foci per S           15-30         5,000-10,000
Replicons/focus                4-7           6-20
Number of domains              40-70?        4,000-5,000
Domain size                    250 kb        400-800 kb

Figure panel, MAMMALS (cell cycle, M through G2): high chromatin mobility after mitosis; pre-RC assembly; timing decision point (TDP) in G1, after which repositioned domains remain anchored for the remainder of interphase; origin decision point (ODP); restriction point; early replication in early S; late replication in late S; timing determinants lost by G2; chromosome condensation.

Figure panel, YEAST: early origins draw on a limited pool of Cdc45. Cdc45 loading is stimulated by Fkh1/2 and inhibited by Rif1, Taz1, Yku70, or Rpd3 deacetylation. Cdc45-loaded origins initiate, and Rif1/Taz1/Yku70/Rpd3-mediated repression is removed. Terminating replisomes release Cdc45 to activate unfired origins. Key: prereplicative complex, Cdc45, Fkh1/2, Rif1/Taz1/Yku70 or Rpd3, replisome.
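The limiting-factor logic of the yeast panel (a fixed pool of Cdc45 that is used first at early origins and recycled when replisomes terminate) can be illustrated with a toy queue model. This is only a sketch under strong assumptions that are not in the SnapShot: every replicon takes the same time to finish, one unit of factor serves one origin, and the function name, parameters, and numbers are all hypothetical.

```python
import heapq

def simulate_firing(n_early, n_late, pool, replicon_time):
    """Toy model: firing time of each origin when a limiting factor is recycled.

    Assumptions (illustrative only): one unit of factor per origin, identical
    replicon durations, early origins always served before late origins.
    """
    if pool < 1:
        raise ValueError("pool must contain at least one unit of factor")
    # Early origins are queued first; late origins wait for recycled factor.
    queue = [("early", i) for i in range(n_early)]
    queue += [("late", i) for i in range(n_late)]
    active = []        # min-heap of termination times of running replicons
    firing_times = {}
    t = 0
    for origin in queue:
        if len(active) >= pool:
            # Pool exhausted: wait until the next replisome terminates
            # and releases its unit of factor.
            t = max(t, heapq.heappop(active))
        firing_times[origin] = t
        heapq.heappush(active, t + replicon_time)
    return firing_times

# Hypothetical numbers: 4 early origins, 2 late origins, 4 units of factor.
firing = simulate_firing(n_early=4, n_late=2, pool=4, replicon_time=10)
for origin, time in firing.items():
    print(origin, time)
```

In this sketch the late origins fire only once an early replicon has terminated, which is the qualitative behavior the panel describes; shrinking `pool` below the number of early origins also staggers the early origins themselves.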

Cell 152, March 14, 2013 ©2013 Elsevier Inc.

DOI http://dx.doi.org/10.1016/j.cell.2013.02.038

See online version for legend and references.


All organisms use similar principles to duplicate DNA at replication forks (Yao and O'Donnell, 2010). However, eukaryotic cells contain large chromosomes with hundreds to thousands of replication origins and complex, heterogeneous chromatin. Conserved cell-cycle and checkpoint mechanisms ensure one complete round of replication (Labib, 2010), and additional mechanisms coordinate initiation at the many replicons (regions replicated from a single origin) in space and time. A temporal order to genome replication balances replication with limiting cellular resources such as initiation factors and nucleotide pools (Aparicio, 2013; Rhind and Gilbert, 2013). Chromatin features regulate replication timing by controlling the access of initiation factors to replication origins. In fact, replication timing is one of the few cellular functions that are clearly regulated at the level of large-scale/long-range chromatin folding. In mammals, this temporal order is regulated during development and is linked to transcriptional regulation (Nordman and Orr-Weaver, 2012; Rhind and Gilbert, 2013).

Replication foci in budding and fission yeast are nuclear sites of active replication that can be visualized by fluorescently tagged replication fork proteins or nucleotide analogs (Kitamura et al., 2006; Meister et al., 2007). Foci (yellow) in the displayed image (originally published in Trends Cell Biol., December 2001) exhibit a pattern that is typical of early S in budding yeast. Chromatin is counterstained (red). These foci are mobile and frequently fuse with other foci or split to form new foci, making precise measurements of their characteristics difficult. The ratio of replication foci to active replication forks indicates that several closely spaced replicons are active simultaneously within each focus, although the organization can only be modeled imprecisely at this point.

Replication foci in mammals are more numerous than in yeast and, relative to the size of the nucleus, less mobile (Maya-Mendoza et al., 2010; Rhind and Gilbert, 2013). An average focus replicates ~1 Mb of DNA in 45-60 min. In the displayed image (originally published in Genome Res., June 2010), cells were dual labeled in successive pulses to visualize both early (green) and late (red) foci simultaneously, highlighting the spatial compartmentalization of chromatin replicated at different times during S phase. Foci labeled in one cell cycle are stable in appearance for many cycles, indicating that they are structural units and most likely the cytological equivalents of replication domains measured by molecular genomics methods.

Like foci, replication domains are chromosomal units that are replicated coordinately by synchronously firing clusters of replicons, which also approach megabase size in mammals. At least half of replication domains are regulated to replicate at different times in different tissues (Nordman and Orr-Weaver, 2012; Rhind and Gilbert, 2013). Replicons associated with the nuclear interior are replicated early in S phase, whereas those adjacent to the lamina replicate later. Domains that switch replication timing during differentiation move between subnuclear compartments, as indicated both by physical position and by changes in interdomain chromatin interactions (Takebayashi et al., 2012). Replication domain boundaries may insulate chromatin types from each other, facilitating the differential replication timing of adjacent chromatin domains.

In yeast, the extent to which clusters of origins form replication domains is controversial. At least four large (~250 kb) regions in budding yeast have distinctly late timing (McCune et al., 2008). In addition, each chromosome could be considered to contain several domains (the centromere, each arm, and the telomeres), possibly with a few additional subdomains.
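
The focus-level figure quoted above (about 1 Mb replicated in 45-60 min) can be cross-checked against the fork rate (1-2 kb/min) and replicons per focus (6-20) listed for mammals in the SnapShot table. The sketch below is a back-of-the-envelope calculation, not a model from the source: it assumes that all replicons in a focus fire at the same moment and that each replicon runs two forks until the focus is complete. The function and variable names are hypothetical.

```python
def focus_duration_minutes(focus_size_kb, replicons, fork_rate_kb_per_min,
                           forks_per_replicon=2):
    """Idealized time to replicate one focus.

    Assumes every replicon fires at time zero and each contributes
    `forks_per_replicon` forks moving at a constant rate (an assumption,
    not a statement from the SnapShot).
    """
    total_rate = replicons * forks_per_replicon * fork_rate_kb_per_min
    return focus_size_kb / total_rate

FOCUS_SIZE_KB = 1000  # ~1 Mb per focus, as quoted in the legend

# Extremes of the tabulated ranges for mammals.
fastest = focus_duration_minutes(FOCUS_SIZE_KB, replicons=20, fork_rate_kb_per_min=2)
slowest = focus_duration_minutes(FOCUS_SIZE_KB, replicons=6, fork_rate_kb_per_min=1)

print(f"idealized focus duration: {fastest:.1f} to {slowest:.1f} min")
```

Under these assumptions the idealized range brackets the quoted 45-60 min, so the tabulated fork rate and replicons per focus are at least compatible with the observed focus lifetime; asynchronous firing within a focus would lengthen the estimate.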

The range in number of potential origins reflects differences between budding (~500) and fission yeast (~1,000); origin efficiencies are generally higher in budding yeast, probably resulting in similar replicon numbers.

Exiting mitosis, the chromatin of mammalian cells is highly mobile and lacks determinants for replication timing. Within 1-2 hr, cells reach the timing decision point (TDP), when replication domains/foci anchor in their respective subnuclear positions for the remainder of interphase and simultaneously acquire the ability to dictate a replication-timing program (Rhind and Gilbert, 2013). Establishing this timing program occurs upstream of specifying which sites will be used for initiation (origin decision point) and the activation of S phase Cdk activity (restriction point). The replication-timing program is executed during S phase through the firing of several sequential groups of internally localized foci (green), followed by replication at the nuclear and nucleolar periphery (red) and, finally, a few sites of internally localized heterochromatin (not shown). Chromatin in G2 phase lacks determinants for replication timing, suggesting that such determinants are lost during replication.

Coincident with the TDP (between mitosis and start in yeast), Fkh1/2 and Rif1/Taz1/Yku70/Rpd3 organize early and late-replicating chromatin, respectively. Fkh1/2 bind consensus DNA elements near early replicating origins and, through interaction with the origin recognition complex (ORC) and/or through Fkh1/2 dimerization, bring the origins into proximity (Aparicio, 2013). Fkh1/2 stimulate Cdc45 recruitment and loading onto prereplication complexes (pre-RCs) at early origins during G1 phase, facilitating early initiation. Rif1/Taz1/Yku70 position and tether telomeres and some internal sequences to the nuclear periphery, whereas Rpd3 modifies chromatin, isolating late-replicating chromatin from Cdc45 (and other initiation factors).
The incorporation of the limited pool of Cdc45 (and other factors) into early replisomes delays firing of Rif1/Taz1/Yku70/Rpd3-repressed origins until termination of early replicons recycles Cdc45.

REFERENCES

Aparicio, O.M. (2013). Location, location, location: it's all in the timing for replication origins. Genes Dev. 27, 117-128.
Kitamura, E., Blow, J.J., and Tanaka, T.U. (2006). Live-cell imaging reveals replication of individual replicons in eukaryotic replication factories. Cell 125, 1297-1308.
Labib, K. (2010). How do Cdc7 and cyclin-dependent kinases trigger the initiation of chromosome replication in eukaryotic cells? Genes Dev. 24, 1208-1219.
Maya-Mendoza, A., Olivares-Chauvet, P., Shaw, A., and Jackson, D.A. (2010). S phase progression in human cells is dictated by the genetic continuity of DNA foci. PLoS Genet. 6, e1000900.
McCune, H.J., Danielson, L.S., Alvino, G.M., Collingwood, D., Delrow, J.J., Fangman, W.L., Brewer, B.J., and Raghuraman, M.K. (2008). The temporal program of chromosome replication: genomewide replication in clb5Delta Saccharomyces cerevisiae. Genetics 180, 1833-1847.
Meister, P., Taddei, A., Ponti, A., Baldacci, G., and Gasser, S.M. (2007). Replication foci dynamics: replication patterns are modulated by S-phase checkpoint kinases in fission yeast. EMBO J. 26, 1315-1326.
Nordman, J., and Orr-Weaver, T.L. (2012). Regulation of DNA replication during development. Development 139, 455-464.
Rhind, N., and Gilbert, D.M. (2013). Replication timing. In DNA Replication and Human Disease, S.D. Bell, M. Mechali, and M.L. DePamphilis, eds. (Cold Spring Harbor, NY: Cold Spring Harbor Laboratory Press). http://dx.doi.org/10.1101/cshperspect.a010132.
Takebayashi, S., Dileep, V., Ryba, T., Dennis, J.H., and Gilbert, D.M. (2012). Chromatin-interaction compartment switch at developmentally regulated chromosomal domains reveals an unusual principle of chromatin folding. Proc. Natl. Acad. Sci. USA 109, 12574-12579.
Yao, N.Y., and O'Donnell, M. (2010). SnapShot: The replisome. Cell 141, 1088-1088.e1.
