Serial Analysis of Gene Expression
Victor E. Velculescu, Lin Zhang, Bert Vogelstein, Kenneth W. Kinzler*
Science, New Series, Volume 270, Issue 5235 (Oct. 20, 1995), 484-487.
Stable URL: http://links.jstor.org/sici?sici=0036-8075%2819951020%293%3A270%3A5235%3C484%3ASAOGE%3E2.0.CO%3B2-8
Science is published by the American Association for the Advancement of Science. ©1995 American Association for the Advancement of Science.

The characteristics of an organism are determined by the genes expressed within it. A method was developed, called serial analysis of gene expression (SAGE), that allows the quantitative and simultaneous analysis of a large number of transcripts. To demonstrate this strategy, short diagnostic sequence tags were isolated from pancreas, concatenated, and cloned. Manual sequencing of 1000 tags revealed a gene expression pattern characteristic of pancreatic function.
New pancreatic transcripts corresponding to novel tags were identified. SAGE should provide a broadly applicable means for the quantitative cataloging and comparison of expressed genes in a variety of normal, developmental, and disease states.

Determination of the genomic sequence of higher organisms, including humans, is now a real and attainable goal. However, this analysis represents only one level of genetic complexity. The ordered and timely expression of this information represents another level of complexity equally important to the definition and biology of the organism.

Techniques based on complementary DNA (cDNA) subtraction or differential display can be quite useful for comparing gene expression differences between two cell types (1), but provide only a partial picture, with no direct information about abundance. The expressed sequence tag (EST) approach is a valuable tool for gene discovery (2), but like RNA blotting, ribonuclease (RNase) protection, and reverse transcriptase-polymerase chain reaction (RT-PCR) analysis (3), it evaluates only a limited number of genes at a time. Here we describe the serial analysis of gene expression (SAGE), a technique that allows a rapid, detailed analysis of thousands of transcripts.

SAGE is based on two principles. First, a short nucleotide sequence tag [9 to 10 base pairs (bp)] contains sufficient information to uniquely identify a transcript, provided it is isolated from a defined position within the transcript. For example, a sequence as short as 9 bp can distinguish 262,144 transcripts (4^9) given a random nucleotide distribution at the tag site, whereas current estimates suggest that even the human genome encodes only about 80,000 transcripts (4). Second, concatenation of short sequence tags allows the efficient analysis of transcripts in a serial manner by the sequencing of multiple tags within a single clone.
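The tag-capacity arithmetic in the first principle can be checked with a few lines of Python. This is only an illustrative sketch: the function name `tag_capacity` is hypothetical, and the calculation assumes, as the text does, a uniform random nucleotide distribution at the tag site.

```python
def tag_capacity(tag_length_bp: int, alphabet_size: int = 4) -> int:
    """Number of distinct sequences of the given length over A, C, G, T."""
    return alphabet_size ** tag_length_bp


# Figures quoted in the text: 9-bp tags versus an estimated 80,000 transcripts.
ESTIMATED_HUMAN_TRANSCRIPTS = 80_000

capacity_9bp = tag_capacity(9)
print(capacity_9bp)                                # 262144
print(capacity_9bp / ESTIMATED_HUMAN_TRANSCRIPTS)  # about 3.3 possible tags per transcript
```

Under these assumptions a 9-bp tag offers roughly three times as many distinct sequences as there are estimated transcripts, and each additional base multiplies the capacity by four.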
As with serial communication by computers, wherein information is transmitted as a continuous string of data, serial analysis of the sequence tags requires a means to establish the register and boundaries of each tag.

Figure 1 shows how these principles were implemented for the analysis of mRNA expression. Double-stranded cDNA was synthesized from mRNA by means of a biotinylated oligo(dT) primer. The cDNA was then cleaved with a restriction endonuclease (anchoring enzyme) that would be expected to cleave most transcripts at least once. Typically, restriction endonucleases with 4-bp recognition sites were used for this purpose because they cleave every 256 bp (4^4) on average, whereas most transcripts are considerably larger. The most 3' portion of the cleaved cDNA was then isolated by binding to streptavidin beads. This process provides a unique site on each transcript that corresponds to the restriction site located closest to the polyadenylate [poly(A)] tail. The cDNA was then divided in half and ligated via the anchoring restriction site to one of two linkers containing a type IIS restriction site (tagging enzyme). Type IIS restriction endonucleases cleave at a defined distance up to 20 bp away from their asymmetric recognition sites (5). The linkers are designed so that cleavage of the ligation products with the tagging enzyme results in release of the linker with a short piece of the cDNA.

V. E. Velculescu and K. W. Kinzler, Oncology Center and the Program in Human Genetics and Molecular Biology, Johns Hopkins University, Baltimore, MD 21231, USA. L. Zhang and B. Vogelstein, Howard Hughes Medical Institute, Johns Hopkins Oncology Center, Baltimore, MD 21231, USA.

[Fig. 1 step labels: cleave with anchoring enzyme (AE), bind to streptavidin beads; divide in half, ligate to linkers (A + B); cleave with tagging enzyme (TE), blunt end; ligate and amplify with primers A and B; cleave with anchoring enzyme, isolate ditags, concatenate and clone.]
Fig. 1. Schematic of SAGE. The anchoring enzyme is Nla III and the tagging enzyme is Fok I. Sequences colored red and green represent primer-derived sequences, whereas blue represents transcript-derived sequences, with X and O indicating nucleotides of different tags. See text for further explanation.

For example, Fig. 1 shows a combination of anchoring enzyme and tagging enzyme that would yield a 9-bp tag. After blunt ends were created, the two pools of released tags were ligated to each other. Ligated tags then served as templates for polymerase chain reaction (PCR) amplification with primers specific to each linker. This step served several purposes in addition to allowing amplification of the tag sequences. First, it provided for orientation and punctuation of the tag sequence in a very compact manner. The resulting amplification products contained two tags (one ditag) linked tail to tail, flanked by sites for the anchoring enzyme. In the final sequencing template, this resulted in 4 bp of punctuation per ditag. Second and most importantly, the analysis of ditags, formed before any amplification steps, provided a means to completely eliminate potential distortions introduced by PCR. Because the probability of any two tags being coupled in the same ditag is small, even for abundant transcripts, repeated ditags potentially produced by biased PCR could be excluded from analysis without substantially altering the final results. Cleavage of the PCR product with the anchoring enzyme allowed isolation of ditags that could then be concatenated by ligation, cloned, and sequenced.

As a demonstration of this approach, SAGE was used to characterize gene expression in the human pancreas. We chose Nla III as the anchoring enzyme and Bsm FI as the tagging enzyme, yielding a 9-bp tag (6).
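The punctuation and repeated-ditag logic described above can be sketched in Python. This is a simplified illustration, not the SAGE software referred to in the report. It assumes that every ditag is exactly 18 bp (two 9-bp tags joined tail to tail), that the Nla III site CATG never arises inside a ditag, and that reads contain only A, C, G, and T. The names `count_tags` and `reverse_complement` and the example tag strings are hypothetical.

```python
from collections import Counter

ANCHOR_SITE = "CATG"  # Nla III recognition site: the 4 bp of punctuation per ditag
TAG_LENGTH = 9        # tag length used in the pancreas experiment

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def count_tags(concatemers):
    """Count tags across clone reads, using each distinct ditag only once."""
    seen_ditags = set()
    counts = Counter()
    for read in concatemers:
        for ditag in read.upper().split(ANCHOR_SITE):
            if len(ditag) != 2 * TAG_LENGTH:
                continue  # simplification: skip anything that is not an 18-bp ditag
            # A ditag and its reverse complement are the same molecule read
            # from opposite strands, so both map to one canonical key.
            key = min(ditag, reverse_complement(ditag))
            if key in seen_ditags:
                continue  # repeated ditag, possibly produced by biased PCR
            seen_ditags.add(key)
            # Tags are linked tail to tail: the second tag lies on the other strand.
            counts[ditag[:TAG_LENGTH]] += 1
            counts[reverse_complement(ditag[TAG_LENGTH:])] += 1
    return counts


# Illustrative read: three ditags, the first and third identical.
tag_a, tag_b, tag_c = "GAACACAAA", "CACGTTGGA", "AAGGTAACA"
ditag_ab = tag_a + reverse_complement(tag_b)
ditag_ac = tag_a + reverse_complement(tag_c)
read = ANCHOR_SITE + ANCHOR_SITE.join([ditag_ab, ditag_ac, ditag_ab]) + ANCHOR_SITE
print(count_tags([read]))
```

In this example `tag_a` is counted twice, once from each distinct ditag, while the repeated copy of the first ditag is ignored. That mirrors the argument in the text: two tags are unlikely to pair in the same ditag twice by chance, so discarding repeats removes PCR bias at little cost.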
Computer analysis of human transcripts from GenBank indicated that greater than 95% of tags of this length were likely to be unique and that inclusion of two additional bases provided little additional resolution (7). As outlined above, mRNA from human pancreas was used to generate ditags (8) that were cloned into a plasmid vector (9). Clones containing at least 10 tags (range 10 to >50) were identified by PCR amplification and manually sequenced (10). Table 1 shows the analysis of the first 1000 tags. Sixteen percent were eliminated because they either had sequence ambiguities or were derived from linker sequences. The remaining 840 tags included 351 tags that occurred once and 77 tags that were identified multiple times (Table 1). Nine of the 10 most abundant tags matched at least one entry in GenBank release 87 (Table 1). The remaining tag was subsequently shown to be derived from amylase (see below). All 10 transcripts were derived from genes of known pancreatic function, and their prevalence was consistent with previous analyses of pancreatic RNA through conventional approaches (11).

Table 1. Pancreatic SAGE tags. "Tag" indicates the 9-bp sequence identifying each tag, adjacent to the 4-bp anchoring Nla III site. "N" and "Percent" indicate the number of times the tag was identified and its frequency, respectively. "Gene" indicates the description and accession number of the GenBank release 87 entry found to exactly match the indicated tag when the SAGE software group was used, with the following exceptions. When multiple entries were identified because of duplicated entries (7), only one entry is listed. For chymotrypsinogen and trypsinogen 1, other genes (adenosine triphosphatase and myosin alkali light chain, respectively) were identified that were predicted to contain the same tags, but subsequent hybridization and sequence analysis identified the listed genes as the source of the tags. "Alu entry" indicates a match with a GenBank entry for a transcript that contained at least one copy of the Alu consensus sequence (5).

Tag           Gene                                                      N    Percent
GAGCACACC     Procarboxypeptidase A1 (X67318)                           64   7.6
[illegible]   Pancreatic trypsinogen 2 (M27602)                         46   5.5
GAACACAAA     Chymotrypsinogen (M24400)                                 37   4.4
TCAGGGTGA     Pancreatic trypsin 1 (M22612)                             31   3.7
[illegible]   Elastase IIIB ([illegible])                               20   2.4
[illegible]   Protease E ([illegible])                                  16   1.9
TCATTGGCC     Pancreatic lipase ([illegible])                           16   1.9
CCAGAGAGT     Procarboxypeptidase B ([illegible])                       14   1.7
TCCTCAAAA     No match (see Table 2, P1)                                14   1.7
[illegible]   Bile salt stimulated lipase ([illegible])                 12   1.4
[illegible]   No match                                                  11   1.3
[illegible]   No match (see Table 2, P2)                                 9   1.1
[illegible]   21 Alu entries                                             8   1.0
[illegible]   No match                                                   8   1.0
AAGGTAACA     Secretory trypsin inhibitor (M11949)                       6   0.7
[illegible]   No match                                                   5   0.6
[illegible]   No match                                                   5   0.6
[illegible]   [illegible], [illegible], 11 Alu entries                   5   0.6
CACGTTGGA     No match                                                   5   0.6
[illegible]   No match                                                   5   0.6
[illegible]   Elongation factor 2 (Z11692)                               5   0.6
ACGCAGGGA     No match (see Table 2, P3)                                 5   0.6
[illegible]   No match (see Table 2, P4)                                 5   0.6
[illegible]   No match                                                   4   0.5
[illegible]   No match                                                   4   0.5
[illegible]   NF-κB (X61499), Alu entry (S94541)                         4   0.5
[illegible]   TNF receptor II ([illegible]), Alu entry ([illegible])     4   0.5
GAAGACACA     No match                                                   4   0.5
[illegible]   Pancreatic mucin ([illegible])                             4   0.5
[illegible]   Mitochondrial cytochrome c oxidase ([illegible])           4   0.5

SAGE tags occurring:
  Greater than three times                                             380   45.2
  Three times (15 × 3)                                                  45    5.4
  Two times (32 × 2)                                                    64    7.6
  One time                                                             351   41.8
Total SAGE tags                                                        840  100.0

The quantitative nature of SAGE was evaluated by construction of an oligo(dT)-primed pancreatic cDNA library that was screened with DNA probes for trypsinogen 1 and 2, procarboxypeptidase A1, chymotrypsinogen, and elastase IIIB and protease

Fig. 2. Comparison of transcript abundance.
Bars represent the percent abundance as determined by SAGE (dark bars) or hybridization analysis (light bars). SAGE quantitations were derived from Table 1 as follows: TRY1/2 includes the tags for trypsinogen 1 and 2; PROCAR indicates tags for procarboxypeptidase A1; CHYMO indicates tags for chymotrypsinogen; and ELA/PRO includes the tags for elastase IIIB and protease E. The cDNA hybridizations were as described (12). Error bars represent the standard deviation determined by taking the square root of counted events and converting it to a percent abundance. A Poisson distribution was assumed.

Fig. 3. Screening a cDNA library with SAGE tags. P1 and P2 show typical hybridization results obtained with 12-bp oligonucleotides as described (19). P1 and P2 correspond to the transcripts described in Table 2. Images were obtained with a Molecular Dynamics PhosphorImager, and the circle indicates the outline of the filter membrane to which the recombinant phage were transferred before hybridization.
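The error-bar rule stated in the Fig. 2 legend (square root of the counted events, converted to a percent abundance, assuming a Poisson distribution) can be written out as a short Python sketch. The function name `percent_abundance_with_error` is hypothetical. The worked example uses the procarboxypeptidase A1 count of 64 out of 840 total tags reported in Table 1.

```python
import math


def percent_abundance_with_error(count: int, total: int):
    """Return (percent abundance, Poisson standard deviation in percent)."""
    percent = 100.0 * count / total
    # For Poisson-distributed counts the standard deviation is sqrt(count);
    # it is rescaled by the same factor as the count itself.
    sd_percent = 100.0 * math.sqrt(count) / total
    return percent, sd_percent


# Procarboxypeptidase A1: 64 of the 840 analyzed tags.
percent, sd = percent_abundance_with_error(64, 840)
print(f"{percent:.1f}% +/- {sd:.2f}%")  # 7.6% +/- 0.95%
```

Because the relative error is 1/sqrt(count), rarer transcripts carry proportionally larger uncertainty: a tag seen four times has a relative standard deviation of 50%.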
