Using the NCBI Map Viewer to Browse Genomic Sequence Data
UNIT 18.5
Tyra G. Wolfsberg
Bethesda, Maryland

ABSTRACT
This unit includes a basic protocol with an introduction to the Map Viewer, describing how
to perform a simple text-based search of genome annotations to view the genomic context
of a gene, navigate along a chromosome, zoom in and out, and change the displayed maps
to hide and show information. It also describes some of NCBI’s sequence-analysis tools,
which are provided as links from the Map Viewer. The alternate protocols describe
different ways to query the genome sequence, and also illustrate additional features of
the Map Viewer. Alternate Protocol 1 shows how to perform and interpret the results
of a BLAST search against the human genome. Alternate Protocol 2 demonstrates how
to retrieve a list of all genes between two STS markers. Finally, Alternate Protocol 3
shows how to find all annotated members of a gene family. Curr. Protoc. Hum. Genet.
69:18.5.1-18.5.25. © 2011 by John Wiley & Sons, Inc.

Keywords: genome browser • genome assembly • genomic sequence • gene map

INTRODUCTION
The NCBI Map Viewer is an interface to a large, integrated set of genomic data,
including sequence, cytogenetic, genetic linkage, and radiation hybrid maps, as well as
the assembled and annotated genomic sequence itself. Along with the UCSC Genome
Browser (Karolchik et al., 2009) and Ensembl (Fernández-Suárez and Schuster, 2010),
it is one of the primary Web sites from which genome sequence data can be accessed.

This unit includes an introduction to the Map Viewer (see Basic Protocol), which
describes how to perform a simple text-based search of genome annotations to view the
genomic context of a gene, navigate along a chromosome, zoom in and out, and change
the displayed maps to hide and show information. It also describes some of NCBI’s
sequence-analysis tools, which are provided as links from the Map Viewer. The
alternate protocols describe different ways to query the genome sequence, and also illustrate
additional features of the Map Viewer. Alternate Protocol 1 shows how to perform and
interpret the results of a BLAST search against the human genome. Alternate Protocol
2 demonstrates how to retrieve a list of all genes between two STS markers. Finally,
Alternate Protocol 3 shows how to find all annotated members of a gene family.

At the time of this writing, NCBI provides Map Viewers for seventeen vertebrates, twelve
invertebrates, eighteen protozoa, forty-six plants, and seventeen fungi. Although the data
themselves are different for each organism, the basic navigation principles are the same.
The Basic Protocol and Alternate Protocols 1 and 2 are illustrated with examples from
the human genome, while Alternate Protocol 3 uses the mouse genome.

BASIC PROTOCOL: GENERAL NAVIGATION IN THE NCBI MAP VIEWER
This protocol introduces the basic concepts of the Map Viewer interface, including how
to perform a text-based search of genome annotations to view the genomic context of a
gene, navigate along a chromosome, zoom in and out, and change the displayed maps
to hide and show information. It also describes some of NCBI’s sequence-analysis tools
that are provided as links from the Map Viewer. The figures shown in this protocol
illustrate examples using the Map Viewer for the human genome, but Map Viewers are
also provided for other organisms (see above).

High-Throughput Sequencing
Current Protocols in Human Genetics 18.5.1-18.5.25, April 2011
Published online April 2011 in Wiley Online Library (wileyonlinelibrary.com).
DOI: 10.1002/0471142905.hg1805s69
Supplement 69
Copyright © 2011 John Wiley & Sons, Inc.

Materials
Computer with Internet access
An up-to-date Internet browser, such as Firefox (Windows, Mac OS X, and
Linux; http://www.mozilla.org/firefox); Safari (Windows, Mac OS X;
http://www.apple.com/safari); or Internet Explorer (Windows;
http://www.microsoft.com/ie)
1. Start at the NCBI home page, http://www.ncbi.nlm.nih.gov/.
2. Follow the link to Maps & Markers in the left sidebar. Then, click on the link called
Map Viewer under either Tools or Quick Links. Alternatively, navigate directly
to the Map Viewer at http://www.ncbi.nlm.nih.gov/mapview/.
The resulting page lists the available Map Viewers, organized by organism type. For
most organisms, only a single set of maps is available. However, NCBI makes two maps
available for both human and mouse: the most recently updated version (build 37), as
well as one older version (build 36). Click on the Build number or the magnifying glass
icon to retrieve the main Map Viewer page for that organism. Other links under the Tool
column include the “B” icon, a BLAST search on sequences from that organism; the
“R” icon, a chromosomal region search; the “Cf” icon (select organisms only), a Clone
Finder tool to identify clones within a given genomic region; and the “G” icon (select
organisms only), for organism-specific genome resources at NCBI and elsewhere.

3. To perform a text-based search directly from this main page, select the organism
name from the pull-down menu at the top of the page and enter a query in the
text box.
The figures in this protocol illustrate a query for the acetylcholinesterase gene, or ACHE,
on the most recent version of the human genome assembly. Thus, type ACHE into the text
box at the top of the page, select “Homo sapiens” from the pull-down menu, then hit the
“Go!” button. The Map Viewer supports many types of queries in addition to gene name,
including gene product, accession number, protein domain, and marker name. Additional
queries are described in the alternate protocols.

4. The top of the search results page that then appears shows a schematic of the human
chromosomes, with the position(s) of the query marked in red. The middle blue bar
shows the total number of hits to the query, and the table at the bottom of the page
shows the details.
Sample search results are shown in Figure 18.5.1. NCBI displays up to thirteen
different human genome assemblies, described at
http://www.ncbi.nlm.nih.gov/genome/guide/human/release_notes.html. “Reference” is the
assembly provided by the Genome Reference Consortium (assembly GRCh37). Celera is
assembled from whole-genome shotgun sequences. HuRef is assembled from whole-genome
shotgun sequences from a single person. CRA_TCAGchr7v2 is an assembly of chromosome 7
from The Center for Applied Genomics (TCAG). ALT_REF_LOCI_1 through ALT_REF_LOCI_7
are seven alternate loci representing the MHC region on chromosome 6. ALT_REF_LOCI_8 and
ALT_REF_LOCI_9 are alternate haplotype assemblies for portions of chromosomes 4 and
17, respectively.
The query term ACHE yields 86 hits on the human genome. The majority of these hits are
to transcripts (mRNA sequences) that have the term ACHE in their title and have been
mapped to the human genome. Hits on chromosome 3 are to genes that contain the word
ACHE in their name, and the hit on chromosome 7 is to the ACHE gene itself.

Figure 18.5.1 The main entry point for the NCBI Map Viewer, showing the results of a query for
ACHE on build 37.1 of the human genome.

5. The search results table is organized in six columns. The first two columns show
the chromosome and genome assembly of the hit. The third column provides the title
of the mapped element, while the fourth column lists the Map Element that was mapped
to the genome (accession number, gene symbol, etc.). The fifth column specifies the
type of element that was mapped to the genome (transcript, gene, etc.). The sixth
column shows which map(s) the element was annotated on. The Quick Filter box to
the right of the search results table allows limited filtering by element type.

To view the genomic context of a hit, including all maps onto which it has been
placed, click on the name of the hit in the Map Element column. Click on the name
of an individual map to launch a view of the query term placed on that map alone.
To see all hits on a single chromosome, click on the chromosome number in the
top graphic, or on the “all matches” link adjacent to the chromosome number in the
results table.
To limit the results to only the nine gene elements, select Gene in the Quick Filter box.
The ACHE gene symbol appears five times: twice on the Reference genome assembly and
once on each of the three alternate assemblies for chromosome 7. To generate the view shown in
Figure 18.5.2, click on the first instance of the term ACHE in the Map Element column
for the Reference assembly. The second instance displays the Ensembl Genes track, which
shows genes annotated by Ensembl.
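
The effect of the Quick Filter described in step 5 can be pictured as a simple filter over the rows of the results table. The sketch below is only an illustration: the `Hit` record, the `quick_filter` function, and the sample rows are hypothetical and do not reproduce actual Map Viewer output or its internal logic.

```python
from dataclasses import dataclass


@dataclass
class Hit:
    """One row of a search results table (hypothetical field names)."""
    chromosome: str
    assembly: str
    map_element: str
    element_type: str


def quick_filter(hits, element_type):
    """Keep only the hits whose element type matches, ignoring case."""
    wanted = element_type.lower()
    return [h for h in hits if h.element_type.lower() == wanted]


# Made-up rows for illustration only; these are not real search results.
hits = [
    Hit("7", "Reference", "ACHE", "GENE"),
    Hit("7", "Celera", "ACHE", "GENE"),
    Hit("7", "Reference", "EXAMPLE_MRNA_1", "TRANSCRIPT"),
    Hit("3", "Reference", "EXAMPLE_GENE_2", "GENE"),
]

genes = quick_filter(hits, "gene")
ache_rows = [h for h in genes if h.map_element == "ACHE"]
```

In this toy table, filtering by gene leaves three rows, two of which carry the ACHE symbol on different assemblies.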

6. Look at the default Map Viewer display for the map element selected in the previous
step.
Figure 18.5.2 shows the genomic context of the ACHE gene. Thirty genes surrounding
ACHE are labeled in this view. The sequence coordinates on chromosome 7 (99,220K-
101,040K bp) are indicated at the top of the page in the Region Displayed, as well as
in the two text boxes on the left blue sidebar. In the ideogram in the blue sidebar, the
region displayed is indicated in red, relative to the known cytogenetic banding patterns
on chromosome 7.

Figure 18.5.2 The default Map View for the query ACHE.

Details are shown in the three default maps in the center of the window. The
position of the query, ACHE, is shown in red and/or pink on each map. The Genes_cyto
(Genes Cytogenetic) map shows the cytogenetic locations of genes as reported in
Entrez Gene. Hs_UniG, or UniGene Human, shows human mRNA and EST sequences
aligned to the genome, and named with their UniGene cluster identifier. The gray
histogram shows the density of aligned ESTs and mRNAs in this region, and the blue lines
show putative intron/exon boundaries, with exons as thick blue lines. The map on the
right side of the display is known as the master map, and is always shown in
verbose mode, with more details than the maps to the left. In this case, the master map
is the Genes_seq (Genes sequence) map, which shows annotated gene models. These
maps, as well as the others hidden in this view, are described in more detail in the
section of the online help documentation at
http://www.ncbi.nlm.nih.gov/mapview/static/humansearch.html.

7. Explore the Genes sequence map in more detail. The Genes sequence map shows
both known and putative genes that NCBI has annotated on the genomic contigs.
This map shows all possible exons for genes, including alternative splices; to see only
exon combinations that exist in individual transcripts, use the RefSeq Transcripts
map, as detailed in Alternate Protocol 1. Exons are depicted as boxes, and introns
as the lines between them. Coding exons are shaded boxes; untranslated exons are
unfilled boxes. A black arrow to the right of the gene symbol indicates the direction
of transcription of the gene. Furthermore, genes transcribed from the bottom of the
display up are displayed to the left of the gray line and genes transcribed from the
top down are displayed to the right. NCBI creates the following four types of gene
models for the human genome assembly. The model type is indicated in the column
labeled E when the Genes sequence map is the master map.
a. best RefSeq: The model is supported by the best alignment of an mRNA Reference
Sequence (Pruitt et al., 2007) to the genome sequence.
b. mRNA: The model is supported by the alignment of other transcripts to the genome
sequence.
c. protein: The model is supported by the alignment of a protein sequence to the
genome sequence.
d. external: The model is provided by an outside source and NCBI does not indicate
what evidence was used to predict it.

Figure 18.5.3 The Evidence Viewer for exon 3 of ACHE. This window shows the alignment of
the genomic contig NT_007933.15 with two RefSeq mRNAs (NM_######) and eleven GenBank
mRNAs. For color version of this figure go to http://www.currentprotocols.com/protocol/hg1805.

When the Genes sequence map is the master map and is on the right side of the Map
Viewer display, up to twelve additional annotations are available for each gene.

a. OMIM, Online Mendelian Inheritance in Man, is a continuously updated catalog
of human genes and genetic disorders (Borate and Baxevanis, 2009).
b. HGNC, the HUGO Gene Nomenclature Committee, provides a unique name for
each human gene.
c. sv, sequence viewer, provides a graphical representation of the gene, including
annotated features like coding region (CDS), RNA, and gene.
d. pr links to a list of protein sequences encoded by the gene.
e. dl, Sequence Download, allows the user to retrieve the genomic sequence or
annotation of the gene in text format. To retrieve a different region, change the
coordinates listed in the text box. The region can be returned either as sequence
in FASTA format or in GenBank format (UNIT 6.8). The GenBank format shows
all the features that have been annotated on the selected region, including mRNA
and CDS. Note that, at the top of the page, the location of the region is shown in
chromosomal coordinates. Further down, the location is shown on the coordinates
of the NT_###### contig that spans this region.
f. ev, evidence viewer, displays the biological evidence supporting a particular gene
model. This view shows all GenBank and RefSeq mRNAs, ESTs, and model
exons in graphical fashion, and also displays alignments of the mRNAs with the
genomic sequence. Figure 18.5.3 shows the evidence viewer for a portion of exon
3 of ACHE, with mismatches between the genomic and mRNA sequences shown
in red. For additional information, click on Evidence Viewer Help on any ev
page. Sequences in the alignment can be copied and pasted into other computer
applications.
g. mm links to the Model Maker, which shows the exons that result when GenBank
mRNAs and gene predictions are aligned to the genomic sequence. ESTs can be
added as well. The user can then select individual exons to create a custom model
of the gene.
The Model Maker for ACHE (Fig. 18.5.4) shows 19 transcripts that align to the
genome in this region. Together, these 19 sequences contribute 24 potential exons to
the ACHE gene model, numbered and shown in green. To create a model, click on
individual exons in the Putative Exons section. A graphic will appear first, followed
by a DNA sequence in the top box, and a three-frame translation of this sequence
in the bottom three boxes. For example, the novel transcript produced by splicing
together exons 1-7-14-15-22 is shown in Figure 18.5.4. Additional features of the
Model Maker are described in the Help Document, available by clicking help on any
mm page.

Figure 18.5.4 The Model Maker for ACHE, showing the exons contributed by 16 mRNAs (light
blue) and 3 Gnomon gene predictions (hmm######). Together, these 19 sequences contribute 24
potential exons, numbered and shown in green. For color version of this figure go to
http://www.currentprotocols.com/protocol/hg1805.

h. hm, HomoloGene, is a system for automated detection of homologs among the
annotated genes of several completely sequenced eukaryotic genomes.
i. sts links to markers in NCBI’s UniSTS.
j. JCVI, displayed only in conjunction with the HuRef genome assembly, is a link
to the HuRef Browser at the J. Craig Venter Institute.
k. CCDS, the Consensus CDS project, is a collaborative effort involving NCBI that
aims to identify a set of consistently annotated, high-quality human protein coding
regions. Gene models with a CCDS link should be considered a “gold standard,”
as these models were identified by multiple gene prediction methods.
l. SNP links to NCBI’s dbSNP display for all the SNPs in that gene.
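
The Model Maker behavior described in item g (joining selected exons into a transcript and showing a three-frame translation) can be illustrated with a short Python sketch. This is not NCBI's implementation: the function names, the toy sequence, and the exon coordinates are invented for illustration, only the forward strand is handled, uppercase DNA is assumed, and the standard genetic code is assumed.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def splice(genomic, exons):
    """Join exon slices; exons are (start, end) pairs, 1-based and inclusive."""
    return "".join(genomic[start - 1:end] for start, end in exons)


def translate(seq, frame):
    """Translate seq from the given 0-based frame; unknown codons become X."""
    codons = (seq[i:i + 3] for i in range(frame, len(seq) - 2, 3))
    return "".join(CODON_TABLE.get(codon, "X") for codon in codons)


def three_frame(seq):
    """Return the translations of the three forward reading frames."""
    return [translate(seq, frame) for frame in range(3)]


# Toy genomic sequence: two 6-bp "exons" separated by a 6-bp "intron".
genomic = "ATGGCC" + "GTAAGT" + "AAATGA"
transcript = splice(genomic, [(1, 6), (13, 18)])
frames = three_frame(transcript)
```

Here `transcript` is the 12-base spliced sequence and `frames` holds one protein string per forward reading frame, with `*` marking stop codons.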
8. The NCBI Map Viewer provides an integrated set of sequence, cytogenetic, genetic
linkage, radiation hybrid, and YAC contig maps. Use the Maps & Options link in
the blue sidebar on the left of the page (or near the top of the page on the right
side) to hide and show maps in the Map Viewer. It is also possible to change the
position shown in the Map Viewer by entering the new coordinates in the Maps &
Options window. All of the human sequence maps are based on the same coordinate
system, that of the assembled genome sequence. Connections are drawn between
maps when the same object has been placed on two maps, for example, an STS or
a gene name. The elements will be connected even if they are known by different
names on different maps.
The Maps & Options window is shown in Figure 18.5.5. To generate this view, open
the Maps & Options window by clicking either of the abovementioned links, highlight
Genes_cyto and ugHs, then click REMOVE to remove them from the list. Next, add the
maps called Phenotype and Variation by highlighting their names and clicking ADD.
Make sure that the Genes sequence map remains as the master by selecting its name,
Gene, then clicking Make Master/Move to Bottom. Finally, click OK to update the Map
Viewer with these new selections. The resulting page is shown in Figure 18.5.6.

Figure 18.5.5 The Maps & Options window, which allows the user to alter the default Map Viewer
settings.

The Phenotype, or Pheno map (Fig. 18.5.6), displays the placement of loci associated
with phenotypes on the human genome assembly. Phenotypes include those described in
Online Mendelian Inheritance in Man (OMIM) as well as quantitative trait loci (QTLs).
Phenotype names are linked to descriptions in OMIM. The Variation map shows the
position of genetic variation data from NCBI’s dbSNP. To conserve space on the display,
the positions of SNPs are shown as a histogram when many SNPs are present. In zoomed-in
regions with few SNPs, SNP names are linked to dbSNP.

9. NCBI provides two ways to adjust focus of the display. Use the zoom control bar
in the blue sidebar to change the view to the full chromosome (widest bar at the
top), 1/10,000 of the chromosome (narrowest bar at the bottom), or some interval
in between. Alternatively, place the mouse cursor over one of the maps in the Map
Viewer so that it assumes the shape of a hand, click once, and select one of the
options from the pop-up menu:

Recenter [the display around that point]
Zoom in ×2
Zoom in ×4
Zoom in ×8
Zoom out ×2
Show [a defined sequence interval of] 10 M, 1 M, 100 K, or 10 K
Show Sequence [of the genome in this region].

Figure 18.5.6 A customized view of the genomic context of the ACHE gene, using the maps
selected in the Maps & Options window shown in Figure 18.5.5.

Continuing the above example, place the mouse cursor over the Genes sequence map
shown in Figure 18.5.6, click once on the map location (not name) of ACHE to open the
pop-up menu, and select “Show 100 K”. The map location of ACHE is slightly above its
name, and the two are connected by a faint gray line. The resulting window will display a
100 K window around the ACHE gene. To zoom in even more, click on the map location
of ACHE again and select “Zoom in ×8.”
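
The coordinate arithmetic behind the pop-up options in step 9 ("Show 100 K", "Zoom in ×8", and so on) can be sketched as follows. This is an assumption about how such a window could be computed, not a description of the Map Viewer's code; the function names, the clicked position, and the chromosome length are placeholder values chosen for illustration.

```python
def recenter(center, width, chrom_length):
    """Return a (start, end) window of `width` bp around `center`.

    Coordinates are 1-based and inclusive, and the window is shifted
    as needed so that it stays within the chromosome.
    """
    width = min(width, chrom_length)
    start = center - width // 2
    start = max(1, min(start, chrom_length - width + 1))
    return start, start + width - 1


def zoom(start, end, factor, chrom_length):
    """Zoom around the window midpoint; factor > 1 zooms in, < 1 zooms out."""
    center = (start + end) // 2
    width = max(1, round((end - start + 1) / factor))
    return recenter(center, width, chrom_length)


CHROM_LENGTH = 159_000_000   # placeholder length, not an official value
clicked = 100_000_000        # arbitrary position standing in for a mouse click

show_100k = recenter(clicked, 100_000, CHROM_LENGTH)   # "Show 100 K"
zoomed_in = zoom(*show_100k, 8, CHROM_LENGTH)          # "Zoom in ×8"
zoomed_out = zoom(*show_100k, 0.5, CHROM_LENGTH)       # "Zoom out ×2"
```

Under these assumptions, zooming in by a factor of 8 from a 100-kb window leaves a 12.5-kb window around the same midpoint.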
10. Additional navigation is also available in the Map Viewer window. Clicking the blue
arrow next to a map name will make that map the master and move it to the rightmost
position. The blue “X” next to a map name is used to remove that map from the
view. Small arrows on the top and bottom of the master map (the rightmost map)
scroll the display up and down. Alternatively, if one knows the exact position along
the chromosome that one wishes to view, one may enter its coordinates in Region
Shown in the blue sidebar, then hit Go.
Click on the arrow next to the Variation map to make it the master.

11. The map on the right side of the display, the master map, has an enhanced display
compared to the other maps. Each map will link to relevant NCBI data. The details
available for the Genes sequence map are described above. Furthermore, the position
of the query term on the master map is indicated as a red bar.
Figures 18.5.2 to 18.5.4 demonstrate some of the additional features available when
Genes sequence is the master map. Figure 18.5.7 shows an example of the SNP properties
that are visible when Variation is the master map. The graphic in the Map column indicates
the status of the mapping of the SNP to the genome assembly. The Gene column indicates
if the SNP is part of a gene. Heterozygosity indicates average heterozygosity of the
SNP. The Validation column distinguishes validated from unvalidated SNPs. The final
column presents miscellaneous information. A detailed figure legend is available from
http://www.ncbi.nlm.nih.gov/SNP/get_html.cgi?whichHtml=verbose.

Figure 18.5.7 A zoomed-in view of the display in Figure 18.5.6, showing the region around
ACHE in detail. The variation (SNP) map is the master map.

In Figure 18.5.7, all the SNPs that are within the red bar on the Variation map fall within
the exons or introns of the ACHE gene. In six of the SNPs, the L (locus), T (transcript), and
C (coding sequence) in the Gene column are colored (rs17886728, rs1799806, rs7636,
rs1056867, rs13246682, and rs17881553). These 6 SNPs occur in the coding sequence of
the ACHE gene. More information on each SNP, including the sequence and whether the
SNP is synonymous or nonsynonymous, is available by clicking on the SNP identifier and
linking to NCBI’s dbSNP. According to the Summary of Maps at the bottom of the page,
only 17 of the 147 SNPs in this region are labeled (see a similar region in Fig. 18.5.12).
Some of these SNPs fall into bins called, for example, “5 variations”; others are indicated
in the graphic as unlabeled tick marks. Some of these unlabeled SNPs could also fall in
the coding sequence of ACHE.

ALTERNATE PROTOCOL 1: BLAST SEARCH AGAINST THE GENOME
There are many ways to access NCBI’s Map Viewer in addition to a basic query by
gene name. One of these is to perform a BLAST search (see UNIT 6.8), using either a protein
or nucleotide sequence as a query. This example is illustrated using the Map Viewer for
the human genome. The example also explains how to view individual transcripts that
have been aligned to the genome.

Materials
Computer with Internet access
An up-to-date Internet browser, such as Firefox (Windows, Mac OS X, and
Linux; http://www.mozilla.org/firefox); Safari (Windows, Mac OS X;
http://www.apple.com/safari); or Internet Explorer (Windows;
http://www.microsoft.com/ie)
1. Start at NCBI’s BLAST page, http://www.ncbi.nlm.nih.gov/BLAST/. The section
of the page labeled Genomes displays links to organism-specific genomic BLAST
databases. Alternatively, these genomic databases can also be accessed from the Map
Viewer for certain organisms. Select the page for the appropriate organism.
For this example, select Human. A number of different nucleotide and protein databases
related to the human genome sequencing project are available from this page. Descriptions
of these databases may be viewed by clicking the Database link (i.e., the word Database
in blue near the top of the page). The assembled genome itself is in the database called
“genome.”

2. Enter either a sequence accession or “gi” number or a FASTA-formatted sequence
(see UNIT 6.8) for the query into the box on the BLAST form, choose the correct
BLAST program, change any default parameters, and click Begin Search.
For a more detailed discussion of the BLAST program and parameters, see UNIT 6.8.
A gi number is a unique integer assigned to a sequence that changes if the sequence is
updated.
The examples shown in Figures 18.5.8, 18.5.9, and 18.5.10 are of a megaBLAST search
against the “genome (all assemblies)” database using the accession number for the ACHE
mRNA RefSeq, NM_000665, as a query. All other parameters were left in their default
settings. megaBLAST is a version of BLAST that is optimized for quickly comparing highly
related nucleotide sequences.
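
Step 2 accepts three kinds of query input: an accession, a gi number, or a FASTA-formatted sequence. A rough way to tell them apart is sketched below. The `classify_query` function and its accession pattern are simplifying assumptions made for illustration; they are not the rules used by the BLAST form.

```python
import re

# Simplified accession pattern (an assumption): letters, optional underscore,
# digits, optional ".version". Real accession formats are more varied.
ACCESSION_RE = re.compile(r"[A-Z]{1,6}_?\d+(\.\d+)?", re.IGNORECASE)


def classify_query(text):
    """Guess whether the text is FASTA, a gi number, or an accession."""
    text = text.strip()
    if text.startswith(">"):
        return "fasta"       # FASTA records begin with a ">" definition line
    if text.isdigit():
        return "gi"          # gi numbers are plain integers
    if ACCESSION_RE.fullmatch(text):
        return "accession"
    return "unknown"


kinds = [
    classify_query("NM_000665"),
    classify_query("12345"),
    classify_query(">example\nATGGCC"),
]
```

A bare sequence without a definition line falls through to "unknown" in this sketch, even though a query form may accept it.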

3. The first page returned from the BLAST server allows the user to change various
formatting options, as well as see the results. Click the View Report button to
check the results. When the search is complete, the resulting page is similar to a
standard BLAST report.
Near the top of the results page is a schematic showing the position of the hit on the query
sequence (Fig. 18.5.8). The top red line indicates the query sequence, and lines under this
depict the alignment of database sequences with the query, color-coded by BLAST score.
In this case, the query aligns with four database sequences throughout its length, and
with similar high scores. The contig in the first alignment, NT_007933.15, is 77 Mb of
assembled genomic sequence from the reference genome assembly of human chromosome 7.
The contig in the second alignment is from the Celera genome assembly of chromosome 7.
The contig in the third alignment is from the TCAG assembly of chromosome 7. The
contig from the fourth alignment is from the HuRef assembly of chromosome 7. The
bottom of the BLAST results page depicts the alignment of the query with the database
sequences.

Figure 18.5.8 The results of a megaBLAST search against the human “genome” database using
the accession number for the ACHE mRNA RefSeq, NM_000665.

4. In blue, above the schematic of the BLAST hits, is a link to Other reports.
To link to an overview of the genomic context of the BLAST hit(s), click on the
[Organism] genome view link (Fig. 18.5.8).
The Genome View, shown in Figure 18.5.9, is very similar to the initial Map Viewer
results page (Fig. 18.5.1), with the BLAST hits shown as the queries. The schematic at
the top of the page shows the location of the BLAST hits on the chromosomes. The blue
bar below the BLAST scores legend provides the name of the query sequence, extracted
from the accession number provided in step 2, above. The next section shows the details
of the database hits from the BLAST search. Starting on the left, the Chr column provides
the chromosome number of the hit, while the Assembly refers to the genome assembly.
The Map element is the accession number of the database sequence that was hit in the
BLAST search, while the Type is the type of database sequence. The number of Hits is
the number of blocks in the sequence alignment. The five blocks in each of the hits to
chromosome 7 represent alignments between the five exons of the ACHE mRNA and the
genome. Each of the chromosome 7 hits has a similar Score and E-value, because the
sequence assemblies are identical, or nearly identical, in this region. There are also
three lower-scoring hits to chromosome 16.

Figure 18.5.9 The Genome View of the results of the BLAST search shown in Figure 18.5.8, as
shown in the Map Viewer results page.

Figure 18.5.10 The genomic context of the results of the BLAST search shown in Figure 18.5.8.

5. To launch the Map Viewer and view the genomic context of an individual hit, click
on an item in the Map Element column. The BLAST hits are integrated into the
Contig map, the master map in this view. The positions of the individual BLAST hits
are depicted as bars and color-coded by score. Highlighted in pink are the position
of the hit in the query, as well as the percent identity between the query and the
genomic sequence.
To view the data shown in Figure 18.5.10, click on NT_007933 in the Map Element column
shown in Figure 18.5.9. Four of the five BLAST hits correspond to exons drawn on the
Genes sequence map, while the fifth hit is shorter than the corresponding Genes sequence
exon. The explanation for this discrepancy is that the Genes sequence map depicts a
flattened view of all exons for a particular gene model. The RefSeq RNA map depicts
individual transcripts, and indicates that there are two alternatively spliced forms of
ACHE (NM_###### accession numbers), as well as two gene predictions (XM_######)
encoded on the opposite strand from ACHE, in this region. The two ACHE transcripts differ
in the length of the final exon (Figure 18.5.10), and the BLAST hit corresponds to the
form with the shorter final exon. As the Genes sequence map depicts a composite view of
the exons, the final exon appears long. The map on the left, the Model map, depicts ab
initio gene models predicted by Gnomon.
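
The "flattened" view described in step 5 amounts to merging the exon intervals of all transcripts of a gene into one composite set. The sketch below shows why a transcript with a shorter final exon appears shorter than the composite exon. The coordinates and the `flatten_exons` function are invented for illustration and are not the real ACHE exon positions or NCBI's procedure.

```python
def flatten_exons(transcripts):
    """Merge overlapping exon intervals from several transcripts.

    Each transcript is a list of (start, end) exon intervals; the result
    is the union of all intervals, sorted by start position.
    """
    intervals = sorted(exon for exons in transcripts for exon in exons)
    merged = []
    for start, end in intervals:
        if merged and start <= merged[-1][1]:
            # Overlaps the previous interval: extend it if necessary.
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged


# Toy transcripts that differ only in the length of the final exon.
short_form = [(100, 200), (300, 400), (500, 560)]
long_form = [(100, 200), (300, 400), (500, 650)]

composite = flatten_exons([short_form, long_form])
```

In the composite, the final exon spans 500-650, so an alignment to the short form (500-560) covers only part of it.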
6. This view can now be manipulated as described above (see Basic Protocol, steps 7
to 11).

ALTERNATE PROTOCOL 2: USING THE NCBI MAP VIEWER TO VIEW A REGION BETWEEN TWO MARKERS
A third way to access NCBI’s Map Viewer is to view a region between two markers.
Such a strategy would be useful in a positional cloning project, as one can quickly view
all the genes in a critical region defined by two markers. This example is illustrated using
the Map Viewer for the human genome.

Materials
Computer with Internet access
An up-to-date Internet browser, such as Firefox (Windows, Mac OS X, and
Linux; http://www.mozilla.org/firefox); Safari (Windows, Mac OS X;
http://www.apple.com/safari); or Internet Explorer (Windows;
http://www.microsoft.com/ie)
1. Start at the NCBI home page, http://www.ncbi.nlm.nih.gov/.
2. Follow the link to Maps & Markers in the left sidebar. Then click on the link called
Map Viewer under either Tools or Quick Links. Alternatively, navigate directly
to the Map Viewer at http://www.ncbi.nlm.nih.gov/mapview/.
3. Click on Build 37.1 in the Build column. Search for the region containing two
markers by querying for both terms separated by the word OR.
For the example in this protocol, the critical region is the <2 Mb region containing the two
STS markers RH46231 and RH71410. Thus, type RH46231 OR RH71410 in the query
box. As the search engine recognizes most marker names and their aliases, RH46231 can
also be found by its alias stSG22199.

4. The resulting page shows the position of the two hits on the human genome. Click on
the line for an individual hit to view the genomic context of that hit. If both hits are
on the same chromosome, view them simultaneously by clicking on the chromosome
number in the top graphic, or the “all matches” link in the bottom table.
For this example, click on “all matches” on the reference assembly of chromosome 7 in
the results table.

5. The Map Viewer returns a zoomed-out view showing the genomic region surrounding
the two markers. This view is too broad if one wishes to analyze only those genes
that are between the two markers. To limit the view to the region precisely between
the two markers, type the marker names in the text boxes in the blue sidebar, then
hit “Go.”
To continue the example, enter RH46231 into the top text box in the blue sidebar on the
left-hand side of the screen and enter RH71410 into the bottom box, then click “Go.”
Figures 18.5.11 and 18.5.12 show the resulting view of the region of chromosome 7
between the STS markers RH46231 and RH71410. The STS map shows the placement
of STSs from various sources onto the genome using Electronic PCR (e-PCR; Schuler,
1998). e-PCR is a computational method that predicts the location of sequence tagged
sites in DNA by searching for subsequences that closely match the PCR primers used to
make the STS, and which also have the correct order, orientation, and spacing such that
they could prime the amplification of a PCR product of the correct molecular weight. The
two STS markers used in the search are highlighted in pink, at the top and bottom of the
interval. The STS map is the master map. In this case, the additional information to the right
of the map is a table indicating which genetic and RH maps each STS has been placed
on.
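The e-PCR idea described above can be sketched in a few lines of Python. This is a minimal illustration and not NCBI's implementation: it accepts only exact primer matches on the forward strand (real e-PCR tolerates close matches), and the function names, primer sequences, and toy genome are all invented for the example.

```python
def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an uppercase DNA string."""
    complement = {"A": "T", "C": "G", "G": "C", "T": "A", "N": "N"}
    return "".join(complement[base] for base in reversed(seq))


def find_sts(genome, forward, reverse, expected_size, tolerance=50):
    """Return (start, end) pairs (0-based, end-exclusive) of predicted PCR products.

    Simplification (assumption of this sketch): exact matches only,
    forward strand only.
    """
    # On the forward strand the reverse primer appears as its reverse complement.
    reverse_site = reverse_complement(reverse)
    hits = []
    start = genome.find(forward)
    while start != -1:
        # Correct order and orientation: the reverse-primer site must lie
        # downstream of the forward primer.
        site = genome.find(reverse_site, start + len(forward))
        while site != -1:
            end = site + len(reverse_site)
            # Correct spacing: product length must be close to the expected size.
            if abs((end - start) - expected_size) <= tolerance:
                hits.append((start, end))
            site = genome.find(reverse_site, site + 1)
        start = genome.find(forward, start + 1)
    return hits


# Toy data, invented for illustration (not a real STS).
forward_primer = "ACGTAC"
reverse_primer = "TTGCAG"
toy_genome = "GGG" + forward_primer + "T" * 20 + reverse_complement(reverse_primer) + "GGG"

print(find_sts(toy_genome, forward_primer, reverse_primer, expected_size=32, tolerance=5))
```

The size check is what separates a plausible STS placement from two primer-like sequences that merely happen to occur on the same chromosome.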
At the bottom of the display, below the graphic, is a section entitled Summary of Maps
(Fig. 18.5.12). For the Genes sequence map, for example, this summary reveals the coor-
dinates being displayed in the graphic and the total number of genes on the chromosome.
It also indicates the number of genes in the region shown in the graphic, 80, versus the
number of genes whose names are labeled in the graphic, 50 in this case. It is important
to remember that if space is limiting, the Map Viewer only displays a portion of the data
in a given region. In order to view the names of all the genes in the region, one could
zoom in using the zoom controls described in the Basic Protocol. Alternatively, one could
increase the page length of the graphic by typing a larger number into the Page Length
box of the Maps & Options window (Fig. 18.5.5).

Figure 18.5.11 The genomic context of the region between RH markers RH46231 and RH71410.

Figure 18.5.12 The Summary of Maps from the region displayed in Figure 18.5.11. The summary
highlights the fact that although there are 80 genes in this region, only 50 of them are labeled with
a gene symbol.

The other maps displayed by default include the UniGene Human (Hs UniG)
and Genes sequence (Genes seq) described above, as well as the GeneMap99-GB4
(GM99 GB4), a map of markers mapped onto the GB4 RH panel by the International
Radiation Hybrid Consortium. The red lines show the position of the STS markers on the
maps.

6. View all the genes between the two markers by selecting Data as Table View from
the left blue sidebar.
The Data as Table View, shown in Figure 18.5.13, will provide a table of all the data in
the current window. The view may appear slowly in the Web browser, especially if the
genomic interval is large and contains much data. However, unlike the graphic, this view
will report all available data in the region. To improve the download speed, remove any
unneeded maps by clicking on the X next to the map name before selecting the Data as
Table View link. In the resulting table, the Genes on Sequence section shows the start
and stop position of all genes in this interval, the gene symbol, orientation of the gene
along the chromosome, links to some additional views of the gene such as sv (sequence
viewer) and ev (evidence viewer) discussed in the Basic Protocol, the evidence for the
gene, cytogenetic position, and full name. The data on the genes in this critical region
can be examined directly from the Web browser. For researchers who prefer to track their
data in spreadsheets, the table can be saved and imported into outside programs such as
Microsoft Excel.
7. Make changes to the display as necessary (see Basic Protocol, steps 6 to 11).
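For readers who would rather script the table from step 6 than open it in a spreadsheet, the following sketch shows one way to filter a saved tab-delimited table by coordinate. The column names (`start`, `stop`, `symbol`, `orientation`) and the sample rows are assumptions made up for the example; the real export should be inspected and the names adjusted to match.

```python
import csv
import io

# Invented rows that mimic a tab-delimited export; not real Map Viewer output.
SAMPLE_TABLE = (
    "start\tstop\tsymbol\torientation\n"
    "1000\t5000\tGENE_A\t+\n"
    "7000\t9000\tGENE_B\t-\n"
    "12000\t15000\tGENE_C\t+\n"
)


def genes_in_interval(table_text, region_start, region_stop):
    """Return symbols of genes that overlap [region_start, region_stop]."""
    reader = csv.DictReader(io.StringIO(table_text), delimiter="\t")
    symbols = []
    for row in reader:
        gene_start, gene_stop = int(row["start"]), int(row["stop"])
        # Two intervals overlap when each one starts before the other ends.
        if gene_start <= region_stop and gene_stop >= region_start:
            symbols.append(row["symbol"])
    return symbols


print(genes_in_interval(SAMPLE_TABLE, 4000, 8000))
```

To read a saved file instead of the inline sample, pass the file's contents (for example, from `open(path).read()`) as `table_text`.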

Figure 18.5.13 A text view of the genes located between the two RH markers RH46231 and
RH71410. This view was generated from the page shown in Figure 18.5.11 by clicking on the link
to Data as Table View.

ALTERNATE PROTOCOL 3
USING THE NCBI MAP VIEWER TO IDENTIFY ALL MEMBERS OF A GENE FAMILY IN THE MOUSE GENOME
This protocol demonstrates some advanced features of the Map Viewer text-based query.
The example is illustrated using the Map Viewer for the mouse genome.

Materials
Computer with Internet access
An up-to-date Internet browser, such as Firefox (Windows, Mac OS X, and
Linux; http://www.mozilla.org/firefox); Safari (Windows, Mac OS X;
http://www.apple.com/safari); or Internet Explorer (Windows;
http://www.microsoft.com/ie)
1. Start at the NCBI home page, http://www.ncbi.nlm.nih.gov/.
2. Follow the link to Maps & Markers in the left sidebar. Then click on the link called
Map Viewer under either the Tools or Quick Links. Alternatively, navigate directly
to the Map Viewer at http://www.ncbi.nlm.nih.gov/mapview/.
3. Click on the appropriate organism name to get to the main search page for that
organism, and query for the gene family of interest by entering the terms in the
“Search for” box. If the Map Viewer displays more than one assembly for that
organism, it will be easier to understand the results if the query is limited to a single
assembly using the assembly pull-down menu. Next, click the Find button. In order
to find all members of a gene family, it is probably necessary to use the asterisk
(*) as a wildcard to represent any number of characters. To limit the search to gene
symbols, restrict the search with the term [sym].
The goal of this protocol is to find the map position of all mouse members of the ADAM
gene family on the most recent assembly of the mouse reference genome. First, click on
the magnifying glass icon for Mus musculus Build 37.1 and, on the page that appears,
change the “assembly” pull-down menu at the top of the page from “All” to “reference.”
One might start the search by typing the term ADAM in the “Search for” text box at
the top of the page. This search returns 211 hits, including hits to UniGene clusters,
transcripts, and genes. The sequences are redundant; on chromosome 1, there are four
hits to ADAM23. On the other hand, the returned list is not complete, as ADAM3 is not
listed on chromosome 3. A query for ADAM [sym] returns no matches, as there are no
genes with the symbol ADAM. Searching with the term ADAM* returns 793 redundant
hits. A search for ADAM*[sym] returns 315 hits, but the list includes genes that are
members of the ADAM, the ADAMTS, and the ADAMdec families. To eliminate the
ADAMTS and ADAMdec family members, add the term NOT ADAMTS* [sym] NOT
ADAMDEC* [sym] to the query. 189 hits are returned, most of which are redundant.
To eliminate the redundancy, click on the button called Advanced Search in the blue search
bar near the top of the page. To search only for genes, not for transcripts or UniGene
clusters, clear all elements except Gene under “Type of mapped object.” To return results
only on the Gene map, clear all maps except Gene under the “Map name.” Click “Find” to
start the search. The results of this final query, ADAM* [sym] NOT ADAMTS*[sym]
NOT ADAMDEC* [sym] AND Gene[obj type] AND genes[map name], which gener-
ates 34 hits, are shown in Figure 18.5.14. Only three genes are returned that may not
be members of the ADAM gene family: LOC384806 on chromosome 8, and 4930523C11Rik and
EG214321 on chromosome 12. These three genes appear in the search because the field
abbreviation of [sym] includes aliases and locus tags as well as preferred symbols. In all
three cases, the gene has an alias or locus tag of ADAM*. Note that this text-based search
returns only those genes whose symbol begins with the term ADAM. Additional unnamed
members of the ADAM gene family may be found in a BLAST search (UNIT 6.8).
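The logic of the wildcard and NOT terms in the query above can be mimicked offline. The sketch below is only an analogy built on Python's `fnmatch` module; it is not the Map Viewer search engine. The helper name `matches_query`, the short symbol list, and the case-insensitive matching are assumptions chosen for illustration.

```python
from fnmatch import fnmatchcase


def matches_query(symbol, include, exclude=()):
    """True if symbol matches the include pattern and none of the exclude patterns.

    Matching is case-insensitive here; that is an assumption of this sketch.
    """
    upper = symbol.upper()
    if not fnmatchcase(upper, include.upper()):
        return False
    return not any(fnmatchcase(upper, pattern.upper()) for pattern in exclude)


# A short illustrative symbol list, not the actual search results.
symbols = ["Adam2", "Adam7", "Adam28", "Adamts1", "Adamdec1", "Madcam1"]

# Analogue of: ADAM* [sym] NOT ADAMTS* [sym] NOT ADAMDEC* [sym]
selected = [s for s in symbols if matches_query(s, "ADAM*", ["ADAMTS*", "ADAMDEC*"])]
print(selected)

# Analogue of: ADAM [sym] (no wildcard, so no symbol matches exactly)
print([s for s in symbols if matches_query(s, "ADAM")])
```

As in the protocol, the bare term finds nothing, the wildcard alone is too broad, and the exclusion patterns trim the list back to the family of interest.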

Figure 18.5.14 The results of querying NCBI’s mouse Map Viewer with ADAM* [sym] NOT
ADAMTS*[sym] NOT ADAMDEC* [sym] AND Gene[obj type] AND genes[map name].
This search results in all named members of the ADAM gene family in mouse.

Figure 18.5.15 An overview of the genomic context of the members of the ADAM gene family
that map to mouse chromosome 14.

4. Examine the genomic context of one of the genes returned in the search by selecting
a term in the Map Element column. Alternatively, to view all hits on a single
chromosome, click on the chromosome number in the top graphic, or on the term
“all matches” next to the chromosome number in the bottom table.
For example, to see the hits to Adam2, Adam7, and Adam28 on chromosome 14, click on
“all matches” next to chromosome 14. The results are shown in Figure 18.5.15.
Only two maps are displayed, since the query was limited to the Gene (Genes seq) map.
The Assembly map shows all the genome assemblies available in this region of the mouse
genome (see below). Genes seq, as for the human Map Viewer, displays the annotated
genes. Other maps can be added using the Maps and Options button and are described
in detail in the link called Mouse Maps Help near the top of the left sidebar.
5. The NCBI Map Viewer integrates data from the mouse reference assembly (mouse
strain C57BL/6J) as well as several alternate assemblies. The mixed-strain Celera
assembly is nearly complete, while the BAC-based assemblies from other strains
including 129 substrain, A/J, B6/CBAF1J, Balb/c, C3H, and NOD are partial. The
assemblies available in a particular genomic region are listed in the Assembly map.
The blue vertical line indicates the assembly being viewed. By default, the reference
assembly is shown in blue. The assembly can be changed in the Maps & Options
window (accessed as in the Basic Protocol).
In this region of the genome, the available assemblies are Celera and C57BL/6J (the
reference assembly; Fig. 18.5.15). Each assembly has its own associated set of annotations.
By default, the C57BL/6J assembly is colored blue and its annotations are shown. But the
assembly can be changed in the Maps & Options window by selecting the assembly name
from the assembly pull-down menu and clicking Change Assembly. Maps corresponding
to that genome assembly can then be added to the display in the Maps & Options
window.
For example, use the Maps & Options window to change the assembly to Celera. Then
add the Celera Gene map to the display using the Maps & Options window. The resulting
graphic (Fig. 18.5.16) shows two slightly different Genes seq maps, one from the reference
assembly, and one from the Celera assembly. The word “Celera” is written vertically near
the top of the Celera Gene map.

Figure 18.5.16 The Gene maps for the Celera mouse genome assembly and Reference mouse
genome assembly are slightly different. The word “Celera” is written vertically near the top of the
Celera Gene map.

6. The navigation around all Map Viewers, including that for mouse, is similar to the
procedures described in the preceding protocols for human.

COMMENTARY
Background Information
At least three strategies have been used to sequence the 110 genomes available through
the NCBI Map Viewer (reviewed in Green, 2001). The International Human Genome
Sequencing Consortium has relied on a method called clone-by-clone shotgun sequencing to
generate the sequence of the human genome. In brief, the genome is partitioned into a set of
mapped, overlapping clones, and these clones are subjected to shotgun sequencing. In
whole-genome shotgun sequencing, as used by Celera Genomics to decipher the sequence of the
human genome, the entire genome is fragmented into pieces, and these pieces are sequenced and
assembled. This strategy may bypass the need for a clone-based physical map, the first step
in clone-by-clone shotgun sequencing. Many other publicly available genome sequences are
being generated by a hybrid approach using both methods. Newer genome assembly
methods incorporate NextGen sequence data (DiGuistini et al., 2009).

The working draft sequence of the human genome was published in 2001 (Lander
et al., 2001) and the finished genome sequence announced in 2003 (International Human
Genome Sequencing Consortium, 2004). A sequence becomes finished when it has been
determined at an accuracy of at least 99.99% and has no gaps; sequence data falling short of that
benchmark, but which can be positioned along the physical map of the chromosomes, are
termed “draft.” Sequences of draft clones are deposited into the High Throughput Genomic
(HTG) division of DDBJ/EMBL/GenBank, where they receive an accession number.
These draft clones may contain gaps, unordered or unoriented contigs, or sequencing
errors. As the sequence of the clone is completed, the sequence represented by the
accession number is updated until the clone is considered finished. Finished clones are
moved from HTG to the Primate division of DDBJ/EMBL/GenBank.

Until early 2009, NCBI assembled the sequences of the individual human genomic
clones into contigs and chromosomes. These assemblies were performed using the
sequence data in GenBank as of a set date. A description of this process is at
http://www.ncbi.nlm.nih.gov/genome/guide/human/release_notes.html. The contigs became part
of the NCBI RefSeq project (Pruitt et al., 2007), and were annotated with genes and
other features and assigned an accession number of the format NT_######. The UCSC
Genome Browser (Karolchik et al., 2009) and Ensembl (Fernández-Suárez and Schuster,
2010) used the NCBI chromosomes as the starting material in their annotation pipelines.
Since 2009, the human genome has been assembled by the Genome Reference
Consortium (GRC), a collaboration among the Wellcome Trust Sanger Institute, the Genome
Center at Washington University, the European Bioinformatics Institute, and the NCBI
(http://www.ncbi.nlm.nih.gov/projects/genome/assembly/grc/). The most recent assembly,
GRCh37, is the first one produced by the GRC. It is composed of a primary assembly,
which includes all assembled chromosomes as well as unplaced or unlocalized sequences,
and a set of alternate loci, regions for which there is large-scale variation. For GRCh37, the
primary assembly consists of the sequences of 22 autosomes, chromosomes X and Y,
and the mitochondrial genome, as well as unlocalized sequences, which are assigned to a
chromosome but not positioned on it, and unplaced sequences, for which the
corresponding chromosome has not been determined. The alternate loci include one
alternate contig for the UGT2B17 region on chromosome 4, seven alternate contigs
for the MHC region on chromosome 6, and one alternate contig for the MAPT region on
chromosome 17.

The mouse genome sequence has been produced by a hybrid strategy that involves a
combination of whole-genome shotgun and clone-by-clone sequencing (Waterston et al., 2002).
The assembly process is described at
http://www.ncbi.nlm.nih.gov/genome/guide/mouse/release_notes.html. The current mouse
genome assembly is Build 37, produced by NCBI in consultation with the Mouse
Genome Sequencing Consortium (MGSC). This assembly of the reference sequence from
strain C57BL/6J is composed mostly of finished clone sequences, with some additional
whole-genome shotgun sequence contigs. The primary assembly for mouse Build 37
includes the sequences of 19 autosomes, chromosomes X and Y, and the mitochondrial
genome. This reference assembly is also used by the UCSC Genome Browser and Ensembl
as starting material for their annotation pipelines. In addition to the reference mouse
assembly of strain C57BL/6J, the Map Viewer also includes the nearly complete
mixed-strain Celera assembly as well as partial BAC-based assemblies from other
strains including several 129 substrains, A/J, AKR/J, B6/CBAF1J, Balb/c, C3H, Cast/Ei,
NOD, SJL/J, Spret/Ei, and unknown. In the future, mouse genome assemblies will be
performed by the GRC. The GRC is actively working to close known gaps in the current
reference assembly.

Most genomes are assembled by an external group and then passed on to NCBI,
where they are annotated via the NCBI annotation pipeline. Map Viewers are available
for the following assembled genomes: rhesus macaque (Rhesus Macaque Genome
Sequencing and Analysis Consortium, 2007); chimpanzee (Chimpanzee Sequencing and
Analysis Consortium, 2005); Brown Norway rat (Gibbs et al., 2004); duck-billed platypus
(Warren et al., 2008); opossum (Mikkelsen et al., 2007); cattle (Bovine Genome
Sequencing and Analysis Consortium, 2009); domestic dog (Lindblad-Toh et al., 2005); and
chicken (Hillier et al., 2004). In addition, NCBI has developed Map Viewers for 100
additional vertebrates, invertebrates, protozoa, plants, and fungi. The amount of sequence and
annotation available in these Map Viewers is variable.

Once the genome sequence is available, NCBI runs it through the NCBI annotation
pipeline, described at http://www.ncbi.nlm.nih.gov/genome/guide/build.shtml. Various
features are annotated, including STSs, SNPs, repeats, and genes. Genes are annotated in a
two-step process including reference sequence (RefSeq) transcript alignments and Gnomon
predictions. Gene models produced by NCBI’s Genome Annotation project may differ from
curated RefSeqs or other GenBank mRNA sequences. They will have accession numbers
that begin with the prefixes XM (mRNA), XR (non-coding transcript), and XP (protein).
These sequences should be used with caution, as they have not been curated. NCBI’s
reference sequence project is described at http://www.ncbi.nlm.nih.gov/RefSeq/.

The Map Viewer for each organism integrates genomic data from a number of different
sources. For example, the human Map Viewer includes sequence, cytogenetic, genetic
linkage, and radiation hybrid maps. Some of these maps were generated by NCBI, and others
are taken from the scientific literature. A description of the maps available for each
organism is available by following the link called “[Organism] Maps Help” in the left blue
sidebar of most Map Viewer pages.

Critical Parameters and Troubleshooting
The human, mouse, and other organism genomic sequences are a work in progress, and
updates will continue. The GRC will periodically update the human and mouse genome
assemblies based on new sequence data. The “build” number is displayed prominently at the
top of each Map Viewer page. The examples in this unit were all illustrated using human
build 37 and mouse build 37. Although users working on later mouse and human genome
builds will not be able to recreate the exact figures shown here, the queries themselves will
remain valid.

At present, only the two most recent human and mouse builds are available at NCBI;
older builds can be retrieved from the UCSC Genome Browser or from Ensembl. Sequence
coordinates along the chromosome frequently change from build to build. Furthermore,
changes in sequence data or algorithm implementation can sometimes cause large changes
in the assembly; genes can move around within, or even between, chromosomes. In
some cases, the assembly provided by NCBI may not be correct because of errors in the
build process or in the underlying data. If a region of the assembly is suspect, it may be
worth reviewing older versions of the genome assembly at UCSC or Ensembl.

Suggestions for Further Analysis
All of the data presented in the Map Viewer are also available for download from the NCBI
FTP site. Advanced users who want to write their own scripts to manipulate the data can
access text files from organism-specific directories at ftp://ftp.ncbi.nih.gov/genomes/.
Neither the Web interface nor the databases can be downloaded at this time.

The NCBI Map Viewer is only one view of the human genome. In many cases, it may be
useful to look at the same region of the genome using the UCSC Genome Browser or Ensembl.
Since the three sites use different methods to align mRNAs and ESTs to the genome, as well
as different gene-prediction algorithms, the positions or numbers of predicted genes may
vary. When doing such comparisons, however, one must be careful to check that the same
assembly of the genome is being viewed at each site. The user may also discover that
different sites are better for different types of queries. Query results may load more quickly
in the UCSC Genome Browser than in either the Map Viewer or Ensembl, making UCSC
more efficient for fast searches. At NCBI, sequence comparisons against the genome are
performed with BLAST. UCSC provides the BLAT program, which, while not as sensitive
as BLAST, is often much faster. Ensembl provides both BLAST and BLAT search
capabilities. Both UCSC and Ensembl allow users to display their own data in the context of the
publicly available annotations, a tool that NCBI does not yet provide. However, NCBI provides
more non-sequence-based maps, such as the Mitelman Breakpoint, deCODE, and Stanford
G3 maps. In short, in order to make the most of the human genome data, users should learn
to use all three sites.

Disclaimer
This unit was written by Dr. Tyra G. Wolfsberg in her private capacity. No official
support or endorsement by the National Institutes of Health or the United States
Department of Health and Human Services is intended or should be inferred.

Literature Cited
Borate, B. and Baxevanis, A. D. 2009. Searching Online Mendelian Inheritance in Man (OMIM)
for Information on Genetic Loci Involved in Human Disease. Curr. Protoc. Bioinform.
27:1.2.1-1.2.13.

Bovine Genome Sequencing and Analysis Consortium. 2009. The genome sequence of taurine
cattle: A window to ruminant biology and evolution. Science 324:522-528.

Chimpanzee Sequencing and Analysis Consortium. 2005. Initial sequence of the
chimpanzee genome and comparison with the human genome. Nature 437:69-87.

DiGuistini, S., Liao, N.Y., Platt, D., Robertson, G.,
Seidel, M., Chan, S.K., Docking, T.R., Birol, I.,
Holt, R.A., Hirst, M., Mardis, E., Marra, M.A.,
Hamelin, R.C., Bohlmann, J., Breuil, C., and
Jones, S.J. 2009. De novo genome sequence assembly
of a filamentous fungus using Sanger,
454 and Illumina sequence data. Genome Biol.
10:R94.

Gibbs, R.A., Weinstock, G.M., Metzker, M.L.,
Muzny, D.M., Sodergren, E.J., Scherer, S.,
Scott, G., Steffen, D., Worley, K.C., Burch, P.E.,
Okwuonu, G., Hines, S., Lewis, L., DeRamo,
C., Delgado, O., Dugan-Rocha, S., Miner,
G., Morgan, M., Hawes, A., Gill, R., Celera,
Holt, R.A., Adams, M.D., Amanatides, P.G.,
Baden-Tillson, H., Barnstead, M., Chin, S.,
Evans, C.A., Ferriera, S., Fosler, C., Glodek,
A., Gu, Z., Jennings, D., Kraft, C.L., Nguyen,
T., Pfannkoch, C.M., Sitter, C., Sutton, G.G.,
Venter, J.C., Woodage, T., Smith, D., Lee, H.M.,
Gustafson, E., Cahill, P., Kana, A., Doucette-Stamm,
L., Weinstock, K., Fechtel, K., Weiss,
R.B., Dunn, D.M., Green, E.D., Blakesley,
R.W., Bouffard, G.G., De Jong, P.J., Osoegawa,
K., Zhu, B., Marra, M., Schein, J., Bosdet, I.,
Fjell, C., Jones, S., Krzywinski, M., Mathewson,
C., Siddiqui, A., Wye, N., McPherson, J.,
Zhao, S., Fraser, C.M., Shetty, J., Shatsman,
S., Geer, K., Chen, Y., Abramzon, S., Nierman,
W.C., Havlak, P.H., Chen, R., Durbin, K.J.,
Egan, A., Ren, Y., Song, X.Z., Li, B., Liu,
Y., Qin, X., Cawley, S., Worley, K.C., Cooney,
A.J., D’Souza, L.M., Martin, K., Wu, J.Q.,
Gonzalez-Garay, M.L., Jackson, A.R., Kalafus,
K.J., McLeod, M.P., Milosavljevic, A., Virk,
D., Volkov, A., Wheeler, D.A., Zhang, Z.,
Bailey, J.A., Eichler, E.E., Tuzun, E., Birney,
E., Mongin, E., Ureta-Vidal, A., Woodwark, C.,
Zdobnov, E., Bork, P., Suyama, M., Torrents, D.,
Alexandersson, M., Trask, B.J., Young, J.M.,
Huang, H., Wang, H., Xing, H., Daniels, S.,
Gietzen, D., Schmidt, J., Stevens, K., Vitt, U.,
Wingrove, J., Camara, F., Mar Alba, M., Abril,
J.F., Guigo, R., Smit, A., Dubchak, I., Rubin,
E.M., Couronne, O., Poliakov, A., Hubner, N.,
Ganten, D., Goesele, C., Hummel, O., Kreitler,
T., Lee, Y.A., Monti, J., Schulz, H., Zimdahl,
H., Himmelbauer, H., Lehrach, H., Jacob, H.J.,
Bromberg, S., Gullings-Handley, J., Jensen-Seaman,
M.I., Kwitek, A.E., Lazar, J., Pasko,
D., Tonellato, P.J., Twigger, S., Ponting, C.P.,
Duarte, J.M., Rice, S., Goodstadt, L., Beatson,
S.A., Emes, R.D., Winter, E.E., Webber,
C., Brandt, P., Nyakatura, G., Adetobi, M.,
Chiaromonte, F., Elnitski, L., Eswara, P.,
Hardison, R.C., Hou, M., Kolbe, D., Makova,
K., Miller, W., Nekrutenko, A., Riemer, C.,
Schwartz, S., Taylor, J., Yang, S., Zhang, Y.,
Lindpaintner, K., Andrews, T.D., Caccamo, M.,
Clamp, M., Clarke, L., Curwen, V., Durbin,
R., Eyras, E., Searle, S.M., Cooper, G.M.,
Batzoglou, S., Brudno, M., Sidow, A., Stone,
E.A., Venter, J.C., Payseur, B.A., Bourque, G.,
Lopez-Otin, C., Puente, X.S., Chakrabarti, K.,
Chatterji, S., Dewey, C., Pachter, L., Bray, N.,
Yap, V.B., Caspi, A., Tesler, G., Pevzner, P.A.,
Haussler, D., Roskin, K.M., Baertsch, R.,
Clawson, H., Furey, T.S., Hinrichs, A.S.,
Karolchik, D., Kent, W.J., Rosenbloom, K.R.,
Trumbower, H., Weirauch, M., Cooper, D.N.,
Stenson, P.D., Ma, B., Brent, M., Arumugam,
M., Shteynberg, D., Copley, R.R., Taylor, M.S.,
Riethman, H., Mudunuri, U., Peterson, J.,
Guyer, M., Felsenfeld, A., Old, S., Mockrin,
S., and Collins, F. 2004. Genome sequence of
the Brown Norway rat yields insights into
mammalian evolution. Nature 428:493-521.

Fernández-Suárez, X. M. and Schuster, M. K. 2010.
Using the Ensembl Genome Server to Browse
Genomic Sequence Data. Curr. Protoc. Bioinform.
30:1.15.1-1.15.48.

Green, E.D. 2001. Strategies for the systematic
sequencing of complex genomes. Nat. Rev. Genet.
2:573-583.

Hillier, L.W., Miller, W., Birney, E., Warren, W.,
Hardison, R.C., Ponting, C.P., Bork, P., Burt,
D.W., Groenen, M.A., Delany, M.E., Dodgson,
J.B., Chinwalla, A.T., Cliften, P.F., Clifton,
S.W., Delehaunty, K.D., Fronick, C., Fulton,
R.S., Graves, T.A., Kremitzki, C., Layman,
D., Magrini, V., McPherson, J.D., Miner, T.L.,
Minx, P., Nash, W.E., Nhan, M.N., Nelson,
J.O., Oddy, L.G., Pohl, C.S., Randall-Maher,
J., Smith, S.M., Wallis, J.W., Yang, S.P.,
Romanov, M.N., Rondelli, C.M., Paton, B.,
Smith, J., Morrice, D., Daniels, L., Tempest,
H.G., Robertson, L., Masabanda, J.S., Griffin,
D.K., Vignal, A., Fillon, V., Jacobbson, L.,
Kerje, S., Andersson, L., Crooijmans, R.P.,
Aerts, J., van der Poel, J.J., Ellegren, H.,
Caldwell, R.B., Hubbard, S.J., Grafham, D.V.,
Kierzek, A.M., McLaren, S.R., Overton, I.M.,
Arakawa, H., Beattie, K.J., Bezzubov, Y.,
Boardman, P.E., Bonfield, J.K., Croning, M.D.,
Davies, R.M., Francis, M.D., Humphray, S.J.,
Scott, C.E., Taylor, R.G., Tickle, C., Brown,
W.R., Rogers, J., Buerstedde, J.M., Wilson,
S.A., Stubbs, L., Ovcharenko, I., Gordon, L.,
Lucas, S., Miller, M.M., Inoko, H., Shiina, T.,
Kaufman, J., Salomonsen, J., Skjoedt, K., Wong,
G.K., Wang, J., Liu, B., Wang, J., Yu, J., Yang,
H., Nefedov, M., Koriabine, M., Dejong, P.J.,
Goodstadt, L., Webber, C., Dickens, N.J.,
Letunic, I., Suyama, M., Torrents, D., von
Mering, C., Zdobnov, E.M., Makova, K.,
Nekrutenko, A., Elnitski, L., Eswara, P., King,
D.C., Yang, S., Tyekucheva, S., Radakrishnan,
A., Harris, R.S., Chiaromonte, F., Taylor, J.,
He, J., Rijnkels, M., Griffiths-Jones, S., Ureta-Vidal,
A., Hoffman, M.M., Severin, J., Searle,
S.M., Law, A.S., Speed, D., Waddington, D.,
Cheng, Z., Tuzun, E., Eichler, E., Bao, Z., Flicek,
P., Shteynberg, D.D., Brent, M.R., Bye, J.M.,
Huckle, E.J., Chatterji, S., Dewey, C.,
Pachter, L., Kouranov, A., Mourelatos, Z.,
Hatzigeorgiou, A.G., Paterson, A.H., Ivarie, R.,
Brandstrom, M., Axelsson, E., Backstrom, N.,
Berlin, S., Webster, M.T., Pourquie, O.,
Reymond, A., Ucla, C., Antonarakis, S.E., Long,
M., Emerson, J.J., Betran, E., Dupanloup, I.,
Kaessmann, H., Hinrichs, A.S., Bejerano, G.,
Furey, T.S., Harte, R.A., Raney, B., Siepel, A.,
Kent, W.J., Haussler, D., Eyras, E., Castelo, R.,
Abril, J.F., Castellano, S., Camara, F., Parra,
G., Guigo, R., Bourque, G., Tesler, G., Pevzner,
P.A., Smit, A., Fulton, L.A., Mardis, E.R., and
Wilson, R.K. 2004. Sequence and comparative
analysis of the chicken genome provide
unique perspectives on vertebrate evolution.
Nature 432:695-716.

International Human Genome Sequencing Consortium.
2004. Finishing the euchromatic sequence
of the human genome. Nature 431:931-945.

Karolchik, D., Hinrichs, A. S., and Kent, W. J.
2009. The UCSC Genome Browser. Curr. Protoc.
Bioinform. 28:1.4.1-1.4.26.

Lander, E.S., Linton, L.M., Birren, B., Nusbaum,
C., Zody, M.C., Baldwin, J., Devon, K., Dewar,
K., Doyle, M., FitzHugh, W., Funke, R., Gage,
D., Harris, K., Heaford, A., Howland, J., Kann,
L., Lehoczky, J., LeVine, R., McEwan, P.,
McKernan, K., Meldrim, J., Mesirov, J.P.,
Miranda, C., Morris, W., Naylor, J., Raymond,
C., Rosetti, M., Santos, R., Sheridan, A.,
Sougnez, C., Stange-Thomann, N., Stojanovic,
N., Subramanian, A., Wyman, D., Rogers, J.,
Sulston, J., Ainscough, R., Beck, S., Bentley,
D., Burton, J., Clee, C., Carter, N., Coulson,
A., Deadman, R., Deloukas, P., Dunham, A.,
Dunham, I., Durbin, R., French, L., Grafham,
D., Gregory, S., Hubbard, T., Humphray, S.,
Hunt, A., Jones, M., Lloyd, C., McMurray, A.,
Matthews, L., Mercer, S., Milne, S., Mullikin,
J.C., Mungall, A., Plumb, R., Ross, M.,
Shownkeen, R., Sims, S., Waterston, R.H.,
Wilson, R.K., Hillier, L.W., McPherson, J.D.,
Marra, M.A., Mardis, E.R., Fulton, L.A.,
Chinwalla, A.T., Pepin, K.H., Gish, W.R.,
Chissoe, S.L., Wendl, M.C., Delehaunty, K.D.,
Miner, T.L., Delehaunty, A., Kramer, J.B.,
Cook, L.L., Fulton, R.S., Johnson, D.L., Minx,
P.J., Clifton, S.W., Hawkins, T., Branscomb, E.,
Predki, P., Richardson, P., Wenning, S., Slezak,
T., Doggett, N., Cheng, J.F., Olsen, A., Lucas,
S., Elkin, C., Uberbacher, E., Frazier, M., Gibbs,
R.A., Muzny, D.M., Scherer, S.E., Bouck, J.B.,
Sodergren, E.J., Worley, K.C., Rives, C.M.,
Gorrell, J.H., Metzker, M.L., Naylor, S.L.,
Kucherlapati, R.S., Nelson, D.L., Weinstock,
G.M., Sakaki, Y., Fujiyama, A., Hattori, M.,
Yada, T., Toyoda, A., Itoh, T., Kawagoe, C.,
Watanabe, H., Totoki, Y., Taylor, T.,
Weissenbach, J., Heilig, R., Saurin, W.,
Artiguenave, F., Brottier, P., Bruls, T.,
Pelletier, E., Robert, C., Wincker, P., Smith,
D.R., Doucette-Stamm, L., Rubenfield,
M., Weinstock, K., Lee, H.M., Dubois, J.,
Rosenthal, A., Platzer, M., Nyakatura, G.,
Taudien, S., Rump, A., Yang, H., Yu, J., Wang,
J., Huang, G., Gu, J., Hood, L., Rowen, L.,
Madan, A., Qin, S., Davis, R.W., Federspiel,
N.A., Abola, A.P., Proctor, M.J., Myers, R.M.,
Schmutz, J., Dickson, M., Grimwood, J., Cox,
D.R., Olson, M.V., Kaul, R., Raymond, C.,
Shimizu, N., Kawasaki, K., Minoshima, S.,
Evans, G.A., Athanasiou, M., Schultz, R., Roe,
B.A., Chen, F., Pan, H., Ramser, J., Lehrach, H.,
Reinhardt, R., McCombie, W.R., de la Bastide,
M., Dedhia, N., Blocker, H., Hornischer, K.,
Nordsiek, G., Agarwala, R., Aravind, L., Bailey,
J.A., Bateman, A., Batzoglou, S., Birney, E.,
Bork, P., Brown, D.G., Burge, C.B., Cerutti, L.,
Chen, H.C., Church, D., Clamp, M., Copley,
R.R., Doerks, T., Eddy, S.R., Eichler, E.E.,
Furey, T.S., Galagan, J., Gilbert, J.G., Harmon,
C., Hayashizaki, Y., Haussler, D., Hermjakob,
H., Hokamp, K., Jang, W., Johnson, L.S., Jones,
T.A., Kasif, S., Kaspryzk, A., Kennedy, S.,
Kent, W.J., Kitts, P., Koonin, E.V., Korf, I.,
Kulp, D., Lancet, D., Lowe, T.M., McLysaght,
A., Mikkelsen, T., Moran, J.V., Mulder, N.,
Pollara, V.J., Ponting, C.P., Schuler, G.,
Schultz, J., Slater, G., Smit, A.F., Stupka, E.,
Szustakowski, J., Thierry-Mieg, D., Thierry-Mieg,
J., Wagner, L., Wallis, J., Wheeler,
R., Williams, A., Wolf, Y.I., Wolfe, K.H.,
Yang, S.P., Yeh, R.F., Collins, F., Guyer, M.S.,
Peterson, J., Felsenfeld, A., Wetterstrand,
K.A., Patrinos, A., Morgan, M.J., de Jong,
P., Catanese, J.J., Osoegawa, K., Shizuya, H.,
Choi, S., and Chen, Y.J. 2001. Initial sequencing
and analysis of the human genome. Nature
409:860-921.

Lindblad-Toh, K., Wade, C.M., Mikkelsen, T.S.,
Karlsson, E.K., Jaffe, D.B., Kamal, M., Clamp,
M., Chang, J.L., Kulbokas, E.J. 3rd, Zody,
M.C., Mauceli, E., Xie, X., Breen, M., Wayne,
R.K., Ostrander, E.A., Ponting, C.P., Galibert,
F., Smith, D.R., DeJong, P.J., Kirkness, E.,
Alvarez, P., Biagi, T., Brockman, W., Butler,
J., Chin, C.W., Cook, A., Cuff, J., Daly, M.J.,
DeCaprio, D., Gnerre, S., Grabherr, M., Kellis,
M., Kleber, M., Bardeleben, C., Goodstadt,
L., Heger, A., Hitte, C., Kim, L., Koepfli,
K.P., Parker, H.G., Pollinger, J.P., Searle, S.M.,
Sutter, N.B., Thomas, R., Webber, C., Baldwin,
J., Abebe, A., Abouelleil, A., Aftuck, L.,
Ait-Zahra, M., Aldredge, T., Allen, N., An,
P., Anderson, S., Antoine, C., Arachchi, H.,
Aslam, A., Ayotte, L., Bachantsang, P., Barry,
A., Bayul, T., Benamara, M., Berlin, A.,
Bessette, D., Blitshteyn, B., Bloom, T., Blye,
J., Boguslavskiy, L., Bonnet, C., Boukhgalter,
B., Brown, A., Cahill, P., Calixte, N., Camarata,
J., Cheshatsang, Y., Chu, J., Citroen, M.,
Collymore, A., Cooke, P., Dawoe, T., Daza, R.,
Decktor, K., DeGray, S., Dhargay, N., Dooley,
K., Dooley, K., Dorje, P., Dorjee, K., Dorris, L.,
Duffey, N., Dupes, A., Egbiremolen, O., Elong,
R., Falk, J., Farina, A., Faro, S., Ferguson, D.,
Ferreira, P., Fisher, S., FitzGerald, M., Foley,
K., Foley, C., Franke, A., Friedrich, D., Gage,
D., Garber, M., Gearin, G., Giannoukos, G.,
Goode, T., Goyette, A., Graham, J., Grandbois,
E., Gyaltsen, K., Hafez, N., Hagopian, D.,
Hagos, B., Hall, J., Healy, C., Hegarty, R.,
Honan, T., Horn, A., Houde, N., Hughes, L.,
Hunnicutt, L., Husby, M., Jester, B., Jones, C.,
Kamat, A., Kanga, B., Kells, C., Khazanovich,
D., Kieu, A.C., Kisner, P., Kumar, M., Lance,
K., Landers, T., Lara, M., Lee, W., Leger, J.P.,
Lennon, N., Leuper, L., LeVine, S., Liu, J.,
Liu, X., Lokyitsang, Y., Lokyitsang, T., Lui, A.,
Macdonald, J., Major, J., Marabella, R., Maru,
K., Matthews, C., McDonough, S., Mehta, T.,
Meldrim, J., Melnikov, A., Meneus, L., Mihalev,
A., Mihova, T., Miller, K., Mittelman, R.,
Mlenga, V., Mulrain, L., Munson, G., Navidi, Wakefield, M.J., Olender, T., Lancet, D.,
A., Naylor, J., Nguyen, T., Nguyen, N., Nguyen, Huttley, G.A., Smit, A.F., Pask, A., Temple-
C., Nguyen, T., Nicol, R., Norbu, N., Norbu, Smith, P., Batzer, M.A., Walker, JA., Konkel,
C., Novod, N., Nyima, T., Olandt, P., O’Neill, M.K., Harris, R.S., Whittington, C.M., Wong,
B., O’Neill, K., Osman, S., Oyono, L., Patti, E.S., Gemmell, N.J., Buschiazzo, E., Vargas
C., Perrin, D., Phunkhang, P., Pierre, F., Priest, Jentzsch, I.M., Merkel, A., Schmitz, J., Zemann,
M., Rachupka, A., Raghuraman, S., Rameau, A., Churakov, G., Kriegs, J.O., Brosius, J.,
R., Ray, V., Raymond, C., Rege, F., Rise, C., Murchison, E.P., Sachidanandam, R., Smith,
Rogers, J., Rogov, P., Sahalie, J., Settipalli, C., Hannon, G.J., Tsend-Ayush, E., McMillan,
S., Sharpe, T., Shea, T., Sheehan, M., Sherpa, D., Attenborough, R., Rens, W., Ferguson-
N., Shi, J., Shih, D., Sloan, J., Smith, C., Smith, M., Lefèvre, C.M., Sharp, J.A., Nicholas,
Sparrow, T., Stalker, J., Stange-Thomann, N., K.R., Ray, D.A., Kube, M., Reinhardt, R.,
Stavropoulos, S., Stone, C., Stone, S., Sykes, S., Pringle, T.H., Taylor, J., Jones, R.C., Nixon, B.,
Tchuinga, P., Tenzing, P., Tesfaye, S., Dacheux, J.L., Niwa, H., Sekita, Y., Huang, X.,
Thoulutsang, D., Thoulutsang, Y., Topham, Stark, A., Kheradpour, P., Kellis, M., Flicek,
K., Topping, I., Tsamla, T., Vassiliev, H., P., Chen, Y., Webber, C., Hardison, R.,
Venkataraman, V., Vo, A., Wangchuk, T., Nelson, J., Hallsworth-Pepin, K., Delehaunty,
Wangdi, T., Weiand, M., Wilkinson, J., Wilson, K., Markovic, C., Minx, P., Feng, Y., Kremitzki,
A., Yadav, S., Yang, S., Yang, X., Young, G., C., Mitreva, M., Glasscock, J., Wylie, T.,
Yu, Q., Zainoun, J., Zembek, L., Zimmer, A., Wohldmann, P., Thiru, P., Nhan, M.N., Pohl,
and Lander, E.S. 2005. Genome sequence, com- CS., Smith, S.M., Hou, S., Nefedov, M., de Jong,
parative analysis and haplotype structure of the P.J., Renfree, M.B., Mardis, E.R., and Wilson,
domestic dog. Nature 438:803-819. R.K. 2008. Genome analysis of the platypus
Mikkelsen, T.S., Wakefield, M.J., Aken, B., reveals unique signatures of evolution. Nature
Amemiya, C.T., Chang, J.L., Duke, S., Garber, 453:175-183.
M., Gentles, A.J., Goodstadt, L., Heger, A., Waterston, R.H., Lindblad-Toh, K., Birney, E.,
Jurka, J., Kamal, M., Mauceli, E., Searle, S.M., Rogers, J., Abril, J.F., Agarwal, P., Agarwala,
Sharpe, T., Baker, M.L., Batzer, M.A., Benos, R., Ainscough, R., Alexandersson, M., An, P.,
P.V., Belov, K., Clamp, M., Cook, A., Cuff, Antonarakis, S.E., Attwood, J., Baertsch, R.,
J., Das, R., Davidow, L., Deakin, J.E., Fazzari, Bailey, J., Barlow, K., Beck, S., Berry, E.,
M.J., Glass, J.L., Grabherr, M., Greally, J.M., Birren, B., Bloom, T., Bork, P., Botcherby, M.,
Gu, W., Hore, T.A., Huttley, G.A., Kleber, M., Bray, N., Brent, M.R., Brown, D.G., Brown,
Jirtle, R.L., Koina, E., Lee, J.T., Mahony, S., S.D., Bult, C., Burton, J., Butler, J., Campbell,
Marra, M.A., Miller, R.D., Nicholls, R.D., Oda, R.D., Carninci, P., Cawley, S., Chiaromonte,
M., Papenfuss, A.T., Parra, Z.E., Pollock, D.D., F., Chinwalla, A.T., Church, D.M., Clamp, M.,
Ray, D.A., Schein, J.E., Speed, T.P., Thompson, Clee, C., Collins, F.S., Cook, L.L., Copley, R.R.,
K., VandeBerg, J.L., Wade, C.M., Walker, Coulson, A., Couronne, O., Cuff, J., Curwen, V.,
J.A., Waters, P.D., Webber, C., Weidman, J.R., Cutts, T., Daly, M., David, R., Davies, J.,
Xie, X., Zody, M.C.; Broad Institute Genome Delehaunty, K.D., Deri, J., Dermitzakis, E.T.,
Sequencing Platform; Broad Institute Whole Dewey, C., Dickens, N.J., Diekhans, M., Dodge,
Genome Assembly Team, Graves, J.A., Ponting, S., Dubchak, I., Dunn, D.M., Eddy, S.R.,
C.P., Breen, M., Samollow, P.B., Lander, E.S., Elnitski, L., Emes, R.D., Eswara, P., Eyras, E.,
and Lindblad-Toh, K. 2007. Genome of the mar- Felsenfeld, A., Fewell, G.A., Flicek, P., Foley,
supial Monodelphis domestica reveals innova- K., Frankel, W.N., Fulton, L.A., Fulton, R.S.,
tion in non-coding sequences. Nature 447:167- Furey, T.S., Gage, D., Gibbs, R.A., Glusman,
177. G., Gnerre, S., Goldman, N., Goodstadt, L.,
Grafham, D., Graves, T.A., Green, E.D.,
Pruitt, K.D., Tatusova, T., and Maglott, D.R. 2007. Gregory, S., Guigo, R., Guyer, M., Hardison,
NCBI reference sequences (RefSeq): A curated R.C., Haussler, D., Hayashizaki, Y., Hillier,
non-redundant sequence database of genomes, L.W., Hinrichs, A., Hlavina, W., Holzer, T.,
transcripts and proteins. Nucleic Acids Res. Hsu, F., Hua, A., Hubbard, T., Hunt, A., Jackson,
35:D61-D65. I., Jaffe, D.B., Johnson, L.S., Jones, M., Jones,
Rhesus Macaque Genome Sequencing and Analysis T.A., Joy, A., Kamal, M., Karlsson, E.K.,
Consortium. 2007. Evolutionary and biomedical Karolchik, D., Kasprzyk, A., Kawai, J., Keibler,
insights from the rhesus macaque genome. Sci- E., Kells, C., Kent, W.J., Kirby, A., Kolbe,
ence 316:222-234. D.L., Korf, I., Kucherlapati, R.S., Kulbokas,
E.J., Kulp, D., Landers, T., Leger, J.P., Leonard,
Schuler, G.D. 1998. Electronic PCR: Bridging the
S., Letunic, I., Levine, R., Li, J., Li, M., Lloyd,
gap between genome mapping and genome se-
C., Lucas, S., Ma, B., Maglott, D.R., Mardis,
quencing. Trends Biotechnol. 16:456-459.
E.R., Matthews, L., Mauceli, E., Mayer, J.H.,
Warren, W.C., Hillier, L.W., Marshall Graves, J.A., McCarthy, M., McCombie, W.R., McLaren,
Birney, E., Ponting, C.P., Grützner, F., Belov, S., McLay, K., McPherson, J.D., Meldrim, J.,
K., Miller, W., Clarke, L., Chinwalla, A.T., Meredith, B., Mesirov, J.P., Miller, W., Miner,
Yang, S.P., Heger, A., Locke, D.P., Miethke, P., T.L., Mongin, E., Montgomery, K.T., Morgan,
Waters, P.D., Veyrunes, F., Fulton, L., Fulton, M., Mott, R., Mullikin, J.C., Muzny, D.M.,
NCBI Map Viewer B., Graves, T., Wallis, J., Puente, X.S., López- Nash, W.E., Nelson, J.O., Nhan, M.N., Nicol,
to Browse Otı́n, C., Ordóñez, G.R., Eichler, E.E., Chen, R., Ning, Z., Nusbaum, C., O’Connor, M.J.,
Genomic L., Cheng, Z., Deakin, J.E., Alsop, A., Okazaki, Y., Oliver, K., Overton-Larty, E.,
Sequence Data Thompson, K., Kirby, P., Papenfuss, A.T., Pachter, L., Parra, G., Pepin, K.H., Peterson, J.,
18.5.24
Supplement 69 Current Protocols in Human Genetics
Pevzner, P., Plumb, R., Pohl, C.S., Poliakov, A.,
Ponce, T.C., Ponting, C.P., Potter, S., Quail, M.,
Reymond, A., Roe, B.A., Roskin, K.M., Rubin,
E.M., Rust, A.G., Santos, R., Sapojnikov,
V., Schultz, B., Schultz, J., Schwartz, M.S.,
Schwartz, S., Scott, C., Seaman, S., Searle, S.,
Sharpe, T., Sheridan, A., Shownkeen, R., Sims,
S., Singer, J.B., Slater, G., Smit, A., Smith, D.R.,
Spencer, B., Stabenau, A., Stange-Thomann, N.,
Sugnet, C., Suyama, M., Tesler, G., Thompson,
J., Torrents, D., Trevaskis, E., Tromp, J.,
Ucla, C., Ureta-Vidal, A., Vinson, J.P., Von
Niederhausern, A.C., Wade, C.M., Wall, M.,
Weber, R.J., Weiss, R.B., Wendl, M.C., West,
A.P., Wetterstrand, K., Wheeler, R., Whelan, S.,
Wierzbowski, J., Willey, D., Williams, S.,
Wilson, R.K., Winter, E., Worley, K.C., Wyman,
D., Yang, S., Yang, S.P., Zdobnov, E.M., Zody,
M.C., and Lander, E.S. 2002. Initial sequencing
and comparative analysis of the mouse genome.
Nature 420:520-562.

High-
Throughput
Sequencing

18.5.25
Current Protocols in Human Genetics Supplement 69
