Genetic Mapping
and Marker Assisted
Selection
Basics, Practice and Benefits
N. Manikanda Boopathi
Plant Molecular Biology &
Bioinformatics
Tamil Nadu Agricultural University
Coimbatore, TN, India
ISBN 978-81-322-0957-7
ISBN 978-81-322-0958-4 (eBook)
DOI 10.1007/978-81-322-0958-4
Springer New Delhi Heidelberg New York Dordrecht London
Library of Congress Control Number: 2012954276
Springer India 2013
This work is subject to copyright. All rights are reserved by the Publisher, whether the whole or
part of the material is concerned, specifically the rights of translation, reprinting, reuse of
illustrations, recitation, broadcasting, reproduction on microfilms or in any other physical way,
and transmission or information storage and retrieval, electronic adaptation, computer software,
or by similar or dissimilar methodology now known or hereafter developed. Exempted from this
legal reservation are brief excerpts in connection with reviews or scholarly analysis or material
supplied specifically for the purpose of being entered and executed on a computer system, for
exclusive use by the purchaser of the work. Duplication of this publication or parts thereof is
permitted only under the provisions of the Copyright Law of the Publisher's location, in its
current version, and permission for use must always be obtained from Springer. Permissions for
use may be obtained through RightsLink at the Copyright Clearance Center. Violations are liable
to prosecution under the respective Copyright Law.
The use of general descriptive names, registered names, trademarks, service marks, etc. in this
publication does not imply, even in the absence of a specific statement, that such names are
exempt from the relevant protective laws and regulations and therefore free for general use.
While the advice and information in this book are believed to be true and accurate at the date of
publication, neither the authors nor the editors nor the publisher can accept any legal responsibility
for any errors or omissions that may be made. The publisher makes no warranty, express or
implied, with respect to the material contained herein.
Printed on acid-free paper
Springer is part of Springer Science+Business Media (www.springer.com)
Preface
N. Manikanda Boopathi
nmboopathi@tnau.ac.in
www.sites.google.com/sites/drnmboopathi
Contents
Phenotyping
Phenotyping Versus QTL Mapping
Need for Precise Phenotyping
Phenotyping for Biotic Stress
Germplasm Characterisation:
Utilising the Underexploited
Resources
N.M. Boopathi, Genetic Mapping and Marker Assisted Selection: Basics, Practice
and Benefits, DOI 10.1007/978-81-322-0958-4_1, Springer India 2013
Early Vigour
Several physiological and biochemical studies have shown that selecting germplasm accessions that show early and vigorous establishment allows stored water to remain available for later developmental stages, when soil moisture becomes progressively exhausted and increasingly limiting.
Flowering Time
Another critical factor that optimises adaptation (and produces better yield) under low water availability is flowering time. A positive association between yield and flowering time across different levels of water availability has been established in almost all crops. Days to 50% flowering can be phenotyped quite easily and effectively under both irrigated control and water-stressed experimental conditions, and it can be used as a valuable trait in drought tolerance breeding programs. Flowering delay (= days to flowering under stress conditions − days to flowering under irrigated control) could serve as a potential additional trait alongside days to 50% flowering.
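The flowering delay defined above is a simple difference, so it can be tabulated directly from field records. The following is a minimal sketch only: the accession names and day counts are hypothetical, and the function name flowering_delay is an assumption introduced for illustration.

```python
# Hypothetical records: days to 50% flowering for each accession
# under irrigated control and under water stress (illustrative values).
days_to_flowering = {
    "Accession_A": {"control": 62, "stress": 70},
    "Accession_B": {"control": 65, "stress": 66},
    "Accession_C": {"control": 58, "stress": 69},
}


def flowering_delay(stress_days, control_days):
    """Flowering delay = days to flowering under stress
    minus days to flowering under irrigated control."""
    return stress_days - control_days


# Compute the delay for every accession.
delays = {
    name: flowering_delay(rec["stress"], rec["control"])
    for name, rec in days_to_flowering.items()
}

# List accessions from the shortest to the longest delay.
for name, delay in sorted(delays.items(), key=lambda item: item[1]):
    print(f"{name}: {delay} days")
```

How the delay should be weighted against days to 50% flowering is a breeding decision that this sketch does not address.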
Chlorophyll Concentration, Leaf Rolling
and Leaf Drying
The traits that have been phenotyped to indirectly estimate photosynthetic potential (a critical element that decides final yield) are chlorophyll concentration, leaf rolling and leaf drying, all of which are interconnected. Total chlorophyll, its individual components and the chlorophyll stability index can be measured under both normal and water-stressed conditions. Similarly, leaf rolling and leaf drying need to be scored by following the standard procedures, essentially around midday.
Grain Yield
The main objective of a drought tolerance breeding program is to develop a variety that produces a higher yield than currently available varieties in the given environment under the types of drought stress that occur most frequently.
in terms of interpreting cause and effect relationships between yield and drought tolerance traits.
Allele Mining
Allele mining refers to the identification of naturally occurring allelic variation at agronomically important genetic loci (otherwise called genes). This can be performed using a variety of approaches, including mutant screening, QTL and AB-QTL analysis, association mapping and genome-wide surveys for signatures of artificial selection (each method is described in detail in subsequent chapters). Though several methods have been described, efficient extraction and exploitation of the adaptive variation and valuable traits present in the germplasm is yet to be achieved. For example, several traditional and improved cultivars from drought-prone areas have some tolerance to reproductive-stage drought stress, but they have rarely been used in molecular breeding programs. A more extensive survey of this germplasm may lead to the identification of new germplasm entries carrying superior alleles for agronomic and economic crop traits. Such unique alleles can be integrated into molecular crop breeding programs that aim to combat pests and diseases; to promote yield, quality or nutritional properties; or to improve abiotic stress tolerance.
Thus, a successful allele mining procedure is highly dependent on the use of diverse germplasm collections, especially those rich in wild species. This is because the majority of allelic variation at the gene(s) of interest is largely assumed to occur in the wild relatives of a crop (i.e. not in the cultivated crop varieties) due to the unavoidable loss of variation during the domestication process. Several efforts have been made to identify useful new alleles present in the wild gene pool of almost all crop plants. Despite those efforts, entire germplasm collections have not yet been efficiently characterised for their novel phenotypes, owing to several challenges including the lack of resources for evaluating huge collections. Alternatively, a core collection of germplasm has been proposed
Box 1.1 Rapid and Inexpensive Strategy for Allele Mining in Rice
Software
Numerous software programs are available for assessing genetic diversity, such as Arlequin, DnaSP, PowerMarker, MEGA2, PAUP, TFPGA, GDA, GENEPOP, NTSYSpc, Structure, GeneStrut, POPGENE, MacClade, PHYLIP, SITES, CLUSTALW and MALIGN. Most of them are freely available on the World Wide Web. Most of the programs perform similar tasks, with the main differences being in the user interface, type of data input and output, and platform. Thus, the choice of program depends largely on individual preference.
[Figure: gel with a ladder and Individuals 1 to 3 scored at a co-dominant marker, Locus A. Scoring by band (alleles A1, A2): Individual 1 = 1,0; Individual 2 = 1,1; Individual 3 = 0,1. Scoring by genotype: Individual 1 = A1A1; Individual 2 = A1A2; Individual 3 = A2A2.]
[Figure: gel with a ladder and Individuals 1 and 2 scored at a dominant marker, Locus A. Genotypes: AA or Aa; aa.]
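[Table: example NTSYSpc input matrix in which the markers SSR1, SSR2, SSR3 and SSR4 are scored as 0 or 1 for each accession; the cell layout was lost in extraction.]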
Note:
First column, first row: type of matrix (1 for a rectangular matrix; 2 for a similarity matrix)
Second column, first row: number of markers scored in this analysis
Third column, first row: number of accessions
Fourth column, first row: presence of missing values (0 if there is no missing value; 1 if there is any missing value)
Fifth column, first row: the value used to denote a missing value (if any)
First column, second row: leave it empty
Second column onwards, second row: marker (or quantitative trait) names, one in each column
First column, third row onwards: names of the accessions down the entire column (it is better to restrict marker names and accession names to eight characters)
Second column, third row onwards: marker score of each accession for the corresponding marker.
2. Save the Excel file as *.txt (text tab delimited
file) and import this file through NTedit.
Interpretation of Results
Yet another critical step in a diversity analysis is to investigate the variation present
in the germplasm, that is, not to visualise
PIC = 1 - \sum_{i=1}^{n} P_i^2

[Figure: gel with a ladder and Individuals 1 to 5 scored for the alleles of marker SSR1.]

Allele    Freq*    Freq2**
SSR1a     2/5      (2/5)^2 = 0.16
SSR1b     1/5      (1/5)^2 = 0.04
SSR1c     2/5      (2/5)^2 = 0.16
SSR1d     1/5      (1/5)^2 = 0.04
Sum                0.40
PIC                0.60

Freq*: frequency of allele = number of individuals having this allele/total number of individuals
Freq2**: (frequency of allele)^2
PIC = 1 - sum
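The worked SSR1 example can be reproduced in a few lines. This is a minimal sketch: the function name pic is an assumption introduced here, and the allele frequencies follow the per-individual definition given in the footnote above (number of individuals carrying the allele divided by the total number of individuals), so they need not sum to one.

```python
from fractions import Fraction


def pic(allele_frequencies):
    """PIC = 1 - (sum of squared allele frequencies)."""
    return 1 - sum(freq ** 2 for freq in allele_frequencies)


# Allele frequencies of marker SSR1 as listed in the worked example
# (five individuals scored).
ssr1 = {
    "SSR1a": Fraction(2, 5),
    "SSR1b": Fraction(1, 5),
    "SSR1c": Fraction(2, 5),
    "SSR1d": Fraction(1, 5),
}

sum_of_squares = sum(freq ** 2 for freq in ssr1.values())
ssr1_pic = pic(ssr1.values())

print(float(sum_of_squares))  # 0.4
print(float(ssr1_pic))        # 0.6
```

Exact fractions are used only to avoid rounding noise; plain floats would serve equally well for real data sets.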
Parental Selection
A successful crop breeding program depends on careful selection of parents that complement each other for the given trait and yield. Thus, choosing parents is one of the most important steps in a breeding program. Although breeders have different approaches to parental selection, all the strategies share a common feature: selected parents should be as diverse as possible at the phenotypic and genotypic levels. At least one locally adapted, popular cultivar is used as one parent to ensure the recovery of a high proportion of progenies with adaptation and quality that are acceptable to farmers and end users. Each parent should complement the weakness of the other parent. For instance, when selecting parents for drought tolerance breeding, it is better to avoid parents that are genetically diverse but highly drought susceptible. In such cases, using an improved modern variety as one of the parents may offer many disease-, insect- and abiotic stress-tolerance genes. Thus, thorough phenotyping and genetic diversity analysis help to identify the most appropriate parental lines for biparental or multiparental crosses to produce new segregating populations (discussed in Chapter 2) suitable for high-resolution genetic map construction and efficient quantitative trait loci (QTL) discovery.
N.M. Boopathi, Genetic Mapping and Marker Assisted Selection: Basics, Practice
and Benefits, DOI 10.1007/978-81-322-0958-4_2, Springer India 2013
Table 2.1 Characteristics of major types of mapping populations used in genetic mapping studies

Development procedure
F2 progenies: Parent (x) Parent, F1, (s), F2
BC progenies: Parent (x) Parent, F1 (x) Parent, BC
DH lines: Parent (x) Parent, F1, anther culture, DH lines
RILs: Parent (x) Parent, F1, (s), SSD, F6 or more, RILs
NILs: Parent (x) Parent, F1 (x) Parent, BC; backcrossing continues with the parent up to BC6, then (s) for two generations, NILs

Inheritance of dominant markers
F2 progenies: 3:1
BC progenies: 1:0a
DH lines: 1:1
RILs: 1:1
NILs: 1:1

Inheritance of co-dominant markers
F2 progenies: 1:2:1
BC progenies: 1:1a
DH lines: 1:1
RILs: 1:1
NILs: 1:1

Merits
Requires less time to be developed
The populations can be further utilised for marker-assisted backcross breeding

Demerits
F2 progenies: Linkage established using an F2 population is based on one cycle of meiosis. F2 populations are of limited use for fine mapping. Quantitative traits cannot be precisely mapped using an F2 population, as each individual is genetically different and cannot be evaluated in replicated trials over locations and years; thus, the effect of the G x E interaction or epistatic interaction on the expression of quantitative traits cannot be precisely estimated. Not a long-term population; impossible to construct an exact replica or increase the seed amount.
BC progenies: They are not immortal. The recombination information in the case of backcrosses is based on only one parent.
DH lines: Recombination from the male side alone is accounted for.
RILs: Require many seasons/generations to develop.
NILs: Require many generations for development. Linkage drag is a potential problem in constructing NILs, which has to be taken care of.

[Rows whose cell values could not be assigned to populations after extraction: number of generations required to make; number of informative gametes per individual; number of recombinant events per gamete; number of possible genotypes per locus. Extracted cell fragments: 1, x, 2, 2x, 68, 9.]
Mapping Population Development
F2 Progenies
Development of F2 progenies is the simplest and most rapid method when compared to other mapping population types. This is the population in which the foundations of the Mendelian laws were first established. Usually, two pure lines that result from natural or artificial inbreeding are selected as parents (Fig. 2.1). Alternatively, two doubled haploid lines can be used as parents to avoid any residual heterozygosity. Crossing such parents produces fertile progenies, which are called the F1 generation. If the parental lines are true homozygotes, all individuals of the F1 generation will have the same genotype and a similar phenotype, as per Mendel's law of uniformity. Each F1 plant is then selfed to produce an F2 population that segregates for the given trait. Thus, the F2 population is the outcome of one meiosis, during which the genetic material is recombined. The expected segregation ratio for each co-dominant marker is 1:2:1.
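Whether an observed marker actually fits the expected 1:2:1 ratio is commonly checked with a chi-square goodness-of-fit test. The text above does not prescribe this test, so the sketch below is an illustrative addition: the genotype counts are hypothetical, the function name chi_square is an assumption, and 5.991 is the standard critical value for two degrees of freedom at the 5% level.

```python
def chi_square(observed, expected_ratio):
    """Chi-square goodness-of-fit statistic for observed class counts
    against an expected segregation ratio such as 1:2:1."""
    total = sum(observed)
    ratio_sum = sum(expected_ratio)
    expected = [total * part / ratio_sum for part in expected_ratio]
    return sum((obs - exp) ** 2 / exp for obs, exp in zip(observed, expected))


# Critical value for 2 degrees of freedom (three classes) at p = 0.05.
CRITICAL_DF2_P05 = 5.991

# Hypothetical F2 counts for one co-dominant marker:
# homozygous parent 1, heterozygous, homozygous parent 2.
observed_counts = [24, 52, 24]

statistic = chi_square(observed_counts, [1, 2, 1])
fits_expected_ratio = statistic < CRITICAL_DF2_P05

print(round(statistic, 2), fits_expected_ratio)
```

Markers that deviate significantly from the expected ratio are usually flagged as showing segregation distortion before map construction.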
[Figure: crossing scheme. A female parent (elite line) and a male parent (donor parent), with genotypes AAbb and aaBB, are crossed to give the F1 hybrid (AaBb). Anther culture of the F1 yields haploids (AB, Ab, aB, ab), and chromosome doubling by colchicine treatment gives doubled haploids (AABB, AAbb, aaBB, aabb). Repeated crossing to the female parent (elite line) gives BC1F1, BC2F1 and so on up to BC4F1, which is selfed to give BC4F2 near isogenic lines (NILs). Selfing of the F1 gives the F2, which is advanced from F3 to F7 by SSD (each plant contributes a single offspring to the next generation).]
Fig. 2.1 Schematic illustration that explains development of commonly used mapping populations in genetic mapping. X refers to crossing, S refers to selfing, SSD to the single seed descent method
F2 Intermating Populations
or Immortalised F2 Populations
Random intermating of F2 populations has been suggested for obtaining precise estimates of recombination frequencies between tightly linked loci. Immortalised F2 populations can be developed by paired crossing of randomly chosen RILs derived from a cross, in all possible combinations excluding reciprocals. The set of RILs used for crossing, along with the F1s produced, provides a true representation of all possible genotype combinations (including the heterozygotes) expected in the F2 of the cross from which the RILs are derived. The RILs can be maintained by selfing, and the required quantity of F1 seed can be produced at will by fresh hybridisation. This population therefore provides an opportunity to map heterotic QTLs and interaction effects from multi-location data.
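The crossing plan for an immortalised F2 population, that is, every pairing of the chosen RILs with reciprocals excluded, can be enumerated programmatically. This is a rough sketch under stated assumptions: the RIL names, the panel size of 20, the sample size of 5 and the fixed random seed are all hypothetical choices made for illustration.

```python
import random
from itertools import combinations


def paired_crosses(ril_names):
    """All pairwise crosses among the given RILs, excluding reciprocals
    (A x B is listed once; B x A is not listed separately)."""
    return list(combinations(ril_names, 2))


# Hypothetical panel of 20 RILs derived from one cross.
all_rils = [f"RIL{number:02d}" for number in range(1, 21)]

# Randomly choose a subset of RILs; the seed only makes the sketch repeatable.
chosen_rils = sorted(random.Random(1).sample(all_rils, 5))

crosses = paired_crosses(chosen_rils)

# n chosen RILs give n * (n - 1) / 2 crosses, i.e. 10 crosses for 5 RILs.
print(len(crosses))
for female, male in crosses:
    print(f"{female} x {male}")
```

The number of crosses grows quadratically with the number of RILs chosen, which in practice limits how many lines can be included.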
DH Lines
Doubled haploid (DH) lines contain two identical sets of chromosomes in their cells. They are completely homozygous, as only one allele is present for every gene. Usually, DH lines are produced from haploid lines. These haploid lines either occur spontaneously (e.g. rapeseed and maize) or can be induced artificially (Fig. 2.1). Haploid plants are usually smaller and less vigorous than diploids and are nearly sterile. Haploids can be induced by culturing immature anthers on special media, and haploid plants can later be regenerated from the haploid cells of the gametophyte. Alternatively, microspore culture can be employed. As a rare event, in some haploid plants, the chromosome number doubles spontaneously, which leads to DH plants. Such lines can also be obtained artificially by colchicine treatment of haploid plants. Colchicine prevents the formation of the spindle apparatus during mitosis and thus inhibits the separation of chromosomes, leading to DH plants. If callus is induced in haploid plants, a doubling of chromosomes often occurs spontaneously during endomitosis, and DH lines can be regenerated via somatic embryogenesis. On the other hand, in vitro culture conditions may decrease the genetic variability of regenerated
BC Progenies
To analyse specific genes or other regulatory DNA elements derived from one parent (i.e. the donor parent) in the background of another parent (i.e. the recurrent (or elite) parent), the hybrid F1 plant is backcrossed to the recurrent parent (Fig. 2.1). Two key features best describe BC progenies: unlinked donor fragments are separated by segregation, and linked donor fragments are minimised by recombination with the recurrent parent. In order to reasonably reduce the number and size of donor fragments, backcrossing is repeated. With each round of backcrossing, the proportion of the donor genome is reduced, on average, by 50%. Sometimes the backcrossing process can be accelerated by the use of recurrent parent-specific markers (referred to as background markers; discussed in detail in Chapter 3). With each round of backcrossing, the number and size of genomic fragments of the donor parent are reduced until a single gene (or other regulatory DNA element) differentiates the BC progeny from the recurrent parent. That particular progeny is later screened for the trait introduced by the donor. If the trait is expressed dominantly, the progeny can be screened directly; recessive expression, on the other hand, requires testing the selfed progeny of each BC progeny. BC progenies that are identical except for a few donor loci are called near isogenic lines (NILs) and are discussed separately (see below). A BC progeny carrying a fragment of genomic DNA from a very distantly related species is called an introgression line, while a BC progeny carrying genetic material from a different variety is called an inter-varietal substitution line. At this point, it should be noted that recombination is reduced in
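The halving of the donor genome with each round of backcrossing can be tabulated as follows. This is a minimal sketch of the expected average only, assuming no selection and an F1 that carries half donor genome; the function name expected_donor_fraction is an assumption, and individual plants will deviate from these averages.

```python
def expected_donor_fraction(n_backcrosses):
    """Expected average fraction of donor genome after n backcrosses.

    The F1 (n = 0) carries 1/2 donor genome, and each backcross to the
    recurrent parent halves that fraction on average.
    """
    if n_backcrosses < 0:
        raise ValueError("number of backcrosses cannot be negative")
    return 0.5 ** (n_backcrosses + 1)


# Expected donor and recurrent-parent genome percentages from F1 to BC6.
for n in range(0, 7):
    generation = "F1" if n == 0 else f"BC{n}"
    donor = expected_donor_fraction(n)
    print(f"{generation}: donor {donor:.2%}, recurrent parent {1 - donor:.2%}")
```

Under these assumptions the expected donor share falls below 1% by BC6, which is in line with the number of backcross generations usually quoted for NIL development.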
RILs
Recombinant inbred lines (RILs) are the homozygous selfed or sib-mated progeny of the individuals of an F2 population (Fig. 2.1). The RIL concept in genetic mapping was originally developed for mouse. Nearly 20 generations of sib mating are required to reach useful levels of homozygosity in animals. However, in plants, RILs with more than 98% homozygosity are produced by selfing within eight or nine generations (unless the species is completely self-incompatible). Self-pollination allows production of RILs in a relatively short period of time. In fact, in some strictly self-pollinating crops, almost complete homozygosity can be reached within six generations. RILs are usually developed by the single-seed descent method, in which one seed of each line is the source for the next generation during the selfing process. The bulk method and the pedigree method without selection can also be used for production of RILs. In RILs, alleles derived from either parent alternate along each chromosome. In each generation, meiotic events lead to further recombination and reduce heterozygosity until completely homozygous RILs with fragments of either parental genome are achieved. Since recombination can no longer change the genetic constitution of such homozygous RILs, further segregation in their progeny is absent. Because of this, RILs are considered a permanent resource that can be replicated indefinitely and shared among many research groups. Another advantage of RILs is that they can be used to construct a higher-resolution genetic map than F2 populations, and hence, the map positions of even tightly linked
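The approach to homozygosity during selfing can be illustrated numerically. This sketch assumes strict self-pollination without selection, and counts only loci at which the two parents differ (so the F1 is heterozygous at all of them); under those assumptions the expected heterozygosity halves in every generation. The function name is hypothetical.

```python
def expected_heterozygosity(generation):
    """Expected fraction of heterozygous loci in generation F_n under
    strict selfing, starting from a fully heterozygous F1."""
    if generation < 1:
        raise ValueError("generation must be 1 (F1) or later")
    return 0.5 ** (generation - 1)


# Expected homozygosity from F2 to F9.
for n in range(2, 10):
    homozygosity = 1 - expected_heterozygosity(n)
    print(f"F{n}: {homozygosity:.2%} homozygous")
```

Under these assumptions the 98% mark is first passed at F7, which agrees broadly with the generation numbers quoted above.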
Multi-Cross Populations
The features of the genetic structure of RILs can be studied using two-, four- and eight-way crosses following either selfing or sib mating. Though eight-way cross RILs have been successfully demonstrated in mouse, they are yet to be demonstrated in major crops. Interestingly, there are several contrasting features between the nested association mapping (NAM) strategy (explained below) and eight-way cross RILs. In maize, which has very low linkage disequilibrium and tremendous genetic
Natural Populations
The main limitations of experimental mapping populations are that they are laborious and time consuming to construct and require great care and effort. The natural variation existing among individuals of one species can also be exploited for genetic mapping. In the case of crops, germplasm entries consisting of different breeding materials and wild species can fulfil this purpose. It has been shown that such natural populations can be used to map complex traits that are influenced by the action of many genes in a quantitative way. However, it is important that such a collection of different germplasm accessions contains the whole range of phenotypes for a given trait. More importantly, the availability of extreme phenotypes of interest is valuable. The basic premise of this idea is that genomic fragments naturally present in a particular genotype are transmitted as non-recombining blocks and that molecular markers can easily follow the inheritance of such blocks. These blocks are called haplotypes, and their existence reveals a state of linkage disequilibrium (LD) among allelic variants of tightly linked genes (explained in detail in Chapter 6). Usually, an association between a marker and a trait exists if one marker allele or haplotype is significantly associated with a particular phenotype when studied in unrelated genotypes (such as a natural population). The main strength of this approach is that it does not require the construction of mapping populations. Particularly for self-pollinating crops, inbred individuals of natural ecotypes are effectively immortal, and phenotyping needs to be performed only once. In addition, natural populations are
Characterisation of Mapping
Populations
Precise genotypic and phenotypic characterisation of the mapping population is vital for the success of any mapping project. Since the molecular genotype of any individual is independent of the environment, it is not influenced by G × E interaction. However, the trait phenotype can be influenced by the environment, particularly in the case of quantitative characters. Therefore, it is important to estimate the trait value precisely by evaluating the genotypes in multi-location testing over seasons and/or years, using immortal mapping populations, to obtain a valid marker-trait association.
Morphological Markers
During the early days of plant breeding, breeders used to cross and select progeny based on certain neutral characteristics, since these easily recognisable characteristics most probably coincided with the expression of agronomically and economically important traits. Those visibly observable characteristics were therefore used to mark or tag the desired (or sometimes undesired) progeny in the population, and they are called phenotypic or morphological markers. Genetically, their function is based on linkage between the genes for the characteristic and the agronomic trait. This concept of using markers in genetics dates back to the nineteenth century. Gregor Mendel used phenotype-based genetic markers in his experiments. It is also interesting to note that phenotype-based genetic markers in Drosophila led to the establishment of the theory of genetic linkage by Alfred Henry Sturtevant at Dr. Morgan's laboratory
(the details of linkage mapping are discussed in
N.M. Boopathi, Genetic Mapping and Marker Assisted Selection: Basics, Practice
and Benefits, DOI 10.1007/978-81-322-0958-4_3, Springer India 2013
Principle
The use of isoenzymes (also called protein markers) in plant breeding is based on the principle that allelic
Electrophoresis
Electrophoresis is presently the most powerful analytical technique available to separate isoenzymes.
Its scope of application has been broadened
tremendously in recent years by simplification of
the apparatus and by the development of synthetic
support media, which have shortened the time of
analysis. The theory underlying electrophoresis is
simple. Direct current is used to separate the individual isoenzymes (electrophoretic mobility) by
taking advantage of the differences in net charge
of each isoenzyme. Changes in electrophoretic
Chromatography
Immunochemistry
Gel Filtration
Gel filtration or molecular sieving is carried out
on various cross-linked dextran polymers (e.g.
Sephadexes) or cross-linked polyacrylamide
polymers (e.g. Bio-Gel P). However, use of dextran
Catalysis
Chromosome Structure
The chromosomes of eukaryotic cells are larger and more complex than those found in prokaryotes, but each unreplicated chromosome nevertheless consists of a single molecule of DNA. Although linear, the DNA molecules in eukaryotic chromosomes are highly folded and condensed; if stretched out, some human chromosomes would be several centimetres long, thousands of times longer than the span of a typical nucleus. To package such a tremendous length of DNA into this small volume, each DNA molecule is coiled again and again and tightly packed around histone proteins, forming the rod-shaped chromosomes. Most of the time, the chromosomes are thin and difficult to observe, but before cell division, they condense further into thick, readily observed structures; it is at this stage that chromosomes are usually studied in genetic mapping.
A functional chromosome has three essential elements: a centromere, a pair of telomeres and origins of replication. The centromere is the attachment point for spindle microtubules, which are the filaments responsible for moving chromosomes during cell division. The centromere appears as a constricted region that
Mitochondrial DNA
In animals and most fungi, the mitochondrial
genome consists of a single, highly coiled, circular DNA molecule (mtDNA). Plant mitochondrial
genomes often exist as a complex collection of
multiple circular DNA molecules. Each mitochondrion contains multiple copies of the mitochondrial genome, and a cell may contain many
mitochondria. Like eubacterial chromosomes,
mtDNA lacks the histone proteins normally associated with eukaryotic nuclear DNA. The guanine-cytosine (GC) content of mtDNA is often
sufficiently different from that of nuclear DNA
that mtDNA can be separated from nuclear DNA
by density gradient centrifugation. Mitochondrial
genomes are small compared with nuclear
genomes and vary greatly in size among different
organisms. Most of this size variation is in
non-coding sequences such as introns and intergenic regions. Flowering plants (angiosperms)
have the largest and most complex mitochondrial
genomes known; their mitochondrial genomes
Chloroplast DNA
Geneticists have long recognised that many
traits associated with chloroplasts exhibit cytoplasmic inheritance, indicating that these traits
are not encoded by nuclear genes. In 1963, chloroplasts were shown to have their own DNA.
Among different plants, the chloroplast genome
ranges in size from 80,000 to 600,000 bp, but
most chloroplast genomes range from 120,000
to 160,000 bp. Chloroplast DNA (cpDNA) is
usually contained on a single, double-stranded
DNA molecule that is circular, is highly coiled
and lacks associated histone proteins. As in
mtDNA, multiple copies of the chloroplast
genome are found in each chloroplast, and there
are multiple organelles per cell; so there are several hundred to several thousand copies of
cpDNA in a typical plant cell.
Molecular Markers
A molecular marker is defined as a particular
segment of DNA that differs among individuals
at the nucleotide level. Molecular markers may or
may not correlate with phenotypic expression of
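[Figure: the polymerase chain reaction. The double-stranded target sequence (5' to 3' and 3' to 5' strands) is denatured at 94-96 °C, primers anneal at 30-55 °C and extension takes place at 72 °C. Copy number grows by exponential amplification over 30-35 cycles: 8 copies after the 2nd cycle, 16 copies after the 3rd cycle and 2^36 copies after the 35th cycle.]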
How It Works?
The PCR reaction requires the following
components:
DNA Template: It is the sample DNA that
contains the target sequence. At the beginning
of the reaction, high temperature is applied to
the original double-stranded DNA molecule
to separate the strands from each other, and
this process is termed as denaturation.
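The exponential nature of PCR amplification can be illustrated with a short calculation. The sketch below is only an idealised model: the function name `pcr_copies` and the `efficiency` parameter are assumptions introduced here for illustration, and real reactions plateau well before the theoretical yield is reached.

```python
def pcr_copies(cycles, initial_copies=1, efficiency=1.0):
    """Theoretical number of target copies after `cycles` rounds of PCR.

    efficiency = 1.0 means every template is duplicated in every cycle
    (perfect doubling); lower values model incomplete amplification.
    """
    if cycles < 0 or initial_copies < 0:
        raise ValueError("cycles and initial_copies must be non-negative")
    if not 0.0 <= efficiency <= 1.0:
        raise ValueError("efficiency must lie between 0 and 1")
    return initial_copies * (1.0 + efficiency) ** cycles


if __name__ == "__main__":
    for n in (1, 2, 3, 30, 35):
        print(n, "cycles:", int(pcr_copies(n)), "copies from one template")
```

With perfect doubling, one template gives 2^n copies after n cycles; starting the count from two templates (for example, counting each strand of the original duplex separately) gives 2^(n+1).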
PCR-Based Techniques
After the invention of polymerase chain reaction
(PCR) technology (Mullis and Faloona 1987;
Box 3.1), a large number of approaches for generating a variety of molecular markers were
described and used in genetic mapping. This is
primarily due to its obvious simplicity and high
probability of success. Further, usage of random
primers overcame the limitation of prior sequence
knowledge for PCR analysis and facilitated the
development of genetic markers for a range of
purposes. PCR-based techniques can further be
subdivided into two subcategories: (1) arbitrarily
primed PCR-based techniques or sequence
nonspecific techniques and (2) sequence-targeted
PCR-based techniques.
Table 3.1 Properties, advantages and limitations of markers used in genetic mapping
Marker classes compared: morphological markers, isozymes, RFLP, RAPD, AFLP, SSR, ISSR, SCAR, CAPS and SNP.
Properties compared: advantages, disadvantages and applications; origin; maximum theoretical number of possible loci in analysis; number of loci; null alleles; degree of polymorphism; transferability; reproducibility; amount of DNA sample required; ease of assay; whether the assay can be automated; equipment, development and assay costs; comparative mapping potential; and genome and QTL-mapping potential.
Origin: isozymes, genic; RFLP, anonymous/genic; RAPD, anonymous; AFLP, anonymous; SSR, anonymous/genic; ISSR, anonymous; SCAR, anonymous/genic; CAPS, anonymous/genic; SNP, anonymous/genic.
[Figure: Schematic description of the RFLP marker: DNA isolated from individuals; restriction digestion and gel electrophoresis; transfer of digested DNA fragments to a membrane (Southern blotting); autoradiography (X-ray film sandwiched to the membrane to detect the radioactive pattern).]
[Figure: Schematic of the RAPD marker in individuals A and B, using a random 10 bp oligonucleotide primer; for simplicity, only 3 loci are described in the genome. A single base change (x) destroys the target sequence for primer binding, and hence this locus will not amplify from individual B. PCR amplification of the target is followed by agarose gel electrophoresis.]
[Figure: Schematic of the AFLP marker showing the EcoRI (GAATTC/CTTAAG) and MseI (TTAA/AATT) recognition sites and primers carrying selective nucleotides.]
Microsatellite-Based Marker
Technique
Microsatellites or short tandem repeats (STR) or
simple sequence repeats (SSR) or sequence-tagged microsatellite sites (STMS) are monotonous repetitions of very short nucleotide motifs
(usually one to five base pairs). They occur as interspersed repetitive elements in all eukaryotic
[Figure: Schematic of the SSR marker. The SSR motif is (AT)5/(TA)5 in individual B and (AT)20/(TA)20 in individual C. The differing number of repeats allows polymorphism to be identified: the PCR product of B, with only 5 motifs, migrates rapidly, whereas the PCR product of C migrates slowly because of its larger size (20 motifs).]
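The idea that SSR alleles differ in repeat number, and therefore in PCR product size, can be sketched in a few lines. This is only an illustration under stated assumptions: the function names, the toy sequences and the 100 bp flanking length are hypothetical, and dedicated SSR-mining tools apply more refined rules.

```python
import re


def find_ssrs(seq, max_motif=5, min_repeats=5):
    """Return (start, motif, repeat_count) for each perfect tandem repeat.

    Motifs of 1..max_motif bases repeated at least min_repeats times are
    reported; the shortest repeating unit is preferred.
    """
    pattern = re.compile(r"([ACGT]{1,%d}?)\1{%d,}" % (max_motif, min_repeats - 1))
    hits = []
    for match in pattern.finditer(seq.upper()):
        motif = match.group(1)
        repeats = len(match.group(0)) // len(motif)
        hits.append((match.start(), motif, repeats))
    return hits


def amplicon_length(flank_bp, motif, repeats):
    """Expected PCR product size: flanking sequence plus the repeat tract."""
    return flank_bp + len(motif) * repeats


if __name__ == "__main__":
    allele_b = "GGC" + "AT" * 5 + "CCG"    # toy allele with 5 repeats
    allele_c = "GGC" + "AT" * 20 + "CCG"   # toy allele with 20 repeats
    print(find_ssrs(allele_b), find_ssrs(allele_c))
    # Hypothetical 100 bp of flanking sequence between the two primers
    print(amplicon_length(100, "AT", 5), amplicon_length(100, "AT", 20))
```

The 30 bp size difference between the two toy alleles is what separates the bands on a gel.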
Organelle Microsatellites
Plant organelle genomes such as chloroplast
DNA and mitochondrial DNA have been increasingly applied to study population genetic structure and phylogenetic relationships in plants. Due
to their uniparental mode of transmission
(Box 3.3), chloroplast and mitochondrial genomes
exhibit different patterns of genetic differentiation compared to nuclear alleles. Thus, for a
comprehensive understanding of plant population differentiation and evolution, three interrelated genomes must be considered.
Chloroplast Microsatellites
Numerous studies have shown that chloroplast
microsatellites consisting of several relatively short
mononucleotide stretches (such as (dA)n
and (dT)n) are ubiquitous and polymorphic.
Chloroplast genome-based markers uncover
genetic discontinuities and distinctiveness among
or between taxa with slight morphological differentiation, which sometimes cannot be revealed
by nuclear DNA markers. The conservation and
homology of sequence in chloroplast genome
makes it possible to compare genes across the
plant kingdom and examine phylogenetic
relationships in taxa that have diverged for
hundreds of thousands to millions of years.
Chloroplast microsatellites are now becoming
firmly established as a high-resolution tool for
Dominant, Co-dominant
and Cytoplasmic or Uniparentally
Inherited Markers
For diploid organisms (organisms harbouring two copies of each chromosome), each individual carries two alleles at a given marker locus, so a marker with alleles A and a has three possible genotypes (AA, Aa and aa). For markers such as RAPD, AFLP and ISSR, however, it is only possible to describe whether the given marker allele (e.g. A) is present or not at the given locus. Therefore, in such cases, one cannot distinguish the heterozygous genotype (Aa) from the homozygous genotype (AA). This genotyping method clearly incurs a loss of information, and such markers are referred to as dominant markers. In contrast, SSRs, RFLPs, etc., are called co-dominant markers since they can distinguish a heterozygote (Aa, which shows two bands, i.e. the bands of AA and aa together) from each of the homozygotes (AA and aa, each of which shows a single band of a different size) (Fig. 3.6).
Dominant markers allow the analysis of many loci per experiment without requiring any prior information on their sequence. For predominantly self-fertilising species, heterozygosity can be disregarded, and allele frequencies can be considered equal to the observed frequencies. In contrast, co-dominant markers allow analysis of only one locus per experiment, and hence the amount of data per assay is usually lower. Nevertheless, they are more informative since the allelic variations of that locus can be distinguished. As a consequence, we can identify the linkage
[Fig. 3.6: Banding patterns of the genotypes AA, Aa and aa]
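The loss of information with dominant markers can be made concrete with a small scoring sketch. The function names and the two-allele model (A and a, with only A producing a band in the dominant assay) are assumptions made purely for illustration.

```python
def codominant_score(genotype):
    """Bands seen with a co-dominant marker: one band per distinct allele."""
    return tuple(sorted(set(genotype)))


def dominant_score(genotype, band_allele="A"):
    """Presence (1) or absence (0) of the band with a dominant marker."""
    return 1 if band_allele in genotype else 0


if __name__ == "__main__":
    for g in ("AA", "Aa", "aa"):
        print(g, "co-dominant:", codominant_score(g), "dominant:", dominant_score(g))
```

The co-dominant score separates all three genotypes, whereas the dominant score gives the same value for AA and Aa.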
Polymorphism Information
Content (PIC)
PIC value is commonly used in genetics as a measure of polymorphism for a marker locus used in linkage analysis. It is the probability that one can identify which marker allele of a parent has been transmitted to the offspring. PIC can be calculated as described in Chap. 1 or using the freely available program CERVUS v2.0. The PIC value for co-dominant markers ranges from 0 to 1.0, and for dominant markers it has a maximum value of 0.5.
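As a rough illustration, PIC can be computed from allele frequencies. The sketch below assumes the widely used formula of Botstein et al. (1980), PIC = 1 - sum(p_i^2) - sum over i<j of 2 p_i^2 p_j^2, for co-dominant markers, and the form 2f(1 - f), with f the frequency of band presence, that is often applied to dominant markers; whether these match the definition given in Chap. 1 is an assumption, and the function names are hypothetical.

```python
from itertools import combinations


def pic_codominant(freqs):
    """PIC from allele frequencies (Botstein et al. 1980 formula, assumed)."""
    if abs(sum(freqs) - 1.0) > 1e-9:
        raise ValueError("allele frequencies must sum to 1")
    homozygosity = sum(p * p for p in freqs)
    correction = sum(2 * (p * p) * (q * q) for p, q in combinations(freqs, 2))
    return 1.0 - homozygosity - correction


def pic_dominant(band_frequency):
    """PIC-like index for a dominant marker, 2f(1 - f); maximum 0.5 at f = 0.5."""
    f = band_frequency
    if not 0.0 <= f <= 1.0:
        raise ValueError("band frequency must lie between 0 and 1")
    return 2.0 * f * (1.0 - f)


if __name__ == "__main__":
    print(pic_codominant([0.25, 0.25, 0.25, 0.25]))
    print(pic_dominant(0.5))
```

A locus with many equally frequent alleles approaches a PIC of 1, while a monomorphic locus has a PIC of 0.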
Sequence-Characterised Amplified
Regions (SCAR)
In order to utilise markers identified by arbitrary marker techniques (such as RAPD, AFLP, ISSR) for map-based cloning and/or efficient marker-assisted selection (MAS), identification of an unambiguous single locus is a must. In addition, the arbitrary marker techniques are sensitive to changes in the reaction conditions. The SCAR marker technique was developed and applied to bridge the gap between the ability to obtain markers linked to a gene of interest in a short time and the use of these markers for map-based cloning approaches and for routine MAS. SCARs are PCR-based markers that represent genomic DNA fragments at genetically defined loci. They are identified by PCR amplification using sequence-specific oligonucleotide primers (Paran and Michelmore 1993). Development of SCARs involves cloning the amplified products of arbitrary marker techniques and then sequencing the two ends of the cloned products. The sequence is thereafter used to design specific primer pairs of 15-30 bp, which amplify single major bands of a size similar to that of the cloned fragment. Polymorphism is either retained as the presence or absence of amplification of the band or can appear as length polymorphisms convert dominant
Sequence-Related Amplified
Polymorphism (SRAP)
The aim of the SRAP technique (Li and Quiros 2001) is the amplification of open reading frames (ORFs). It is based on two-primer PCR amplification. The technique uses primers of arbitrary sequence, which are 17-21 nucleotides in length. It uses pairs of primers with AT- or GC-rich cores to amplify intragenic fragments for polymorphism detection. The primers consist of the following elements: (1) core sequences, which are 13-14 bases long, where the first 10 or 11 bases starting at the 5′ end are sequences of no specific constitution (filler sequences), followed by the sequence CCGG in the forward primer and AATT in the reverse primer, and (2) three selective nucleotides at the 3′ end, which follow the core. The filler sequences of the forward and reverse primers must be different from each other and can be 10 or 11 bases long. For the first five cycles, the annealing temperature is set at 35 °C. The following 35 cycles are run at 50 °C. The amplified DNA fragments are fractionated on denaturing acrylamide gels and detected by autoradiography or silver staining. SRAP combines simplicity, reliability and a moderate throughput ratio and facilitates sequencing of selected bands. SRAP targets coding sequences in the
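The primer layout described for SRAP (filler, then CCGG or AATT, then three selective bases) can be sketched as a small helper. The function name, the filler sequences and the selective bases used below are illustrative assumptions, not a validated primer set.

```python
CORE_TAG = {"forward": "CCGG", "reverse": "AATT"}

# Annealing profile from the text: (number of cycles, temperature in °C)
ANNEALING_PROFILE = [(5, 35), (35, 50)]


def srap_primer(filler, selective, direction):
    """Assemble a SRAP-style primer, written 5' to 3'."""
    filler, selective = filler.upper(), selective.upper()
    if len(filler) not in (10, 11):
        raise ValueError("filler must be 10 or 11 bases long")
    if len(selective) != 3:
        raise ValueError("exactly three selective bases are expected")
    if set(filler + selective) - set("ACGT"):
        raise ValueError("primer may contain only A, C, G and T")
    return filler + CORE_TAG[direction] + selective


if __name__ == "__main__":
    # Illustrative filler and selective sequences (assumptions)
    fwd = srap_primer("TGAGTCCAAA", "ATA", "forward")
    rev = srap_primer("GACTGCGTACG", "AAT", "reverse")
    print(fwd, len(fwd))
    print(rev, len(rev))
    print(sum(cycles for cycles, _ in ANNEALING_PROFILE), "cycles in total")
```

Using different fillers for the forward and reverse primers, as the text requires, is left to the caller in this sketch.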
Single-Strand Conformation
Polymorphism (SSCP)
Single-strand conformation polymorphism is the
mobility shift analysis of single-stranded DNA
sequences on neutral polyacrylamide gel electrophoresis, to detect polymorphisms produced by
differential folding of single-stranded DNA due
to subtle differences in sequence (often a single
base pair) (Orita et al. 1989). In the absence of a
complementary strand, the single strand experiences intra-strand base pairing, resulting in loops
and folds that give it a unique 3-D structure
Fig. 3.9 Schematic representation of development of (a) IRAP and (b) REMAP primers
Retrotransposon-Based Molecular
Markers
In plants with large genomes, retrotransposons
are the major class of repetitive DNA, comprising 40-60% of the genome. Based on their structural organisation and amino acid similarities
among their encoded reverse transcriptases, retrotransposons can be divided into three categories. Long terminal direct repeats (LTRs) flank
two of these categories, and they encode proteins
similar to the retroviruses. These LTR retrotransposons are referred to as the gypsy-like and
copia-like retrotransposons. The third class of
retrotransposons, the LINE1-like or non-LTR
Inter-retrotransposon Amplified
Polymorphism (IRAP) and REtrotransposon-Microsatellite Amplified Polymorphism
(REMAP)
IRAP and REMAP are two amplification-based marker methods which have been developed based on the position of given LTRs within the genome. These two markers were originally developed for the BARE-1 retrotransposon of the genus Hordeum, which is present in the barley genome in numerous copies. The IRAP markers are generated by the proximity of two LTRs using outward-facing primers annealing to LTR target sequences (Fig. 3.9). In REMAP, amplification between LTRs proximal to simple sequence repeats such as constitutive microsatellites produces markers (Fig. 3.9). Both IRAP and REMAP examine polymorphism in retrotransposon insertion sites, IRAP between retrotransposons and REMAP between retrotransposons and microsatellites (SSRs). Retrotransposons can integrate in either orientation into the genome. For head-to-head and tail-to-tail orientations, PCR products can be generated using a single primer from elements sufficiently close to one another. Intervening genomic DNA for elements in head-to-tail orientation is amplified using both 5′ and 3′ LTR primers. The REMAP method relies on one outward-facing LTR primer and a second primer from a microsatellite. Primers were designed to the (GA)n, (CT)n, (CA)n, (CAC)n, (GTG)n and (CAC)n microsatellites and were anchored (all but one) to the microsatellite 3′ terminus by the addition of a single selective base at the 3′ end. In both techniques, polymorphism is detected by the presence or absence of the PCR product. Lack of amplification indicates the absence of the retrotransposon at the particular locus. As these markers are extremely polymorphic, they can prove useful for evaluating intraspecific relationships. Copia-SSR marker
Sequence-Specific Amplification
Polymorphism (S-SAP)
The technique was first used to investigate the location of BARE-1 retrotransposons in the barley genome (Waugh et al. 1997). In principle, it is a simple modification of the standard AFLP protocol. The final amplification is performed with retrotransposon-specific and MseI-adaptor-specific primers. S-SAP has been extensively used to generate markers to study genetic diversity and to prepare linkage maps in several plants.
Retrotransposon-Based Insertion
Polymorphism (RBIP)
The technique was first developed using the PDR1 retrotransposon in the pea (Flavell et al. 1998). It requires the sequence information of the 5′ and 3′ regions flanking the transposon. When a primer specific to the transposon is used together with a primer designed to anneal to the flanking region, they generate a product from template DNA containing the insertion. On the other hand, primers specific to both flanking regions amplify a product if the insertion is absent. Polymorphisms can be identified using standard agarose gel electrophoresis or by hybridisation with a reference PCR fragment. Hybridisation is more useful for automated, high-throughput analysis. RBIP is technically demanding and somewhat costlier than other methods for detecting transposon insertions.
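The two-reaction logic of RBIP can be written as a simple decision rule. The sketch below is an assumption-laden illustration: the function name and the genotype labels are hypothetical, and interpreting a sample that yields both products as heterozygous is an inference from the description above rather than a statement from the source.

```python
def rbip_call(te_flank_product, flank_flank_product):
    """Interpret the two RBIP amplifications for one locus and one sample.

    te_flank_product    -- product from transposon primer + flanking primer
    flank_flank_product -- product from the two flanking primers
    """
    if te_flank_product and flank_flank_product:
        return "heterozygous (occupied and empty sites)"
    if te_flank_product:
        return "insertion present"
    if flank_flank_product:
        return "insertion absent"
    return "no call (amplification failed)"


if __name__ == "__main__":
    for result in ((True, False), (False, True), (True, True), (False, False)):
        print(result, "->", rbip_call(*result))
```

Because both the occupied and the empty site give a positive signal, the assay can be scored as a co-dominant marker under these assumptions.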
Transposable Display (TD)
TD permits the simultaneous detection of many
TEs from high copy number lines. The technique is a modification of the AFLP procedure
where PCR products are derived from primers
anchored in a restriction site (i.e. BfaI or MseI)
and a transposable element rather than in two
restriction sites (van den Broeck et al. 1998).
Individual transposons are identified by a ligation-mediated PCR that starts from within the
Fig. 3.10 Schematic representation of RAD marker development. Steps shown for individuals A and B, which differ by a mutation at the recognition site: physically shear the restriction products; purify the RAD tags; release the RAD tags; label and hybridise to identify or type RAD markers.
cDNA-AFLP
The cDNA-AFLP is a novel RNA fingerprinting
technique to display differentially expressed
genes (Bachem et al. 1996). The methodology
includes digestion of cDNAs by two restriction
enzymes followed by ligation of oligonucleotide
adapters and PCR amplification using primers
complementary to the adapter sequences with
additional selective nucleotides at the 3′ end. The
cDNA-AFLP technique is more stringent and
reproducible than RAP-PCR. In contrast to
hybridisation-based techniques, such as cDNA
microarrays, cDNA-AFLP can distinguish
between highly homologous genes from individual gene families. Further, there is no requirement of any pre-existing sequence information in
cDNA-AFLP; thus, it is valuable as a tool for the
identification of novel process-related genes such
as stress-regulated genes.
cDNA-SSCP
The SSCP analysis of RT-PCR products can
be used to evaluate the expression status (presence and relative quantity) of highly similar
Role of Genomics
Genomics has brought an innovative level of
hope to development of novel types of markers
and unravelling the secrets of complex traits.
Genome and/or gene sequences themselves have
the potential to provide a comprehensive list of
the markers in an organism. Functional genomics
approaches can then be used to generate information about gene function, as well as data on
genetic interactions, not only among and between
gene complexes but also in response to environmental stimuli. At present, microarray technology
(see Box 3.4) is providing the most comprehensive assessment of gene function and variation. Our
ability to view the transcription of the genome is
improving rapidly, and as a result, the potential to
dissect complex traits is also developing. Already,
array technology has been instrumental in identifying groups of co-expressed genes in various
physiological states, including stages of development and disease. Although array technology is
valuable, these data are not conclusive or comprehensive as regards gene function and only provide one more piece (i.e. transcriptional profile) of
the puzzle. The translation of genes into proteins
is another key step in gene action, and it will be
essential to subject protein synthesis, as well as
protein interaction, to the same genome-wide
analysis to understand how genotype can influence
a complex phenotype. In other words, how the
growing collections of data at the DNA, RNA,
protein and metabolite levels can be combined to
dissect complex traits and diseases remains to be
seen. It has been proposed that the power available
through the merger of genetics and genomics
(called genetical genomics or eQTL; discussed
in chapter 7) might lead to further unravelling
of metabolic, regulatory and developmental
Gel Electrophoresis
The term electrophoresis describes the migration of charged particles under the influence of an electric field. Gel electrophoresis is the technique in which molecules are forced across a span of gel, driven by an electrical current. On either end of the gel, there are activated electrodes that provide the driving force. A molecule's properties (especially size, charge (the possession of ionisable groups) and conformation) therefore determine how rapidly an electric field can move the molecule through a gelatinous medium or matrix. For DNA, the important factors are the length and conformation of the molecule; smaller molecules travel farther.
Microarray
Microarrays can be used to find polymorphic SNP or SFP markers. A microarray works by exploiting the ability of a given fluorescently labelled DNA fragment to bind (or hybridise) specifically to the markers (predefined DNA templates) arranged in a regular pattern on a small chip. The colour intensity varies with the strength or degree of binding/hybridisation, and it is used to generate the data. The major advantage of microarrays is that several DNA samples can be analysed in a single experiment and thousands of data points can be generated.
Capillary Electrophoresis
TILLING
Capillary electrophoresis has largely replaced
the use of gel separation techniques due to significant gains in workflow, throughput and ease
of use. Fluorescently labelled DNA fragments
are separated according to molecular weight,
and it can be automated since it does not involve
gel casting. During capillary electrophoresis,
the PCR products or DNA enters the capillary
as a result of electrokinetic injection. A high-voltage charge applied to the buffered sequencing reaction forces the negatively charged
fragments into the capillaries. The extension
products are separated by size based on their
total charge. The electrophoretic mobility of
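[Figure: Timeline of marker development from the classical era to the genomics era, moving from the protein scene to the oligo scene and on to automation: allozymes (1960s), RFLPs (1980s), PCR (1987), SSCPs (1989), RAPD (1990), SCARs (1991), CAPS (1993), AFLP (1995) and cDNA sequencing (ESTs); anonymous markers are distinguished from gene-based markers.]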
Research Problem
This is the key question that needs to be solved
before choosing the right marker technology.
Thus, the first step is to finalise what is the
biological question one wants to answer with
Quality of DNA
RFLP analysis requires large amounts of high-quality DNA. Most PCR-based methods require
only tiny quantities of DNA. In many cases, PCR
is performed only to amplify the original amount
of target DNA. Hence, the marker technology
should also be selected in view of the available facilities and resources.
Discrimination Level
Further, it is also important to decide at what taxonomic level the genetic variation is being measured: within populations, between species or between genera? Is the selected method appropriate for detecting the desired level of variation? SSRs can provide sufficient variation between genera; however, to generate the same degree of variation between species, it is better to use SNPs.
Mode of Inheritance
Questions related to the inheritance of markers in the segregating progenies also need to be addressed before selecting the marker system. Should both homozygotes and heterozygotes be identified? Are co-dominant markers needed (single-locus RFLPs, isozymes, SSRs), or will dominant markers suffice (RAPD, AFLP)? If presence versus absence information is sufficient, then any molecular marker technology can be used; but if information about heterozygotes is needed (e.g. population and diversity structure, knowledge on type of inheritance), then co-dominant markers such as isozymes or microsatellites should be used.
Expertise Required
Techniques involving hybridisation or manual
sequencing are technically demanding, whereas
RAPDs or SSRs (once the primers are available)
are the least demanding techniques. Thus, expertise availability also decides the selection of
marker technology. Further, access to laboratory facilities and equipment, and to manpower with a good grasp of basic laboratory skills, must be considered when choosing the appropriate marker technology.
Costs
In terms of costs, isozymes are the cheapest;
RAPD, RFLP and even AFLP are intermediate;
but sequencing or SNP is still more expensive.
The costs of all types of experiments should be
considered, because lack of reproducibility of
some markers may, in the end, result in higher
costs. For required skills, a visit to another laboratory where the relevant techniques are being
used can provide invaluable information. Of late,
costs for sequencing experiments have
significantly decreased. Many ESTs are already
available for several species. Microarrays, based
on either anonymous genomic characterisation or
gene expression, are becoming common.
Microarray technology is still very demanding,
technically and financially (in terms of equipment and consumables). Before deciding on it,
get acquainted with the techniques, requirements
and outputs. A better option might be to consider
outsourcing of sample analysis. SNPs are being
routinely used in human studies. They are still
Speed
Further, it is necessary to decide how quickly the data are needed and how much time the equipment will allow. PCR-based methods certainly give
fast results when primers are available.
Hybridisation-based methods are slower.
Conventional DNA sequencing is slow, whereas
automated sequencing is faster.
Reproducibility
Yet another critical question is whether robust methods are required. For example, will the
markers be exchanged? Is more than one laboratory involved? Isozymes, RFLPs, SSRs and
sequencing are robust, whereas RAPD is not.
Table 3.3 Expected segregation ratios for different marker systems in different population types
Population type                                                           Dominant markers (e.g. RAPD, AFLP, ISSR)
F2 progenies                                                              3:1 (B_:bb)
Back cross progenies, BC1                                                 1:0 (D_)
Back cross progenies, BC2                                                 1:1 (Ff:ff)
Recombinant inbred lines or double haploid lines or near isogenic lines   1:1 (HH:hh)
\[ \chi^2 = \sum \frac{(O - E)^2}{E} \]
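A goodness-of-fit test of observed marker counts against an expected segregation ratio, based on the (O - E)^2/E term, might be sketched as follows. The function name and the example counts are assumptions for illustration; the 5% critical value of 3.841 for one degree of freedom is the standard tabulated figure.

```python
def chi_square(observed, ratio):
    """Chi-square statistic for observed counts against an expected ratio.

    observed -- counts per class, e.g. [band present, band absent]
    ratio    -- expected ratio in the same order, e.g. [3, 1] for an F2
                scored with a dominant marker
    """
    if len(observed) != len(ratio):
        raise ValueError("observed and ratio must have the same length")
    total = sum(observed)
    expected = [total * r / sum(ratio) for r in ratio]
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


if __name__ == "__main__":
    # Hypothetical F2 data: 70 plants with the band, 30 without
    stat = chi_square([70, 30], [3, 1])
    critical_5pct_1df = 3.841
    verdict = "fits 3:1" if stat < critical_5pct_1df else "deviates from 3:1"
    print(round(stat, 3), verdict)
```

Markers whose statistic exceeds the critical value show segregation distortion and are usually examined before being used for map construction.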
Bibliography
Literature Cited
Bachem CWB, van der Hoeve RS, de Bruijn SM, Vreugdenhil D, Zabeau M, Visser RGF (1996) Visualisation of differential gene expression using a novel method of RNA fingerprinting based on AFLP: analysis of gene expression during potato tuber development. Plant J 9:745-753
Baird NA, Etter PD, Atwood TS, Currey MC, Shiver AL et al (2008) Rapid SNP discovery and genetic mapping using sequenced RAD markers. PLoS One 3(10):e3376. doi:10.1371/journal.pone.0003376
Botstein D, White RL, Skolnick M, Davis RW (1980) Construction of a genetic linkage map in man using restriction fragment length polymorphisms. Am J Hum Genet 32:314-333
Brody JR, Calhoun ES, Gallmeier E, Creavalle TD, Kern SE (2004) Ultra-fast high-resolution agarose electrophoresis of DNA and RNA using low-molarity conductive media. Biotechniques 37(4):598-602
Caetano-Anollés G, Bassam BJ (1993) DNA amplification fingerprinting using arbitrary oligonucleotide primers. Appl Biochem Biotechnol 42:189-200
Chang RY, O'Donoughue LS, Bureau TE (2001) Inter-MITE polymorphisms (IMP): a high throughput transposon-based genome mapping and fingerprinting approach. Theor Appl Genet 102:773-781
Cronn RC, Adams KL (2003) Quantitative analysis of transcript accumulation from genes duplicated by polyploidy using cDNA-SSCP. Biotechniques 34:726-734
Flavell AJ, Knox M, Pearce SR, Ellis THN (1998) Retrotransposon based insertion polymorphisms (RBIP) for high throughput marker analysis. Plant J 16:643-665
He DH, Lin ZX, Zhang XL, Nie YC, Guo XP, Zhang YX, Li W (2007) QTL mapping for economic traits based on a dense genetic map of cotton with PCR-based markers using the interspecific cross of Gossypium hirsutum × Gossypium barbadense. Euphytica 153(1):181-197
Hu J, Vick BA (2003) Target region amplification polymorphism: a novel marker technique for plant genotyping. Plant Mol Biol Rep 21:289-294
Huang J, Sun M (1999) A modified AFLP with fluorescence labelled primers and automated DNA sequencer detection for efficient fingerprinting analysis in plants. Biotechnol Tech 14:277-278
Jordan SA, Humphries P (1994) Single nucleotide polymorphism in exon 2 of the BCP gene on 7q31-q35. Hum Mol Genet 3:1915
Kalendar R, Grob T, Regina M, Suoniemi A, Schulman A (1999) IRAP and REMAP: two new retrotransposon-based DNA fingerprinting techniques. Theor Appl Genet 98:704-711
Komori T, Nitta N (2005) Utilization of CAPS/dCAPS method to convert rice SNPs into PCR-based markers. Breed Sci 55:93-98
Li G, Quiros CF (2001) Sequence-related amplified polymorphism (SRAP), a new marker system based on a simple PCR reaction: its application to mapping and gene tagging in Brassica. Theor Appl Genet 103:455-546
Makino R, Yazyu H, Kishimoto Y, Sekiya T, Hayashi K (1992) F-SSCP: fluorescence-based polymerase chain reaction single-strand conformation polymorphism (PCR-SSCP) analysis. PCR Methods Appl 2:10-13
McCallum CM, Comai L, Greene EA, Henikoff S (2000) Targeted screening for induced mutations. Nat Biotechnol 18:455-457
Michaels SD, Amasino RM (1998) A robust method for detecting single nucleotide changes as polymorphic markers by PCR. Plant J 14:381-385
Mullis KB, Faloona F (1987) Specific synthesis of DNA in vitro via polymerase chain reaction. Methods Enzymol 155:350-355
Orita M, Iwahana H, Kanazawa H, Hayashi K, Sekiya T (1989) Detection of polymorphisms of human DNA by gel electrophoresis as single-strand conformation polymorphism. Proc Natl Acad Sci USA 86:2766-2770
Paran I, Michelmore RW (1993) Development of reliable PCR-based markers linked to downy mildew resistance genes in lettuce. Theor Appl Genet 85:985-999
Schuelke M (2000) An economic method for the fluorescent labelling of PCR fragments. Nat Biotechnol 18:233-234
Schwartz DC, Cantor CR (1984) Separation of yeast chromosome-sized DNAs by pulsed field gradient electrophoresis. Cell 37:67-75
Tanksley SD, McCouch SR (1997) Seed banks and molecular maps: unlocking genetic potential from the wild. Science 277:1063-1066
Tautz D, Renz M (1984) Simple sequences are ubiquitous repetitive components of eukaryotic genomes. Nucleic Acids Res 12(10):4127-4138
van den Broeck D, Maes T, Sauer M, Zethof J, De Keukeleire P, D'Hauw M, Van Montagu M, Gerats T (1998) Transposon Display identifies individual transposable elements in high copy number lines. Plant J 13:121-129
Vos P, Hogers R, Bleeker M, Reijans M, van de Lee T, Hornes M, Frijters A, Pot J, Peleman J, Kuiper M, Zabeau M (1995) AFLP: a new technique for DNA fingerprinting. Nucleic Acids Res 23:4407-4414
Wang X, Zhiyuan F, Sanwen H, Peitian S, Yumei L, Limei Y, Mu Z, Dongyu Q (2000) An extended random primer amplified region (ERPAR) marker linked to a dominant male sterility gene in cabbage (Brassica oleracea var. capitata). Euphytica 112:267-273
Waugh R, McLean K, Flavell AJ, Pearce SR, Kumar A, Thomas WTB, Powell W (1997) Genetic distribution of Bare-1-like retrotransposable elements in the barley genome revealed by sequence-specific amplification polymorphisms (SSAP). Mol Gen Genet 253:687-694
Weining S, Langridge P (1991) Identification and mapping of polymorphisms in cereals based on the polymerase chain reaction. Theor Appl Genet 82:209-216
Weising K, Gardner RC (1999) A set of conserved PCR primers for the analysis of simple sequence repeat polymorphisms in chloroplast genomes of dicotyledonous angiosperms. Genome 42:9-11
Welsh J, McClelland M (1990) Fingerprinting genomes using PCR with arbitrary primers. Nucleic Acids Res 18:7213-7218
Welsh J, Chada K, Dalal SS, Ralph D, Cheng R, McClelland M (1992) Arbitrarily primed PCR fingerprinting of RNA. Nucleic Acids Res 20:4965-4970
Williams JGK, Kubelik AR, Livak KJ, Rafalski JA, Tingey SV (1991) DNA polymorphisms amplified by arbitrary primers are useful as genetic markers. Nucleic Acids Res 18:6531-6535
Wu KS, Jones R, Danneberger L, Scolnik P (1994) Detection of microsatellite polymorphisms without cloning. Nucleic Acids Res 22:3257-3258
Further Readings
Agarwal M, Shrivastava N, Padh H (2008) Advances in molecular marker techniques and their applications in plant sciences. Plant Cell Rep 27:617-631
Eathington SR et al (2007) Molecular markers in a commercial breeding program. Crop Sci 47(S3):S154-S163
Jena KK, Mackill DJ (2008) Molecular markers and their use in marker assisted selection in rice. Crop Sci 48:1266-1277
Lorz H, Wenzel G (2005) Molecular marker systems in plant breeding and crop improvement, Biotechnology in agriculture and forestry 55. Springer, New York
Van Bueren L et al (2010) The role of molecular markers and marker assisted selection in breeding for organic agriculture. Euphytica 175:51-64
N.M. Boopathi, Genetic Mapping and Marker Assisted Selection: Basics, Practice
and Benefits, DOI 10.1007/978-81-322-0958-4_4, Springer India 2013
Basics of Genetic/Linkage Mapping: Mendelian Ratios, Meiosis, Crossing Over and Partial Linkage
[Figure: Stages of meiosis (interphase, prophase I, metaphase I, anaphase I, prophase II and telophase II) for a pair of homologous chromosomes carrying the alleles P/p, Q/q and R/r. A chiasma forms and crossing over occurs, producing recombinant chromatids; the resulting gametes are P Q R, p Q r, P q R and p q r, of which p Q r and P q R are recombinant gametes.]
Basics of Genetic/Linkage Mapping: Mendelian Ratios, Meiosis, Crossing Over and Partial Linkage
Fig. 4.2 The effect of crossover on linked genes. If there is no crossover, the genotypes at telophase II are 2AB:2ab. If a crossover occurs between A and B, the genotypes are 1AB:1aB:1Ab:1ab.
Recombination frequencies
Between miniature wings (m) and vermilion eyes (v) = 3.0%
Between miniature wings (m) and yellow body (y) = 33.7%
Between vermilion eyes (v) and white eyes (w) = 29.4%
Between white eyes (w) and yellow body (y) = 1.3%
Resulting map positions: y = 0, w = 1.3, v = 30.7, m = 33.7
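The way a gene order follows from the four recombination frequencies listed above can be sketched in a few lines of Python. This is only an illustration, assuming the simple rule that the preferred order is the one with the smallest sum of adjacent recombination frequencies; the function names are hypothetical and no mapping software works exactly like this.

```python
from itertools import permutations

# Pairwise recombination frequencies (%) as listed in the text.
rf = {
    frozenset("mv"): 3.0,
    frozenset("my"): 33.7,
    frozenset("vw"): 29.4,
    frozenset("wy"): 1.3,
}

def order_length(order):
    """Sum of adjacent-pair frequencies, or None if a pair was not measured."""
    total = 0.0
    for a, b in zip(order, order[1:]):
        d = rf.get(frozenset((a, b)))
        if d is None:
            return None
        total += d
    return total

def best_order(genes):
    """Return (length, order) for the order with the smallest total length."""
    scored = []
    for perm in permutations(genes):
        if perm[0] > perm[-1]:  # skip the mirror image of an order already seen
            continue
        length = order_length(perm)
        if length is not None:
            scored.append((length, perm))
    return min(scored)

length, order = best_order("ymvw")
print(order, round(length, 1))
```

Under this assumption the shortest order is m, v, w, y (equivalently y, w, v, m), with a total length equal to the directly observed m–y frequency of 33.7%, which agrees with the map positions given above.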
[Figure: physical map and genetic map of the same chromosome compared side by side, each showing the loci chaI, glkI, his4, SUPS3, leu2, centromere, pgkI, pet18, cryI, MAT, thr4, SUP61 and ABTI.]
Mapping Functions
From the above explanation, it is clear that two
genes are said to be linked if they are located on
the same chromosome, on the assumption that different
chromosomes segregate independently during
meiosis. Therefore, for two genes located on different chromosomes, we may assume that their alleles
$$\frac{1}{4}\,\ln\frac{1+2r}{1-2r}.$$
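As a rough illustration of how a map function turns a recombination fraction r into a map distance, the sketch below implements the two functions named in this chapter, in their standard textbook forms. The Kosambi expression is (1/4) ln[(1 + 2r)/(1 - 2r)] Morgans; the Haldane expression, -(1/2) ln(1 - 2r) Morgans, is assumed here from the general literature. The function names are hypothetical.

```python
import math

def _check(r):
    # Map functions are defined only for 0 <= r < 0.5.
    if not 0.0 <= r < 0.5:
        raise ValueError("recombination fraction must satisfy 0 <= r < 0.5")

def haldane_cm(r):
    """Haldane map distance in centimorgans (assumes no interference)."""
    _check(r)
    return -50.0 * math.log(1.0 - 2.0 * r)

def kosambi_cm(r):
    """Kosambi map distance in centimorgans (allows for some interference)."""
    _check(r)
    return 25.0 * math.log((1.0 + 2.0 * r) / (1.0 - 2.0 * r))

for r in (0.01, 0.10, 0.30):
    print(r, round(haldane_cm(r), 2), round(kosambi_cm(r), 2))
```

For small r both functions return values close to 100r cM; the two diverge as r increases, with the Haldane distance always the larger of the two.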
Sources of Error
It is necessary to be aware that genetic map estimation, like any estimation procedure, is prone to
error. Error may arise due to many factors, including missing data, chiasma interference, genotyping error and segregation distortion. Missing data
can lead to an incorrect marker order, particularly
in dense regions of a map. Some scoring failures
are likely to be the results of random processes.
However, there is also an element of systematic
bias, and we often see a particular marker for
which several plants are not scored. In such a
case, we may wish to delete the marker from our
analysis. For less systematic cases, we may wish
to infer missing values through some computational method. In the presence of chiasma interference, the Haldane map function is not valid,
since it assumes no interference has taken place.
However, many map functions account for chiasma interference in varying degrees. For example, the Rao map function is a versatile function
that accounts for interference along a sliding
scale. Although the Rao map function is not
widely implemented in software tools (see
Box 4.3 for a list of software that deals with genetic
mapping), the Kosambi map function, which
accounts for interference, is supported by many
search methods that can facilitate fast and accurate use of objective functions.
The outcome of a mapping experiment depends
on the composition of the sample population. The
larger the mapping population, the more confidence
we have in the estimates of recombination frequencies and map distances. For most purposes,
populations of 80–400 individuals are used.
Remember that the population type also influences
the standard errors of the estimates. It is good to
realise that, for example, an experiment with
100 RILs will give a (slightly) different map
from one based on a sample of an F2 population, because each
sample has its own best map. Although the variation between these
maps with respect to marker order may be nil, the
resulting total map length and the inter-marker distances are quite variable. This demonstrates that
an ultimate 'true' linkage map does not exist.
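The effect of population size on the confidence in a recombination frequency can be illustrated with a small calculation. The sketch below assumes the simplest case, in which each of n individuals is scored as recombinant or non-recombinant (as in a backcross), so that the estimate follows a binomial distribution; other population types need different formulas, and the function name is hypothetical.

```python
import math

def rf_standard_error(r, n):
    """Binomial standard error of a recombination fraction r scored on n individuals."""
    if not 0.0 <= r <= 0.5:
        raise ValueError("r must lie between 0 and 0.5")
    if n <= 0:
        raise ValueError("n must be positive")
    return math.sqrt(r * (1.0 - r) / n)

# Standard error of r = 0.10 over the population sizes mentioned in the text.
for n in (80, 100, 200, 400):
    print(n, round(rf_standard_error(0.10, n), 4))
```

Under this assumption, quadrupling the population size halves the standard error.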
Chromosomal Assignment
Once the linkage groups are identified and refined
from the data sets, the next step is assigning chromosome number to each linkage group. It is
usually done with the help of cytogenetic stocks.
Nullisomic/disomic/trisomic lines are used to
identify which chromosome of the given species
contains the markers that constitute the given linkage group. Assignment of markers to specific chromosomes can also be accomplished through PCR
using template DNA from each of the nullisomic
lines (or disomic or trisomic or tetrasomic lines
depending on the availability) in the given species.
It is also possible to assign the chromosome using
microisolated translocation chromosomes as a template in the PCR with the primer of the given
marker. Alternatively, deletion mapping using
structural aberrations of specific chromosomes can
also be employed in this context. In many species,
the chromosomes are designated in sequential
order based on their relative sizes. Recently, assignment of markers to the individual chromosomes or
chromosome arms is being extensively undertaken
with the help of fluorescent in situ hybridization
(FISH). Further, such FISH analysis helps in
comparing physical and genetic maps and
identifying introduced chromosomal segments
among related species. In polyploid species, it is
f2 backcross
f3 self
ri self
ri sib
The second line of the raw file should contain a list of three numbers separated by spaces,
such as
46 362 2
The first of these values indicates the number
of progeny for which data are included in the
46 362 2 case
If you do not wish to use case-sensitive
genotypes, do not include the word case.
To specify the coding scheme itself, include
on the end of the above line the word symbols
followed by the coding scheme you wish to
use, defined in terms of the coding scheme
above. For example, if you wish to use the
following scheme with an RI data set,
1 Homozygote for parental genotype a
2 Homozygote for parental genotype b
0 Missing data for the individual (or line)
at this locus
then you would use a second line like
46 362 2 symbols 1=A 2=B 0=-
Note that when interpreting this line,
MAPMAKER is in fact quite finicky about
spaces and case distinctions (in order to keep
MAPMAKER from ever misunderstanding
exactly what you mean). In particular, NO
SPACES should surround the = signs.
To use with a backcross data set the scheme
a Homozygote for parental genotype a
A Heterozygote
- Missing data for the individual (or line)
at this locus
you should use a line like
46 362 2 case symbols a=A A=H
The main restriction on coding schemes is
that the only allowed symbols are letters,
numbers and the characters - and +.
After the first two header lines, the raw file
should then present the genetic locus data in
the following simple format: For each locus,
you list (1) the name of the locus, preceded by
an asterisk (*); (2) one or more spaces
(or tabs etc.); and (3) the genotypic data for
all individuals, in order. For example
*locus1 BA-HHHAAABBB-HHAA
would provide data for a locus named locus1
with individual #1 having the B genotype,
individual #2 having the A genotype and so
forth. Data for each new locus should begin on
a new line (with blank lines allowed), although
the genetic data for any one locus may be
broken by any number of spaces, tabs and
[Example raw data for 200 SSR loci scored on 500 RILs; each locus line continues up to the 500th RIL]
*ssr2 B B - A B A B ...
.
.
.
*ssr200 A - A A B B B ...
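The locus-line format described above (an asterisk, the locus name, white space and then the genotype codes, which may themselves be broken by spaces or tabs) can be checked before the file is loaded. The following is a minimal sketch of such a check; it is not part of MAPMAKER, and the function name is hypothetical.

```python
def parse_locus_line(line):
    """Split a raw-file locus line into (locus name, list of genotype codes)."""
    line = line.strip()
    if not line.startswith("*"):
        raise ValueError("a locus line must begin with an asterisk")
    parts = line[1:].split()
    if not parts:
        raise ValueError("locus name is missing")
    name = parts[0]
    # Genotype data may be broken by spaces or tabs, so join the pieces
    # and read one character per individual.
    genotypes = list("".join(parts[1:]))
    return name, genotypes

name, genotypes = parse_locus_line("*locus1 BA-HHHAAABBB-HHAA")
print(name, len(genotypes), genotypes[:3])
```

Comparing the number of genotype codes returned for each locus with the number of progeny declared in the second header line is a quick way to catch rows that are too short or too long.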
Once the data file has been prepared in Office Excel by the above-said procedure, save it as a
*.txt (text, tab delimited) file.
Open the folder containing this
*.txt file and change the file extension to *.raw
using folder options.
Important notes:
1. The * stands for a file name of your choice. For example, the file name for the
above-said data is RIL.
2. If you cannot see the file extension for
the specified file name, open the folder
options, click the View tab and untick
the option Hide extensions for known
file types. The file extension then becomes
visible in the folder; change the file
extension alone (i.e. RIL.txt is to be
changed to RIL.raw).
Running Mapmaker
Precisely how you should start MAPMAKER
depends on your computer. It should be noted
that MAPMAKER downloaded from
http://www.broad.mit.edu/ftp/distribution/software/mapmaker3/
can be installed only on Windows
XP or earlier operating systems. It is not
supported by later operating systems
such as Windows Vista and Windows 7. Open
the mapmaker folder and double-click the
mapmaker icon to get to the command
prompt.
When MAPMAKER starts running, you
will first see its start-up banner and a prompt
1> for the first command.
Commands that should be typed into
MAPMAKER are shown in the procedure below
in bold italics, while MAPMAKER
output is presented in regular type.
The first step in almost every MAPMAKER
session is to load a data file for analysis. If you
are starting out an analysis on a new data set,
analysed for their fitness into a single linkage group. For example, if SSR1 to SSR5
belong to chromosome 1, then the command
to be used is
3 > sequence 1 2 3 4 5
However, there are 200 markers in this
data file, and suppose we don't know the
chromosomal position of each marker. In that
case, the data set is too large to work
with at once, since evaluating all possible orders
of all these markers would take a long
time. The next step is to instruct the program
to divide the markers in the sequence into
linkage groups; for this, type MAPMAKER's
group command. To determine whether any
two markers are linked, MAPMAKER calculates the maximum likelihood distance and
the corresponding LOD score between the two
markers: if the LOD score is greater than
some threshold, and if the distance is less
than some other threshold, then the markers
are considered linked. By default, the
LOD threshold is 3.0, and the distance threshold is 80 Haldane cM. For the purpose of
finding linkage groups, MAPMAKER considers linkage transitive. That is, if marker A
is linked to marker B, and if B is linked to C,
then A, B and C will be included in the same
linkage group. Using the above-said 200-marker
data set here would be too complicated, so the
example below uses a simple data set
containing 13 markers. As you
can see, MAPMAKER has divided this 13-marker
data set into two linkage groups,
which it names group1 and group2, and a
list of unlinked markers (if there are no
unlinked markers in the given data set, this
list will not appear).
4 > group
Linkage groups at min LOD 3.00, max distance 80.0
group1 = 1 2 3 5 7
group2 = 4 6 8 9 10 11 12
unlinked 13
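The grouping rule described above (two markers are linked when their LOD score exceeds one threshold and their distance is below another, and linkage is treated as transitive) can be sketched with a union-find structure. This is only an illustration of the rule, not MAPMAKER's own code; the function name, the marker names and the pairwise values in the usage example are all made up.

```python
def linkage_groups(markers, pairs, min_lod=3.0, max_dist=80.0):
    """Group markers transitively.

    pairs maps (marker_a, marker_b) to (LOD score, distance in cM).
    """
    parent = {m: m for m in markers}

    def find(m):
        # Follow parent links to the group representative, compressing the path.
        while parent[m] != m:
            parent[m] = parent[parent[m]]
            m = parent[m]
        return m

    for (a, b), (lod, dist) in pairs.items():
        if lod > min_lod and dist < max_dist:
            parent[find(a)] = find(b)  # merge the two groups

    groups = {}
    for m in markers:
        groups.setdefault(find(m), []).append(m)
    return sorted(groups.values())

# Hypothetical pairwise results: A-B and B-C pass both thresholds,
# A-C and C-D do not.
example_pairs = {
    ("A", "B"): (5.0, 20.0),
    ("B", "C"): (4.0, 30.0),
    ("A", "C"): (1.0, 90.0),
    ("C", "D"): (2.0, 40.0),
}
print(linkage_groups(["A", "B", "C", "D"], example_pairs))
```

In this made-up example A and C end up in the same group even though their own LOD score is below the threshold, because both are linked to B; D remains unlinked.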
Best 20 orders:
1: 1 3 2 5 7 Like: 0.00
2: 3 1 2 5 7 Like: -6.00
3: 5 7 2 3 1 Like: -20.20
4: 5 7 2 1 3 Like: -26.26
5: 2 5 7 3 1 Like: -27.25
6: 2 5 7 1 3 Like: -28.39
7: 2 3 1 5 7 Like: -28.85
8: 5 2 3 1 7 Like: -32.33
9: 2 1 3 5 7 Like: -34.12
10: 5 7 1 3 2 Like: -35.55
11: 5 2 1 3 7 Like: -37.61
12: 1 3 5 2 7 Like: -37.76
13: 3 1 5 2 7 Like: -39.09
14: 5 7 3 1 2 Like: -40.38
15: 1 3 5 7 2 Like: -40.87
16: 3 1 5 7 2 Like: -41.55
17: 5 2 7 3 1 Like: -43.67
18: 5 2 7 1 3 Like: -44.78
19: 5 1 3 2 7 Like: -47.63
20: 2 5 3 1 7 Like: -52.28
order1 is set
Note that while MAPMAKER examines
all 5!/2 = 60 possible orders, by default only the
20 most likely ones are reported. For each of
these 20 orders, MAPMAKER displays the
log-likelihood of that order relative to the
best likelihood found. Thus, the best order 1
3 2 5 7 is indicated as having a relative log-likelihood of 0.0. The second best order 3 1
2 5 7 is significantly less likely than the best,
having a relative log-likelihood of -6.0. In
other words, the best order of this group is
supported by an odds ratio of roughly
1,000,000:1 (10 to the 6th power to one) over
any other order. We consider this good evidence that the first order is the
right order.
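The two figures used in this explanation, the number of distinct orders of n markers and the odds implied by a relative log-likelihood, follow from simple arithmetic, sketched below. The sketch assumes, as the text does, that the relative log-likelihood is a base-10 logarithm; the function names are hypothetical.

```python
import math

def number_of_orders(n_markers):
    """Distinct marker orders: n!/2, since an order and its reverse are the same map."""
    return math.factorial(n_markers) // 2

def odds_against(relative_log_likelihood):
    """Odds in favour of the best order over an order with this relative log10 likelihood."""
    return 10 ** (-relative_log_likelihood)

print(number_of_orders(5))   # orders examined for a five-marker group
print(odds_against(-6.0))    # odds of the best order over the second best
```

The rapid growth of n!/2 is the reason why ordering all markers of a large data set at once is impractical: seven markers already give 2,520 orders.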
8 9 10 11 12
Note that the above two tests could have
been performed automatically using
MAPMAKER's suggest subset command.
9 > sequence 4 6 8 9 10 11 12
sequence #4 = 4 6 8 9 10 11 12
10 > list loci
Num  Name   Genotypes   Linkage Group
4    SSR4   273 codom   group2
6    SSR6   275 codom   group2
8    SSR8   306 codom   group2
9    SSR9   327 codom   group2
10   SSR10  297 codom   group2
11   SSR11  324 codom   group2
12   SSR12  319 codom   group2
11 > lod table
Bottom number is LOD score; top number is centimorgan distance:
        4      6      8      9     10     11
 6   63.1
     3.33
 8   16.8   56.0
    39.06   4.33
 9   56.3   17.8   54.8
     6.77  36.70   7.68
10  106.3   27.7      -   43.3
     0.89  22.51         15.08
11   14.9   74.0    6.3   65.4      -
    43.78   2.20  80.87   5.76
12   28.2   43.1   18.4   24.1   89.1   30.1
    22.24   9.13  39.84  32.39   2.22  23.90
As before (as we did with the smaller linkage group),
we can also change MAPMAKER's sequence
to specify the subset we wish to test and then
type the compare command. This time, the
results are even more conclusive, with order1
more likely than any other. The sequence of
commands to be used here is:
9 > sequence {8 9 10 11 12}
10 > compare
11 > sequence order1
12 > map
Note that this time we specify the sequence using a special shortcut, order1, instead of typing out
the marker order found as order1. This
shows that the markers to be analysed by the sequence command
can be specified in either way.
To determine the map position of the remaining two markers in group2, we will use the following procedure: Starting with the known
order of 5 markers, we will place the other two
(one at a time) into every interval in this order
and then recalculate the maximum likelihood
map of each resulting 6-marker order. In this
analysis, MAPMAKER recalculates all
recombination fractions for all intervals in
each map (not just the ones involving the
newly placed markers). This function is performed by MAPMAKER's try command. In
its output, MAPMAKER again displays the relative log-likelihood of each position for the
inserted markers. A relative log-likelihood
of 0 indicates the best position, while the negative log-likelihoods indicate the odds against
placement in each of the other intervals.
13 > sequence {8 9 10 11 12}
sequence #5 = {8 9 10 11 12}
13 > compare
Best 20 orders:
1: 11 8 12 9 10 Like: 0.00
2: 10 11 8 12 9 Like: -14.57
3: 8 11 12 9 10 Like: -15.23
4: 10 9 11 8 12 Like: -27.20
5: 11 8 12 10 9 Like: -29.97
6: 10 8 11 12 9 Like: -30.14
7: 9 10 11 8 12 Like: -32.23
8: 8 11 10 9 12 Like: -39.80
9: 10 9 8 11 12 Like: -39.91
10: 9 11 8 12 10 Like: -40.05
11: 11 8 10 9 12 Like: -40.25
12: 11 8 9 12 10 Like: -44.73
13: 8 11 12 10 9 Like: -45.21
14: 10 11 8 9 12 Like: -46.57
15: 8 11 9 12 10 Like: -47.46
16: 9 10 8 11 12 Like: -47.94
17: 10 8 11 9 12 Like: -49.61
18: 8 11 10 12 9 Like: -52.71
19: 9 8 11 12 10 Like: -52.74
20: 11 8 10 12 9 Like: -53.07
order1 is set
14 > sequence order1
sequence #6 = order1
15 > try 4 6
           4       6
     ---------------
     |  0.00  -42.68 |
 11  |               |
     |-35.57  -118.6 |
 8   |               |
     |-19.65  -70.19 |
 12  |               |
     |-46.80  -28.09 |
 9   |               |
     |-51.35    0.00 |
 10  |               |
     |-43.40  -21.09 |
     |---------------|
INF  |-44.66  -45.03 |
     ---------------
BEST  -619.33 -612.03
In this case, we see that marker 4 should
preferably be placed before marker 11. The INF row
gives the relative log-likelihood of the marker lying anywhere
else but on this sequence, that is, being unlinked to it. In the above
test, we see that a log-likelihood difference of 44.66 supports linkage between marker 4 and the rest of the
group. We also see that marker 6 strongly prefers to be in between markers 9 and 10. Even
the next most likely position for marker 6 is
more than 10 to the 21.09th power times less
9. MadMapper (http://cgpdb.ucdavis.edu/XLinkage/MadMapper/)
10. THREaD Mapper (http://cbr.jic.ac.uk/dicks/software/threadmapper/index.html)
11. QTL IciMapping (http://www.isbreeding.net/oldweb/download_software_ICIM.aspx)
In practice, it is almost certainly best to
use a mixture of approaches in developing
and refining a map. This is not only because
each one brings something unique to the
analysis but also because we do not know
which approach will succeed best for a new
data set and we do not know enough about
the behaviour of each tool to judge this in
advance. It is strongly believed that map
estimation is an iterative process, in which
researchers should first grasp the global pattern of their data set and then re-evaluate and
revise the grouping and ordering of markers, rather than performing a rigid, linear
three-stage methodology of grouping, ordering and spacing.
Bibliography
Literature Cited
Bateson W, Saunders ER, Punnett R (1905) Experimental studies in the physiology of heredity. Rep Evol Comm R Soc 2:1–55
Bovenhuis H, Meuwissen THE (1996) Detection and mapping of quantitative trait loci. Animal Genetics and Breeding Unit. UNE, Armidale. ISBN 186389 323 7
Bulmer MG (1971) The effect of selection on genetic variability. Am Nat 105:201
Correns C (1913) Selbststerilität und Individualstoffe. Biol Centralbl 33:389–423
Haldane JBS, Smith CAB (1947) A new estimate of the linkage between the genes for colour-blindness and haemophilia in man. Ann Eugen 14:10–31
http://www.ncbi.nlm.nih.gov/bookshelf/br.fcgi?book=genomes
Iwata H, Ninomiya S (2006) AntMap: constructing genetic linkage maps using an ant colony optimization algorithm. Breed Sci 56:371–377
Janssens FA (1909) La théorie de la chiasmatypie. Nouvelle interprétation des cinèses de maturation. Cellule 22:387–411
Kohel RJ, Richmond TR, Lewis CF (1970) Texas Marker 1. Description of genetic standards for G. hirsutum L. Crop Sci 10:670–671
Lander ES, Green P, Abrahamson J, Barlow A, Daly MJ, Lincoln SE, Newburg L (1987) MAPMAKER: an interactive computer package for constructing primary genetic linkage maps of experimental and natural populations. Genomics 1:174–181
Further Readings
Bailey NTJ (1961) Introduction to the mathematical theory of genetic linkage. Oxford University Press, London
Cheema J, Dicks J (2009) Computational approaches and software tools for genetic map estimation in plants. Brief Bioinfo 10(6):595–608
McPeek MS (1996) An introduction to recombination and linkage analysis. http://www.stat.wisc.edu/courses/st992-newton/smmb/files/broman/mcpeek96.pdf
Whitehouse HLK (1973) Towards an understanding of the mechanism of heredity. St. Martin's Press, New York
Wu R, Gallo-Meagher M, Littell RC, Zeng Z (2001) General polyploid model for analyzing gene segregation in outcrossing tetraploid species. Genetics 159:869–882
Phenotyping
N.M. Boopathi, Genetic Mapping and Marker Assisted Selection: Basics, Practice
and Benefits, DOI 10.1007/978-81-322-0958-4_5, Springer India 2013
Heritability of Phenotypes
Collecting accurate phenotypic data that are
relevant to the TPE has always been a major
challenge for the improvement of quantitative
traits. The success of this endeavour is intimately
connected with the heritability of the trait, namely,
the portion of the phenotypic variability accounted
for by additive genetic effects that can be inherited
through sexually propagated generations. Trait
heritability varies according to: (1) the genetic
make-up of the materials under investigation, (2)
the conditions under which the materials are investigated and (3) the accuracy and precision of the
phenotypic data. With only a few notable exceptions, most of the traits determining the performance of crops usually have low (~0.30–0.40)
or, at best, intermediate (~0.40–0.60) heritability.
$$\frac{\sigma_a^{2}}{\sigma_a^{2} + \sigma_e^{2}} \times 100\,\%$$
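The definition of heritability given in this section, the portion of the phenotypic variability accounted for by additive genetic effects, can be turned into a small calculation. The sketch below assumes that the phenotypic variance is simply the sum of an additive genetic variance and an environmental (error) variance; the function name and the variance values are illustrative only.

```python
def heritability_percent(additive_variance, error_variance):
    """Additive genetic variance as a percentage of the total phenotypic variance."""
    if additive_variance < 0 or error_variance < 0:
        raise ValueError("variances cannot be negative")
    total = additive_variance + error_variance
    if total == 0:
        raise ValueError("total phenotypic variance is zero")
    return 100.0 * additive_variance / total

# Illustrative variance components (not taken from any data set).
print(heritability_percent(3.0, 7.0))
```

With these made-up values the trait would fall at the lower end of the low heritability range quoted above, which illustrates why reducing the error variance through careful phenotyping raises heritability.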
Bibliography
Literature Cited
Monneveux P, Ribaut JM (2012) Drought phenotyping in
crops: from theory to practice. CIMMYT/Generation
challenge programme, Mexico. Freely available at:
https://www.integratedbreeding.net/drought-phenotyping-crops-theory-practice
Further Readings
Pask AJD, Pietragalla J, Mullan DM, Reynolds MP (2012)
Physiological breeding II: a field guide to wheat phenotyping. CIMMYT, Mexico
Reynolds MP, Pask AJD, Mullan DM (2012) Physiological
breeding I: interdisciplinary approaches to improve
crop adaptation. CIMMYT, Mexico
Shashidhar HE, Henry A, Hardy B (2012) Methodologies
for drought studies in rice. International Rice Research
Institute, Los Baños
QTL Identification
QTL: A Prelude
Most of the important agronomic traits are quantitatively inherited and are controlled by several
genes (i.e. polygenic). Thus, the nature of quantitative traits is that their expression is controlled
by tens, hundreds or even thousands of quantitative trait loci (QTL), each of which, in general, has only a small effect on the trait. A QTL is a
genomic region that comprises gene(s) which
govern(s) the expression of the quantitative trait.
Since the advent of molecular markers, researchers and breeders have aimed to identify functional
markers (refer to Chapter 3 for different kinds
of markers) associated with these QTL for implementation of marker-assisted selection. Historically, QTL detection started with linkage mapping
in biparental populations (Sax 1923; Thoday 1961;
refer to Chapter 2 for population types).
Identifying a gene or QTL within a plant genome
is like finding the proverbial needle in a haystack.
However, QTL analysis can be used to divide the
haystack into manageable piles and systematically
search them. In simple terms, QTL analysis is
based on the principle of detecting an association
between phenotype and the genotype of markers.
Markers are used to partition the mapping population into different genotypic groups based on the
presence or absence of a particular marker locus
and to determine whether significant differences
exist between groups with respect to the quantitative trait being measured. Thus, statistically a
N.M. Boopathi, Genetic Mapping and Marker Assisted Selection: Basics, Practice
and Benefits, DOI 10.1007/978-81-322-0958-4_6, Springer India 2013
[Table: QTL mapping methods compared by features, principle, advantages, limitations and reference. Recoverable entries: single-marker analysis (one marker is involved at a time to find the QTL-marker association); likelihood approach, regression approach or a combination of the two approaches; QTL location can be identified.]
is designed to map QTL within a single linkage group, and it produces a plot of QTL probability as a function of map distance. This type
of plot seems intuitively more interpretable than
the plot of the likelihood ratio statistic or LOD
score produced by other programs. However, it
seems to be most suited to the analysis of
single chromosomes for which other programs
have indicated the possibility of multiple QTL.
Multimapper is designed to work with QTL
Cartographer as a companion program.
The QTL Cafe is a program being developed in Java to make it available for multiple
computer platforms. It is currently available as
an applet that runs in a Java-enabled World
Wide Web browser.
Epistat is a program for DOS designed
primarily for the detection and analysis of
interactions between QTL. It does not perform
interval mapping and therefore does not require
mapped markers. It is an interactive program,
displaying graphic results in response to single-keystroke commands.
QTL IciMapping: It is an integrated software for building genetic linkage maps and
mapping QTL. The modules are built very
user-friendly and this software is being
updated regularly.
y = b0 + b1x + e,
where y is the phenotypic value of a line, b0 is the population
mean, b1 is the additive effect of the locus on the trait and e is a
residual error term. x is directly related to the genotypic code at
the locus being tested for the line considered: it is -1 (for the female
parent) or 1 (for the donor or male parent).
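The regression model y = b0 + b1x + e can be fitted by ordinary least squares. The sketch below shows the calculation for one marker, using the -1/1 genotype coding given above; the function name and the phenotype values are made up for illustration, and the significance test that a QTL package would add is left out.

```python
def single_marker_regression(x, y):
    """Least-squares estimates (b0, b1) for y = b0 + b1*x + e."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("marker does not segregate in this sample")
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    b1 = sxy / sxx              # additive effect of the marker locus
    b0 = mean_y - b1 * mean_x   # intercept (population mean when classes are balanced)
    return b0, b1

# Hypothetical data: genotype codes (-1 or 1) and phenotypic values of four lines.
codes = [-1, -1, 1, 1]
values = [10.0, 12.0, 20.0, 22.0]
b0, b1 = single_marker_regression(codes, values)
print(b0, b1)
```

With the -1/1 coding, b1 is half the difference between the two marker-class means, which is why it is read as the additive effect.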
[Figure: trait values (550.5, 471.5 and 361.0) plotted against the marker classes AA, Aa and aa.]
Interval Mapping
Lander and Botstein in 1989 developed simple
interval mapping (SIM), which overcomes the
disadvantages of analysis of variance at marker
loci. SIM is currently the most popular approach
for QTL mapping in experimental crosses. This
method makes use of linkage maps and analyses
intervals between adjacent pairs of linked markers along chromosomes simultaneously, instead
of analysing single markers. The use of linked
markers for analysis compensates for recombination between the markers and the QTL and is
considered statistically more powerful compared
to single-point analysis. The intervals that are
defined by ordered pairs of markers are searched
in increments (e.g. 2 cM), and statistical methods
are used to test whether a QTL is likely to be
present at the location within the interval or not.
It is important to realise that interval mapping
statistically tests for a single QTL at each increment across the ordered markers in the genome.
Interval mapping searches through the ordered
genetic markers in a systematic, linear (also
referred to as one-dimensional) fashion, testing
the same null hypothesis at each increment.
[Figure: LOD score plotted against locus position for markers F, G, H and I, with the inter-marker distances indicated; the maximum likelihood QTL position lies between loci G and H.]
[Figure: two linkage groups (1 and 2) showing marker loci (A, B, C, F, H, K, L, N, O, Q, R, T, U, V, W) and the distances between adjacent markers.]
mapping population used for phenotypic evaluation must be available for marker genotyping and
subsequent QTL analysis, which may be difficult
with completely or semi-destructive bioassays
(e.g. screening for resistance to necrotrophic fungal pathogens).
In general terms, the identified QTL may also
be described as major or minor. This definition
is based on the proportion of the phenotypic variation explained by a QTL (based on the R² value):
Major QTL will account for a relatively large
amount (e.g. >10%), and minor QTL will usually
account for <10%. Sometimes, major QTL may
refer to QTL that are stable across environments,
whereas minor QTL may refer to QTL that may
be environmentally sensitive, especially for QTL
that are associated with disease resistance or
drought tolerance. In more formal terms, QTL
are classified as: (1) suggestive, (2) significant
and (3) highly significant. This classification was
mainly proposed to avoid large numbers of false
positive claims and also ensure that real linkage
was not missed. Significant and highly significant
QTL were given significance levels of 5 and
0.1%, respectively, whereas a suggestive QTL is
Individual label  NAU1246  NAU3684  NAU3875  BNL3971  DFF  PH    YLD
BC1-1             2        0        .        2        55   68.7  15.2
BC2-2             0        .        .        0        45   84.2  20.5
BC1-3             2        2        2        2        61   65.7  17.5
BC1-n                                                 58   71.5  22.8
Interval Mapping
To perform interval mapping, select the
option Interval Mapping from the Analysis
menu. Specify the destination directory in which to
save the graphic of the interval mapping results.
Since a QTL analysis involves a large number of statistical tests,
you have to take that into
account when choosing the threshold
value of the likelihood ratio statistic for declaring that you've found a QTL. You can accept
the default value, use one of your own or select
one through permutations (which takes the
longest but produces the most reliable threshold value). The number of permutation tests
can be set to 300–1,000 or more. QTL
Cartographer will automatically calculate the
threshold when you press the Go tab, and the
resulting LOD score will be fixed as the threshold
for interval mapping. As mentioned above, the
threshold value can also be fixed manually in the
appropriate tab in the same
window. Note that the default significance
threshold is an LRT value of 11.5, which
equals an LOD score of 2.5 (refer to the text for
details). Once this threshold value is set,
interval mapping can be performed. The other
parameter you may want to change is the walk
speed. That's the parameter that determines
the interval along the map at which QTL calculations are done. If you have a very dense
map, you can set the interval to be quite small,
and you'll have a much more precise idea of
where any QTL you locate may be, but it will
take the program much longer to do the calculations. If you have no particular walk speed in mind,
leave it at the default of
2 cM.
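The statement that an LRT value of 11.5 equals an LOD score of 2.5 follows from the general relation between the two statistics, LOD = LRT / (2 ln 10), which is approximately LRT / 4.605. A minimal sketch of the conversion, with hypothetical function names:

```python
import math

# The likelihood ratio statistic uses natural logarithms and a factor of 2;
# the LOD score uses base-10 logarithms, hence the constant 2*ln(10).
LRT_PER_LOD = 2.0 * math.log(10.0)

def lrt_to_lod(lrt):
    """Convert a likelihood ratio test statistic to a LOD score."""
    return lrt / LRT_PER_LOD

def lod_to_lrt(lod):
    """Convert a LOD score to a likelihood ratio test statistic."""
    return lod * LRT_PER_LOD

print(round(lrt_to_lod(11.5), 2))
print(round(lod_to_lrt(3.0), 2))
```

The same conversion is useful when thresholds reported by different programs, some in LOD units and some in LRT units, have to be compared.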
The graphics of all the chromosomes can
be obtained by selecting the All Chromos
option from the tab Chrom. Interval mapping for each chromosome can also be carried
out separately by selecting the particular chromosome (First Chrom, Second Chrom and
Statistical Significance
Regardless of the method used to estimate and
locate single or multiple QTL, once the test
statistics are calculated, the likelihood of the
event is assessed. The statistical basis of these
comparisons relies on model assumptions,
the most common of which requires the quantitative trait values to be normally distributed. In
reality, however, the distribution of the trait
values is not normal and needs to be considered
as a mixture of (normal) distributions. Violating
the normality assumption has an impact on the
distribution of the statistic used to test for a QTL,
which makes standard statistical procedures
potentially inaccurate.
One approach to obtaining the distribution
(or behaviour, in the long term) of the test statistic
is to use a computer simulation to produce the
data. Thousands of data sets, taken from the same
statistical model, are simulated and the test statistics calculated. Together, these test statistics show
the behaviour of the test in the long run and,
therefore, represent the statistical distribution of
the particular test statistic. From this distribution,
one chooses the level of statistical significance or
threshold above which results are considered
statistically significant (or valid). This approach
is indeed useful if the model used to simulate the
data is the true model. However, the model rarely
describes the complicated relationships that occur
Permutation Testing
Churchill and Doerge in 1994 proposed permutation testing to obtain empirical distributions for
test statistics. In a permutation test, the phenotypic data are
randomly shuffled over the marker data. Analysis
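The idea of permutation testing can be sketched as follows: the phenotypes are shuffled over the unchanged marker data many times, the largest test statistic over all markers is recorded for each shuffle, and the threshold is read from the upper tail of these maxima. The sketch below is a simplified illustration and not the procedure of any particular program; it assumes two marker classes coded "A" and "B" and uses the absolute difference between class means as a stand-in test statistic. All names and data are hypothetical.

```python
import random

def mean_difference(phenotypes, genotypes):
    """Absolute difference between the trait means of marker classes A and B."""
    a = [p for p, g in zip(phenotypes, genotypes) if g == "A"]
    b = [p for p, g in zip(phenotypes, genotypes) if g == "B"]
    if not a or not b:
        return 0.0
    return abs(sum(a) / len(a) - sum(b) / len(b))

def permutation_threshold(phenotypes, markers, n_perm=1000, alpha=0.05, seed=0):
    """Empirical threshold: roughly the (1 - alpha) quantile of the per-shuffle maximum."""
    rng = random.Random(seed)
    shuffled = list(phenotypes)
    maxima = []
    for _ in range(n_perm):
        rng.shuffle(shuffled)  # breaks any real marker-trait association
        maxima.append(max(mean_difference(shuffled, g) for g in markers))
    maxima.sort()
    index = min(n_perm - 1, int((1.0 - alpha) * n_perm))
    return maxima[index]

# Hypothetical data: six lines scored for one trait and two markers.
trait = [10.0, 12.0, 11.0, 20.0, 22.0, 21.0]
marker_data = [list("AAABBB"), list("ABABAB")]
threshold = permutation_threshold(trait, marker_data, n_perm=200, seed=1)
print(threshold)
```

Taking the maximum over all markers in each shuffle is what makes the resulting threshold genome-wide rather than valid for one marker only.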
Bootstrapping
Bootstrapping, described by Visscher et al. in
1996, is an alternative resampling procedure.
From the original dataset, N individual observations are drawn with replacement. An observation
is a phenotype and its marker type; hence, unlike
in permutation testing, the observed combinations
remain together. Note that some observations may
appear more than once in the bootstrap sample, whereas
others may not appear at all. It has been shown that the confidence
interval is approximated very well with this method, even with
only 200 bootstrap samples. A bootstrap
method is typically used to determine an empirical confidence interval for the QTL location,
assuming that the QTL effect exists. In QTL analysis, usually many markers are tested, often for
multiple traits and in multiple families. The risk
of false positives is very high with so many tests.
If a 5% significance level were used, we
would expect 5% of the tests to give false positives. Therefore, a
more stringent significance level is usually applied
for genome-wide QTL detection. For example,
for 200 tests, we would need a
significance level of 0.05/200 = 0.00025 to keep
the overall chance of a false positive at about 5%. Usually, a
significance level of around 0.1% is applied.
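Both steps described in this section are easy to illustrate: drawing a bootstrap sample in which each phenotype stays with its marker type, and dividing the overall significance level by the number of tests. The sketch below uses hypothetical function names and made-up observations; the QTL analysis that would be repeated on every bootstrap sample is not shown.

```python
import random

def bootstrap_sample(observations, rng):
    """Draw N observations with replacement; each (phenotype, marker type) pair stays intact."""
    n = len(observations)
    return [observations[rng.randrange(n)] for _ in range(n)]

def per_test_level(overall_alpha, n_tests):
    """Per-test significance level that keeps the overall false-positive chance near overall_alpha."""
    return overall_alpha / n_tests

# Hypothetical observations: (phenotype, marker genotype) pairs.
data = [(10.0, "A"), (12.0, "A"), (20.0, "B"), (22.0, "B"), (15.0, "A")]
rng = random.Random(42)
sample = bootstrap_sample(data, rng)
print(sample)
print(per_test_level(0.05, 200))
```

Repeating the draw (for example 200 times, as mentioned above) and recording the estimated QTL position each time gives the empirical distribution from which a confidence interval can be read.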
Meta-QTL Analysis
Since the first publication of a QTL localisation
in tomato using molecular data by Paterson et al.,
in 1988, more and more species and traits have
Genomics-Assisted Breeding
In the last decade, some scientific milestones,
including genome sequencing projects, EST databases and microarray technologies, have enhanced
the understanding of plant genomes and allowed for
the identification of genes responsible for a desired
trait. Besides using random markers derived from
anonymous polymorphic sites in the genome, it has
become possible to generate functional markers;
they are derived from polymorphisms within the
transcribed regions of the genome. Such markers
are completely linked to the desired trait allele and
have also been termed perfect markers. The main
limitation of applying random, non-perfect DNA
markers such as RFLPs, AFLPs or microsatellite
markers is the limited number of detectable polymorphisms, low throughput and high costs of assaying each locus. The development of SNPs allows
higher throughput, but still marker development and
PCR reactions are required. Thus, it was suggested
that marker-assisted breeding and selection will
gradually evolve into genomics-assisted breeding
(the term genomic selection is also used in some
publications). Currently, array mapping, association
mapping and EcoTILLING are often discussed as
methodologies within the context of genomicsassisted breeding and refer chapter 10 for more
details.
Array Mapping
With the completion of the genomic sequences of
several model and crop plants (beginning with Arabidopsis thaliana, the first plant genome to be deciphered), plant
genomics moved on to the era of functional genomics. The mere sequence of a genome is of limited
value in revealing the function of genes. Gene
expression needs to be studied in the next step, and
DNA microarrays have become the main technological approach to expression studies. Microarrays
(also known as biochips, DNA chips and gene
chips) were developed by Schena and co-workers
in 1995. There are several ways in which genes
can be arrayed, the two most common technologies being cDNA arrays and oligonucleotide
arrays. To construct an oligonucleotide array, oligonucleotides are synthesised in situ,
which requires knowledge of sequence data.
cDNA arrays are also applicable to non-model
organisms, as they only require a large cDNA
library and the development of ESTs. ESTs are
end segments of sequences from cDNA clones that
correspond to mRNA, that is, parts of expressed
genes. To construct a cDNA array, several thousand
ESTs are needed. A unique set of these ESTs is
amplified by PCR and used to construct the array.
Irrespective of cDNA arrays or oligonucleotide
arrays, the basic steps are the following: (1) mRNA
from cells or tissues in a sample is extracted, (2)
converted into cDNA and fluorescently labelled,
(3) hybridised with the array by robotically spotting the probe onto a planar surface (often glass
Association Mapping
In plants, most of the QTL analyses have been
conducted in highly structured populations with
known pedigrees (such as F2 or backcross populations). However, in general, such structured
populations have two major limitations. First, the
limited number of recombination events results
in poor resolution for quantitative traits. Second,
only two alleles at any given locus can be studied
simultaneously. In order to increase the resolution of mapping populations, large populations
that have undergone several rounds of random
mating should be created. These rounds of mating increase the potential number of recombination events, and structured populations such as
recombinant inbred lines are potential resources
in this context. Despite these efforts, the resolution for many QTL is still several centimorgans
(cM), corresponding to hundreds of genes.
there is more statistical power to evaluate epistasis. The advantages of association mapping in
terms of resolution, speed and allelic range are
complementary to the strengths of F2-based QTL
mapping, namely, marker efficiency and statistical power. There are two commonly used programs for association mapping: TASSEL (http://www.maizegenetics.net/tassel) and STRUCTURE
(http://pritch.bsd.uchicago.edu/structure.html).
Readers should consult these websites and the accompanying
manuals for the detailed association-mapping
procedures. The free web page http://www.extension.org/pages/62755/association-mapping-and-tasselsoftware-tutorial may also be consulted for further
technical tips.
EcoTILLING
EcoTILLING is based on the methodology of
TILLING (Targeting Induced Local Lesions IN
Genomes), which was developed as a strategy in
reverse genetics (McCallum et al. 2000).
TILLING is a methodology that identifies DNA
polymorphisms regardless of phenotypic consequence, allows the identification of single-base-
Segregation Distortion
The first step in any QTL-mapping experiment
is usually to construct populations that originate from homozygous, inbred parental lines.
The resulting F1 plants will be heterozygous at all markers and QTL that differ between the parents. From the F1 generation, crosses are made (e.g. backcross, F2
intercross and crosses to generate recombinant
inbred lines), and the segregation of markers
and QTL is statistically modelled. In general,
experimenters assume that markers segregate in the expected Mendelian ratios, but if markers are subject to
segregation distortion, it is not possible to
anticipate how the resulting estimates of recombination, and hence any potential QTL locations, will be affected. Two important issues should
be considered when assessing these statistical
results. The first consideration is sample size.
The number of individuals studied provides
information for the estimation of phenotypic
means and variances. A large sample of individuals provides the opportunity to observe
recombination events (and thus to detect
segregation distortion) and to estimate
parameters with greater accuracy, which in turn gives a greater ability to detect QTL.
Missing data and markers with distorted segregation may make it difficult
to decide the order of markers. In particular, markers deviating
significantly from the expected Mendelian segregation ratios and markers with fewer than 100 data
points are usually excluded from the QTL analysis. High
marker density is usually seen as a guarantee of
a high-quality QTL analysis, regardless
of the proportion of dominant versus co-dominant markers or the reliability of the order of
markers. However, an abundance of
dominant markers (RAPDs, AFLPs) may cause
problems in the construction of maps and in the
analysis of QTL by interval mapping procedures.
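A common way to screen markers for segregation distortion is a chi-square goodness-of-fit test against the expected Mendelian ratio. The following Python sketch is illustrative only: the function names and the 5% cut-off are assumptions, and the closed-form p-values cover only one or two degrees of freedom, which is enough for 3:1, 1:1 and 1:2:1 ratios.

```python
import math


def chi_square_stat(observed, expected_ratio):
    """Goodness-of-fit statistic of observed counts against a Mendelian ratio."""
    total = sum(observed)
    ratio_sum = sum(expected_ratio)
    expected = [total * r / ratio_sum for r in expected_ratio]
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


def chi_square_p_value(stat, df):
    """Upper-tail probability of the chi-square distribution for df = 1 or 2."""
    if df == 1:
        return math.erfc(math.sqrt(stat / 2.0))
    if df == 2:
        return math.exp(-stat / 2.0)
    raise ValueError("this sketch only handles 1 or 2 degrees of freedom")


def is_distorted(observed, expected_ratio, alpha=0.05):
    """True when the marker deviates significantly from the expected ratio."""
    stat = chi_square_stat(observed, expected_ratio)
    return chi_square_p_value(stat, len(observed) - 1) < alpha


if __name__ == "__main__":
    # Co-dominant marker in an F2 population, expected 1:2:1 (AA : Aa : aa)
    print(is_distorted([25, 50, 25], [1, 2, 1]))
    print(is_distorted([45, 40, 15], [1, 2, 1]))
    # Dominant marker in an F2 population, expected 3:1 (band present : absent)
    print(is_distorted([75, 25], [3, 1]))
```

When many markers are screened in this way, the per-marker significance level should itself be adjusted for multiple testing.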
Phenotyping
The accuracy of phenotypic evaluation is of the
utmost importance for the accuracy of QTL
mapping (see chapter 5). A reliable QTL map can
only be produced from reliable phenotypic data.
Replicated phenotypic measurements or the use
of clones (via cuttings) can be used to improve
the accuracy of QTL mapping by reducing background noise. Thorough studies should include
phenotypic evaluations that have been conducted
in both field and glasshouse trials, and QTL-mapping studies should be independently
confirmed or verified. Such confirmation studies
(referred to as replication studies) may involve
independent populations constructed from the
same parental genotypes or closely related genotypes used in the primary QTL-mapping study.
Sometimes, larger population sizes may be used.
Furthermore, some recent studies have proposed
that QTL positions and effects should be evaluated in independent populations because QTL
mapping based on typical population sizes results
in a low power of QTL detection and a large bias
of QTL effects. Unfortunately, due to constraints
Statistical Issues
As we discussed, a QTL is a region of any genome
that is responsible for variation in the quantitative
trait of interest. The goal of identifying all such
regions that are associated with a specific complex phenotype might, at first, seem quite simple,
especially with all the genomic and computational tools available to help us. Unfortunately,
the task is difficult because of the sheer number
of QTL, and the possible epistasis or interactions
between QTL, and because of the many additional sources of variation. To combat this, QTL
experiments can be designed with the aim of
containing the sources of variation to a limited
number so that dissection of a complex phenotype might be possible. In general, a large sample
of individuals has to be collected to represent the
total population, to provide an observable number of recombinants and to allow a thorough
assessment of the trait under investigation. This
is the first key step in QTL analysis, and it is
ignored in most of the studies.
Composite interval mapping and multiple
QTL mapping achieve the same result by reducing
the number of potential models under consideration. Both methods extend the ideas of interval
mapping to include additional markers as cofactors (outside a defined window of analysis) for
the purpose of removing the variation that is
associated with other (linked) QTL in the genome.
The limitations of both approaches are that they
are restricted to one-dimensional searches across
the genetic map and are challenged at times by
the multiplicity of epistatic QTL effects. There is
also a risk of putting too many markers in the
model as cofactors, and care should be taken to
preserve the amount of information that is available for estimation of the QTL effect.
the search for different models and their comparisons with the information gained from completing the QTL genotype information. The power in
breaking a problem into two independent parts is
not new as it was dealt with by Jansen in 1993
and lies in the fact that information is gained in
the first part that can be used in the second part.
Once the QTL genotypes are estimated, Sen and
Churchill explore all possible models using an
approach that allows distinct models of different
QTL numbers to be considered. As the QTL
genotypes are calculated independently from the
QTL effect and location, previous issues of
epistasis and linked QTL are eliminated because
the state of the QTL genotype and QTL number
is known before the estimation of their effects
and interactions. Multi-trait QTL mapping can
also benefit from the computational framework
of Sen and Churchill by simply extending from a
single phenotype to multiple correlated phenotypes and by dissecting the problem in a similar
manner. Although the Sen and Churchill view
has been shown to benefit QTL mapping, it might
have an even larger potential for accommodating
other types of problem and data structure
(for details, see Doerge 2002).
The most obvious applications of QTL analysis are MAS in crop breeding and QTL cloning
for transgenic technology. The success (or
efficiency) of both endeavours depends primarily
on the reliability and accuracy of the QTL analysis from which the information was obtained.
Chromosomal QTL regions are quite often large
and can include many open reading frames or
favourable QTL alleles in repulsion. This situation can exacerbate linkage drag, that is, the
introgression into elite germplasm of undesirable
characters that are linked to a desirable QTL, when QTL analysis is applied in plant breeding.
Thus, a principal objective of QTL analysis is
confining QTL to narrow chromosomal regions,
which implies joint consideration of the type of
experimental design or segregating population,
its size, number, informativeness and level of polymorphism of DNA markers and the statistical
methodologies both to build up the linkage map
and to perform the QTL analysis. These are the
methodological features that should be considered
among loci at two-locus, three-locus and higher-order levels often have major effects on adaptability and have a considerable influence on phenotype.
If there is gene interaction, populations can differentiate not only for population means but also
for local average effects. The consequence of this
differentiation is that the local average effects of
alleles change relative to each other so that an
allele favoured by selection in one population
may be removed by selection in other populations.
Using a two-locus genetic model that includes
measures of genetic population differentiation, the
potential role of additive × dominant and dominant × dominant epistasis in reproductive isolation
and inbreeding depression at the QTL level has been shown theoretically. It was
also concluded that the same forces that reduce
the apparent contribution of genetic interactions
to the variance within populations lead to populations differentiating for the local average effects
of alleles. Epistasis between QTL assayed in populations segregating for an entire genome has
been found at a frequency close to that expected
by chance alone. Yet, when RILs, DHs and
isogenic lines are used, epistasis is detected more
frequently. Therefore, QTL mapping may underestimate the number of non-additive interactions
for three reasons. First, advanced backcross
progenies are not useful for detecting
epistatic QTL, since every backcross generation
greatly reduces the number of genotypic combinations as the recurrent parent genotype is
recovered. For example, the frequency of individuals with phenotype AB derived from the two-locus double heterozygote AaBb by self-pollination
will be 9/16, while by backcrossing it will be 1 (backcross to the dominant parent)
or 1/4 (testcross). Second, even large F2 mapping
populations will contain few individuals in the
two-locus double homozygous classes, limiting
the statistical power for detecting non-additive deviations for these genotypes. Finally, searching for
epistatic interactions involves many statistical
tests, so significance thresholds must be made more stringent
accordingly. Unless epistatic interactions contribute largely to the total variance, they will not show
up in F2 populations. Kao et al. (1999) described a
method for simultaneous mapping of multiple
interacting QTL, but owing to computational con-
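The 9/16 and 1/4 frequencies quoted above for the AB phenotype can be reproduced by enumerating gametes. This is a minimal Python sketch that assumes two unlinked loci with complete dominance; the genotype strings and function names are illustrative.

```python
from fractions import Fraction
from itertools import product


def gametes(genotype):
    """Equally frequent gametes of a genotype such as 'AaBb' (unlinked loci assumed)."""
    loci = [genotype[i:i + 2] for i in range(0, len(genotype), 2)]
    return ["".join(alleles) for alleles in product(*loci)]


def dominant_class_frequency(parent1, parent2):
    """Frequency of progeny showing the dominant phenotype at every locus."""
    g1, g2 = gametes(parent1), gametes(parent2)
    hits = 0
    for a, b in product(g1, g2):
        # A locus shows the dominant phenotype if either gamete carries
        # the upper-case (dominant) allele.
        if all(x.isupper() or y.isupper() for x, y in zip(a, b)):
            hits += 1
    return Fraction(hits, len(g1) * len(g2))


if __name__ == "__main__":
    print("selfing AaBb:", dominant_class_frequency("AaBb", "AaBb"))
    print("testcross to aabb:", dominant_class_frequency("AaBb", "aabb"))
    print("backcross to AABB:", dominant_class_frequency("AaBb", "AABB"))
```

With linked loci the gamete frequencies are no longer equal, so the enumeration would have to weight each gamete by the recombination fraction.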
Practical Utility
From a practical point of view, the following
question is often raised: Is the information from a
QTL analysis sufficient for successful MAS
of QTL? Experimental results have been mixed.
response. Schneider et al. (1997) have reported that
MAS improved drought resistance performance by
11% under stress and 8% under non-stress in common beans. A MAS study for malting quality in
barley, based on two QTL, gave contrasting results
(Han et al. 1997). Whereas tandem genotypic and
phenotypic selection proved useful for one quantitative trait locus, a second putative quantitative trait
locus identified in the original mapping population
vanished in the population used for selection. The
proportion of genetic variance explained by the
QTL, individually and together, in the QTL experiment is a first key point. The second key point is
that G × E and epistatic interactions at any quantitative trait locus may be involved in the phenotypic
value. Concerning the first point, it is often difficult
to determine from the literature how much of the
genetic variance is explained by the QTL, either
individually or together, because only the total
phenotypic variance is reported. It is therefore not
possible to decide whether any variation left unexplained is caused by other QTL or the environment.
Taking into account that for QTL alleles of small
effect the magnitude of the bias will be larger than
for QTL alleles of large effect, one should be
especially cautious with QTL of small effect.
Fortunately, in some cases, a small number of QTL
have been reported as contributing to a large proportion of the trait variance. This would explain
why MAS experiments have generally been successful when using the marker information for
introgressing or accumulating QTL alleles of large
effect. At the same time, the purpose of the QTL
analysis is not only MAS but also the genetic
dissection of the quantitative trait. Therefore, all
QTL have to be identified regardless of whether
their effect is large or small, or environmentally
sensitive or not. This task requires information
Bibliography
Literature Cited
Churchill GA, Doerge RW (1994) Empirical threshold
values for quantitative trait mapping. Genetics 138(3):963–971
Comai L, Young K, Till BJ, Reynolds SH, Greene EA,
Codomo CA, Enns LC, Johnson JE, Burtner C, Odden
AR, Henikoff S (2004) Efficient discovery of DNA
polymorphisms in natural populations by Ecotilling.
Plant J 37:778–786
Edwards MD, Stuber CW, Wendel JF (1987)
Molecular marker facilitated investigation of quantitative trait loci in maize. I. Numbers, genomic distribution and types of gene action. Genetics 116:113–125
Etzel C, Guerra R (2002) Meta-analysis of genetic-linkage of quantitative trait loci. Am J Hum Genet
71:56–65
Goffinet B, Gerber S (2000) Quantitative trait loci: a
meta-analysis. Genetics 155:463–473
Guo SW, Thompson EA (1992) Performing the exact test
of Hardy-Weinberg proportion for multiple alleles.
Biometrics 48:361–372
Han F, Ullrich SE, Kleinhofs A, Jones BL, Hayes PM,
Wesenberg DM (1997) Fine structure mapping of the
barley chromosome-1 centromere region containing
malting-quality QTLs. Theor Appl Genet 95:903–910
Hansen M, Kraft T, Ganestam S, Säll T, Nilsson NO
(2001) Linkage disequilibrium mapping of the bolting
gene in sea beet using AFLP markers. Genet Res
77:61–66
Jansen RC (1993) Interval mapping of multiple quantitative trait loci. Genetics 135:205–211
Jansen J, De Jong AG, Van Ooijen JW (2001) Constructing
dense genetic linkage maps. Theor Appl Genet
102:1113–1122
Jiang C, Zeng ZB (1995) Multiple trait analysis of genetic
mapping for quantitative trait loci. Genetics
140(3):1111–1127
Kao C-H et al (1999) Multiple interval mapping for quantitative trait loci. Genetics 152:1203–1216
Further Readings
Asíns MJ (2002) Present and future of quantitative trait
locus analysis in plant breeding. Plant Breed
121:281–291
Broman KW (2001) Review of statistical methods for
QTL mapping in experimental crosses. Lab Anim
30(7):44–52
Devlin B, Risch N (1995) A comparison of linkage disequilibrium measures for fine-scale mapping.
Genomics 29:311–322
Doerge RW (2002) Mapping and analysis of quantitative
trait loci in experimental populations. Nat Rev Genet
3:43–53
Hospital F (2009) Challenges for effective marker-assisted
selection in plants. Genetica 136:303–310
http://www.knowledgebank.irri.org/ricebreedingcourse/bodydefault.htm#QTL_mapping.htm
Jorde LB (2000) Linkage disequilibrium and the search
for complex disease genes. Genome Res 10:1435–1444
Kang MS (2002) Quantitative genetics, genomics, and
plant breeding. In: Papers from the symposium on
quantitative genetics and plant breeding in the 21st
century, Louisiana State University, 26–28 Mar 2001,
CAB International 2002
Kendziorski CM et al (2006) Statistical methods for
expression quantitative trait loci (eQTL) mapping.
Biometrics 62:19–27
McMullen MD et al (2009) Genetic properties of the
maize nested association mapping population. Science
325:737–740
Würschum T (2012) Mapping QTL for agronomic traits in
breeding populations. Theor Appl Genet 125:201–210
Xu Y, Crouch JH (2008) Marker-assisted selection in plant
breeding: from publications to practice. Crop Sci
48:391–407
Fine Mapping
N.M. Boopathi, Genetic Mapping and Marker Assisted Selection: Basics, Practice
and Benefits, DOI 10.1007/978-81-322-0958-4_7, Springer India 2013
such as pulsed field gel electrophoresis, rare-cutting restriction enzymes and Southern blotting
facilities. Large genomic DNA inserts derived
from the given crop genome are usually cloned
into high-capacity vectors such as cosmids, yeast
artificial chromosomes (YAC), bacterial artificial
chromosomes (BAC), bacteriophage P1-derived
artificial chromosomes (PAC) and mammalian
artificial chromosomes (MAC). Using such vectors, insert DNA of 45 to 800 kbp can be cloned.
Such large insert libraries facilitate the development of small insert libraries which will be
sequenced to determine the order of nucleotides
in those small inserts using state-of-the-art automated DNA sequencing technologies (such as
pyrosequencing, massively parallel signature
sequencing (MPSS), polony sequencing and
sequencing with Illumina or SOLiD; see chapter 10).
Then, the sequence reads are ordered and
assembled into contigs, and from this assembly, the
complete physical map of the genome is prepared. Such a physical map can be compared with
the genetic map, and new markers (such as SNPs
and/or INDELs) can be obtained from the physical map for fine or localised mapping of the target
QTL region in the genetic map.
Comparative Mapping
Genetic or physical maps constructed in one species can be compared with those of closely
related species by means of common markers (or common single-gene traits). Such common markers are
referred to as anchored markers. These comparative maps can be used to study genome evolution (how the genome has been rearranged
through time) and to make inferences about gene
organisation, repeated sequences, etc. Further,
map-based cloning (see below) may be easier in
some species than in others, for example, rice (with
a small genome) versus wheat (with a massive
genome). Conservation of the gene order within a
chromosomal segment between different species
is referred to as colinearity, whereas conservation
of the order of genes in DNA fragments that are
bigger than 50 kb is referred to as microlinearity.
Deletion, inversions and duplications are detected
Map-Based Cloning
Successful isolation of genes underlying the target QTL using the information on the QTL map and
the physical map is referred to as map-based cloning.
Map-based cloning involves at least three important steps, although the details may vary depending on
the crop and purpose:
1. Mapping of the target QTL and identification of
more closely linked markers through fine mapping. For preliminary QTL mapping, a population size of 60–150 individuals with 100–200
markers that span the entire genome is sufficient.
However, for fine mapping, it is essential to
increase the population size to >1,000 and to use a larger
number of informative/polymorphic markers.
2. Physical localization of the target QTL on the
physical map using the markers' sequence
information (referred to as chromosome landing). This identifies the genomic fragment
which is flanked by the target markers. The
identified genomic region is then scanned
for putative candidate genes (referred
to as chromosome walking). It is usually done
by screening a large insert genomic library
with the closely linked marker and isolating the
clones that hybridise with the marker. This is
followed by creating new markers (usually
sequences at the end of the clone) and screening the segregating population (often a large population of 1,000–3,000 individuals)
with the new markers. The goal is to find a set
of markers that co-segregate with the gene
under the QTL. Co-segregation means that
whenever one allele of the gene is expressed,
the markers associated with that allele are also
present (i.e. recombination has not occurred
between the gene and the marker). Such
identified genes are called positional candidate genes, that is, genes located in the genomic
region that is likely to host the QTL.
3. Gene identification, characterization and validation: Co-segregation confirms that the genes are
within the two flanking markers. Step 2 usually
finds a large number of putative candidate genes
(which are identified by predicting open reading
frames (ORFs) in the DNA sequence of the
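The need for much larger populations in fine mapping (step 1 above) can be illustrated with a simple calculation. The Python sketch below assumes that each informative meiosis is an independent trial with recombination fraction r between the marker and the gene; the function names and the 95% confidence level are illustrative choices.

```python
import math


def prob_at_least_one_recombinant(r, n_meioses):
    """Chance of seeing one or more recombinants among n informative meioses."""
    return 1.0 - (1.0 - r) ** n_meioses


def min_population_size(r, confidence=0.95):
    """Smallest number of informative meioses giving the requested confidence."""
    return math.ceil(math.log(1.0 - confidence) / math.log(1.0 - r))


if __name__ == "__main__":
    for r in (0.10, 0.01, 0.001):  # roughly 10 cM, 1 cM and 0.1 cM
        print(r, min_population_size(r))
```

Under these assumptions a marker about 1 cM from the gene needs roughly 300 informative meioses to show a single recombinant with 95% confidence, and a marker about 0.1 cM away needs roughly 3,000, which is consistent with the population sizes quoted for fine mapping.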
Validation of QTLs
The markers identified in preliminary genetic
mapping studies are seldom suitable for marker-assisted selection without further testing, validation and additional development. Markers that
are not adequately tested before use in MAS programs may not be reliable for predicting phenotype and will therefore be useless. Generally, the
steps required for the development of markers for
use in MAS include high-resolution mapping,
validation of markers and possibly marker conversion, testing the markers in related germplasm
accessions and testing the genes isolated from the
map-based cloning using transgenic tests. The
procedure of fine mapping and its importance
have been discussed above and the rest is discussed hereunder.
Bibliography
Literature Cited
Monna L, Kitazawa N et al (2002) Positional cloning of
rice semi-dwarfing gene, sd1: rice green revolution
gene encodes a mutant enzyme involved in gibberellin synthesis. DNA Res 9:11–17
Jansen RC, Nap JP (2001) Genetical genomics: the added
value from segregation. Trends Genet 17:388–391
Further Readings
Holloway B, Li B (2010) Expression QTLs: applications
for crop improvement. Mol Breed 26:381–391
Kliebenstein D (2009) Quantitative genomics: analyzing
intraspecific variation using global gene expression polymorphisms or eQTLs. Annu Rev Plant Biol 60:93–114
Paran I, Zamir D (2003) Quantitative traits in plants: beyond the
QTL. Trends Genet 19(6):303–306
Marker-Assisted Selection
Advantages of MAS
MAS can theoretically enhance breeders' selection efficiency because:
1. It can be performed on seedling material,
thus reducing the time required before a
plant's genotype is known. In contrast, many
N.M. Boopathi, Genetic Mapping and Marker Assisted Selection: Basics, Practice
and Benefits, DOI 10.1007/978-81-322-0958-4_8, Springer India 2013
Limitations in MAS
MAS is not universally advantageous and cannot
be applied to all the traits in all the crops. Some
limitations of the technique are briefly discussed
hereunder:
1. MAS may be more expensive than conventional techniques, especially for start-up
expenses and labour costs. In certain situations, conventional breeding methods may be
well suited to meeting the breeding objective. An
important consideration for MAS, often not
reported, is that while markers may be cheaper
to use, there is a large initial cost in their
development.
2. Recombination between the marker and the
gene of interest may occur, leading to false
positives. For example, if the marker and the
gene of interest are separated by 5 cM and
selection is based on the marker pattern, there
is an approximately 5% chance of selecting
the wrong plant. This is based on the general
guideline that across short distances, 1 cM of
genetic distance is approximately equal to 1%
recombination. The breeder will need to
decide the error rate that is acceptable in the
MAS program, keeping in mind that errors are
also usually involved in phenotypic evaluation. To reduce this problem, it may be necessary to use flanking markers on either side
of the QTL of interest to increase the probability that the desired gene is selected.
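The benefit of flanking markers can be quantified with a small Python sketch. It assumes the Haldane mapping function (no crossover interference) and a single gamete from an F1 plant; the function names are illustrative. With one marker, the wrong allele is selected whenever a recombination separates marker and gene; with two flanking markers that both show the desired allele, an error requires a crossover on each side.

```python
import math


def haldane_r(d_cm):
    """Recombination fraction for a map distance in cM (Haldane function)."""
    return 0.5 * (1.0 - math.exp(-2.0 * d_cm / 100.0))


def single_marker_error(d_cm):
    """Chance that the gene allele differs from the selected marker allele."""
    return haldane_r(d_cm)


def flanking_marker_error(d_left_cm, d_right_cm):
    """Chance of a wrong gene allele when both flanking markers are selected."""
    r1, r2 = haldane_r(d_left_cm), haldane_r(d_right_cm)
    double_crossover = r1 * r2               # gene allele lost, markers kept
    no_crossover = (1.0 - r1) * (1.0 - r2)   # gene allele and markers kept
    return double_crossover / (double_crossover + no_crossover)


if __name__ == "__main__":
    print("one marker at 5 cM:", single_marker_error(5.0))
    print("flanking markers at 5 cM each:", flanking_marker_error(5.0, 5.0))
```

Under these assumptions the error rate falls from a little under 5% with a single marker to well below 1% with two flanking markers, at the cost of genotyping a second locus.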
[Figure: flow chart. P1 (S) x P2 (R) gives the F1, which is selfed; individuals having the desired marker allele are identified (lines having S banding and heterozygotes are removed); the marker results are combined with other selection criteria and those individuals are advanced.]
[Table: percentage of recurrent parent genome expected after successive backcross generations: 75.0, 87.5, 93.8, 96.9, 98.4, 99.2]
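These percentages follow from the expectation that each backcross halves the remaining donor genome. A minimal Python sketch, assuming no selection and taking the values as averages over many plants (the function name is illustrative):

```python
def recurrent_parent_percentage(n_backcrosses):
    """Expected recurrent parent genome (%) after n backcrosses (F1 = 0)."""
    return 100.0 * (1.0 - 0.5 ** (n_backcrosses + 1))


if __name__ == "__main__":
    for n in range(1, 7):
        print("BC%d: %.1f%%" % (n, recurrent_parent_percentage(n)))
```

Individual plants vary around these averages, and marker-assisted background selection exploits this variation by choosing the plants that carry more recurrent parent genome than expected.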
[Figure: marker-assisted backcrossing scheme. Recurrent parent x donor parent gives the F1 and then the BC1F1; the plants are grown and genotyped for the chosen markers (foreground, recombinant and background selection); BC1F1 progenies are selected based on recovery of the target QTL and background markers; recurrent parent x selected BC1F1 gives the BC2F1; the same process is continued until the BC3F1, from which selected BC3F1 plants are taken.]
gene even after many generations of backcrossing. This can lead to linkage drag, where
deleterious traits from the donor parent are
inadvertently transferred to the recipient parent along with the target trait. Ensuring the
cleanest transfer of the target trait includes the
following steps: (a) the availability of several
closely linked markers on each side of the
target trait. This is easy for transgenic traits in
crops where a dense set of mapped markers is
available but could be harder to achieve if the
marker-trait linkage is not strong, and especially in the case of quantitative traits where
the region to introgress may be quite large. (b)
Enough plants are screened for the linked
markers at each generation to increase the
chances of recombination close to the target
region. This is done typically in two successive steps: (1) In the BC1 generation, the focus
is on finding the closest possible recombinations on one side of the target trait (besides
ensuring that the proper alleles on the other
side are still present). Enough plants are
selected at this stage to still allow for background selection (see below). (2) In the BC2
generation, the same takes place for the other
side of the target trait. (c) Selfing will then be
needed to fix the introgressed region. That will
be done at the end of the background selection
process, which may take an additional generation. This selection of a very clean introgression can thus be done quickly in two
generations of backcrossing. One caution is
that the size of the final donor region surrounding the introgressed gene will depend on
the intensity of the effort, especially in terms
of number of BC1 and BC2 plants that are
screened. Enough plants need to be screened
not only to find a close recombination at each
step (usually markers that flank the target
QTLs are used as recombinant markers) but
also to have enough plants remaining for a
sufficient background selection.
2. Background selection, in which the breeder
selects for recurrent parent marker alleles in
all genomic regions except the target locus,
and the target locus is also additionally
selected based on phenotype. Background
Marker-Assisted Recurrent
Selection (MARS)
In marker-assisted recurrent selection (MARS),
breeders take advantage of favourable alleles
originating from both parents involved in the
crossing program. QTL alleles affecting the
major traits of interest to the breeders are
identified within breeding populations and accumulated through successive intercrossing using
only genotypic selection. Recombined lines are
then subjected to a final phenotypic screen to
select the best varieties to release. This allows the
generation of progenies with an optimum combination of key alleles from both parents that could
never be obtained by chance recombination alone.
Thus, MARS has a clear breeding objective, as
opposed to QTL discovery conducted in good x
bad crosses. The concept is to identify QTL
effects for polygenic traits (usually minor) that
are specific to that population and to recombine
them via genotypic selection to generate superior
progenies for variety development. To do this,
de novo QTL detection is performed within each
population of interest and the best lines are
recombined to obtain a progeny that performs
[Figure: MARS scheme. Parent 1 x Parent 2 gives the F1, F2, F3 and F3:4; genotyping, phenotyping and QTL analysis; modelling and selection of QTLs to recombine; identification of F3-derived progenies to recombine; genotyping of 8–16 seeds per progeny of F3:6 and selection of the best 8 plants (e.g. A–H) to cross; first recombination cycle: A x B, C x D, E x F and G x H, each giving an F1.]
Mapping-As-You-Go (MAYG)
In 2004, Podlich et al. suggested the Mapping-As-You-Go (MAYG) approach to overcome the
problem of inaccurate estimation of QTLs and
their effects. MAYG is a mapping-MAS strategy
that accounts for the presence of epistasis and
genotype by environment (G × E) interactions.
The effectiveness of the MAYG approach has
been investigated through simulation. In the
MAYG approach, estimates of QTL allele effects
are continually revised by remapping new elite
germplasm generated during cycles of MAS, thus
ensuring that QTL estimates remain relevant to
the current set of germplasm in the breeding
program. It explicitly recognises that alleles of
QTL for complex traits can have different values
as the current breeding material changes with
time. The integration of genetic mapping and
MAS offers two major advantages: (1) the ability to
carry out marker-trait association analysis using
breeding populations directly rather than having
to follow the time-consuming development of genetic
populations and (2) the combination of marker-trait association development and validation. This saves
time, both in the process itself and in the generation of the necessary genetic materials.
Bibliography
Literature Cited
Morris M, Dreher K, Ribaut JM, Khairallah M (2003)
Money matters (II): costs of maize inbred line conversion schemes at CIMMYT using conventional and
marker-assisted selection. Mol Breed 11:235–247
Tanksley SD, Nelson JC (1996) Advanced backcross
QTL analysis: a method for the simultaneous discovery and transfer of valuable QTLs from unadapted
germplasm into elite breeding lines. Theor Appl Genet
92:191–203
Further Readings
Beavis WD (1998) QTL analysis: power, precision, and
accuracy. In: Paterson AH (ed) Molecular dissection of
complex traits. CRC Press, Boca Raton, pp 145–161
Frisch M, Melchinger AE (2001) Marker-assisted backcrossing for introgression of a recessive gene. Crop
Sci 41:1485–1494
Frisch M, Bohn M, Melchinger AE (1999a) Minimum
sample size and optimal positioning of flanking markers in marker-assisted backcrossing for transfer of a
target gene. Crop Sci 39:967–975
Frisch M, Bohn M, Melchinger AE (1999b) Comparison
of selection strategies for marker-assisted backcrossing of a gene. Crop Sci 39:1295–1301
Frisch M et al (2000) PLABSIM: software for simulation
of marker-assisted backcrossing. J Hered 91:86–87
Hospital F (2003) Marker-assisted breeding. In: Newbury
HJ (ed) Plant molecular breeding. Blackwell
Publishing/CRC Press, Oxford/Boca Raton, pp 30–59
Kearsey MJ, Farquhar AGL (1998) QTL analysis in
plants; where are we now? Heredity 80:137–142
Knapp S (1998) Marker-assisted selection as a strategy
for increasing the probability of selecting superior
genotypes. Crop Sci 38:1164–1174
Knight J (2003) Crop improvement: a dying breed. Nature
421:568–570
Morgante M, Salamini F (2003) From plant genomics to
breeding practice. Curr Opin Biotechnol 14:214–219
Neeraja C, Maghirang-Rodriguez R, Pamplona A, Heuer S,
Collard B, Septiningsih E et al (2007) A marker-assisted
backcross approach for developing submergence-tolerant rice cultivars. Theor Appl Genet 115:767–776
Peleman JD, van der Voort JR (2003) Breeding by design.
Trends Plant Sci 8:330–334
Podlich DW, Winkler CR, Cooper M (2004) Mapping as
you go: an effective approach for marker-assisted
selection of complex traits. Crop Sci 44:1560–1571
Ribaut JM, Hoisington D (1998) Marker-assisted selection:
new tools and strategies. Trends Plant Sci 3:236–238
Smith S, Beavis W (1996) Molecular marker assisted
breeding in a company environment. In: Sobral BWS
(ed) The impact of plant molecular genetics.
Birkhäuser, Boston, pp 259–272
Thomas WTB (2003) Prospects for molecular breeding of
barley. Ann Appl Biol 142:1–12
Xu Y (2003) Developing marker-assisted selection strategies
for breeding hybrid rice. Plant Breed Rev 23:73–174
Xu Y, Crouch JH (2008) Marker-assisted selection in plant
breeding: from publications to practice. Crop Sci
48:391–407
Young N (1999) A cautiously optimistic vision for marker-assisted breeding. Mol Breed 5:505–510
Yousef GG, Juvik JA (2001) Comparison of phenotypic
and marker-assisted selection for quantitative traits in
sweet corn. Crop Sci 41:645–655
resistance is clearly dominating among publications since they are mainly controlled by major
genes and detection of such QTLs is more or less
accurate. However, few studies reported the
successful application of MAS for improved
yield, quality traits, abiotic stress tolerance,
variety detection or growth character (see below).
Another important fact among MAS studies is
that the main marker technologies applied are
predominantly microsatellite markers. Though
almost all the publications are results from public
breeding programs, it would be incorrect to
conclude that MAS is mainly conducted in public
breeding programs. What has to be considered is
that publishing is of little or no importance for
private plant breeders, while it is one of the main
aims in public research institutes and at universities. The following section presents success
stories from different crops that employed
MAS; the list is not exhaustive. Owing to space
constraints, only a few examples from each crop are
shown, merely to illustrate that MAS has
been widely employed for the genetic
improvement of crop plants. Please refer to the Further
Readings for more examples.
Tomato
This was the first crop in which both QTL
mapping and MAS were demonstrated.
Tanksley et al. (1981) first demonstrated
MAS-based selection on metric characters using isozyme markers in early generations
N.M. Boopathi, Genetic Mapping and Marker Assisted Selection: Basics, Practice
and Benefits, DOI 10.1007/978-81-322-0958-4_9, Springer India 2013
of tomato lines. Lecomte et al. (2004) introgressed five QTLs controlling fruit quality in
tomato from a parental line into three improved
lines through a marker-assisted backcross program.
Maize
This was the second crop in which isozyme markers were
shown to be useful for genetic improvement, in this
case of yield (Stuber 1982). In another study, Yousef and Juvik
(2002) showed that QTLs identified in a mapping
population can exert the same effects in
different genetic backgrounds and across two
environments. By introgressing three marker QTL
alleles associated with enhanced seedling emergence into elite lines using marker-assisted
backcrossing, this trait was successfully enhanced
in sweet corn. The AB-QTL method, which can be
used for the simultaneous identification and transfer of favourable QTL alleles, has been used
successfully to improve yield in elite maize lines (Ho
et al. 2002), and Bouchez et al. (2002) successfully introgressed favourable QTLs for grain
yield into maize elite lines. As abiotic stress resistance is a complex trait, only a few successful MAS
applications in breeding for such traits have been
published. One example is a marker-assisted backcross experiment conducted at
CIMMYT to improve grain yield in tropical maize
under water-limited conditions (Ribaut and Ragot
2006). Other important examples of the successful
application of MAS in maize are the use of microsatellite markers for the conversion of normal
maize lines into Quality Protein Maize (QPM),
which contains more lysine and tryptophan than the
native lines (Babu et al. 2004), and the introgression
of favourable QTL for earliness and grain yield
between maize elite lines (Bouchez et al. 2002).
Wheat
Examples of commercially released genetic material include Patwin (Hard White Spring wheat),
the first variety developed by MAS released by the
University of California at Davis (http://www.plantsciences.ucdavis.edu/plantbreeding/main/history.htm),
which contains the introgressed stripe
rust resistance gene Yr17 and leaf rust resistance
gene Lr37 (Helguera et al. 2003). Similarly, several
other related genes Lr1, Lr9, Lr24 and Lr47 were
introgressed into common wheat cultivars by MAS
(Nocente et al. 2007). Marker-assisted pyramiding
of two cereal cyst nematode resistance genes from
Aegilops variabilis in wheat has also been reported
(Barloy et al. 2007). In wheat, there is extensive
use of DNA markers for cereal cyst nematode
(Heterodera avenae Woll.) resistance (Eagles et al.
2001). The extensive use of MAS in CIMMYT
wheat breeding programs is reported elsewhere.
Large wheat MAS programs have also been
developed in Australia for around 20 genes or
chromosome regions used in cultivar development.
During the last few years, remarkable progress in
implementation of MAS strategies for cultivar
development has been achieved by the MAS
Wheat Consortium in the United States, including
the completion of 80 MAS projects (visit the
consortium website for more detail).
Rice
Ashikari et al. (2005) provide a good example of
successful gene pyramiding experiments. First,
the introgression of one QTL for grain number
and one QTL for plant height separately in the
same genetic background improved both traits.
Second, the lines generated by pyramiding both
QTLs in the same genetic background exhibited
trait values slightly lower than expected based
on single introgression lines, but overall, the
addition of genetic loci was still beneficial and
permitted improvement of the yield of a strain of
rice. There are many other successful examples
in numerous species, including pyramiding of
Xa7 and Xa21 for the improvement of disease
resistance to bacterial blight in hybrid rice (Zhang
et al. 2006). Up to now, MAS in rice breeding has
mainly been utilised for the pyramiding of disease
resistances, namely, bacterial blight and blast
(Narayanan et al. 2002). In 2002, two cultivars
resistant to bacterial leaf blight, both selected using MAS, were released in
Indonesia.
Barley
In Australia, a marker linked (0.7 cM) to the Yd2
gene for resistance to barley yellow dwarf virus
was successfully used to select for resistance in a
barley backcross breeding scheme (Jefferies et al.
2003). Field test data showed that BC2F2-derived
lines containing the linked marker had fewer leaf
symptoms and higher grain yield when infected
by the virus compared to lines lacking the marker.
Castro et al. (2003) provided an example of gene
pyramiding in barley by combining a qualitative
gene with QTL alleles for resistance to barley
stripe rust. Preliminary results indicated combining qualitative and quantitative resistance genes
improved resistance levels in the presence of a
virulent race of the pathogen.
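Why a marker as close as the 0.7 cM quoted above for Yd2 is a reliable proxy for the gene can be shown with a short calculation. The sketch below is illustrative only: the use of Haldane's mapping function (which assumes no crossover interference), the choice of three meioses and the function names are assumptions, not details taken from Jefferies et al. (2003).

```python
import math

def haldane_recombination_fraction(distance_cm):
    """Convert a map distance in centimorgans to a recombination fraction
    using Haldane's mapping function (assumes no crossover interference)."""
    d_morgans = distance_cm / 100.0
    return 0.5 * (1.0 - math.exp(-2.0 * d_morgans))

def prob_linkage_retained(distance_cm, n_meioses):
    """Probability that no recombination separates marker and gene
    over `n_meioses` independent meioses."""
    r = haldane_recombination_fraction(distance_cm)
    return (1.0 - r) ** n_meioses

r = haldane_recombination_fraction(0.7)
print(round(r, 4))                              # 0.007
print(round(prob_linkage_retained(0.7, 3), 3))  # 0.979
```

Under these assumptions, fewer than 1 in 100 gametes per generation would carry the marker without the gene, which is why selection on the marker alone tracked resistance so closely.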
Soybean
Soybean yields were increased by using marker-assisted backcrossing to introgress a yield QTL
from a wild accession into commercial genetic
backgrounds (Concibido et al. 2003). Although
the yield enhancement was observed in only two
Contrasting Stories
In some cases, MAS is not as efficient as expected.
Most of the time, this depends on how stable the
QTL effects are, and these may be altered in different
ways. In some cases, the QTL effect vanishes
after MAS or introgression (Shen et al. 2001).
One can then wonder whether the QTL was a
false positive (ghost QTL) or a true positive
whose effect (expression) depended on one or
several of the interactions listed below. There is
also a tendency for supposedly additive QTL
effects not to sum up fully. Refer to Hospital
(2009) for more details on the reasons for failures of
MAS in crop plants.
Bibliography
Literature Cited
Babu ER, Mani VP, Gupta HS (2004) Combining high
protein quality and hard endosperm traits through
phenotypic and marker assisted selection in maize.
In: Proceedings of the 4th international crop science
congress, Brisbane
Bainotti C, Fraschina J, Salines JH, Nisi JE, Dubcovsky
J, Lewis SM, Bullrich L, Vanzetti L, Cuniberti
M, Campos P, Formica MB, Masiero B, Alberione E,
Helguera M (2009) Registration of BIOINTA 2004
wheat. J Plant Regist 3:165–169
Barloy D, Lemoine J, Abelard P, Tanguy AM, Rivoal R,
Jahier J (2007) Marker assisted pyramiding of two
cereal cyst nematode resistance genes from Aegilops
variabilis in wheat. Mol Breed 20:31–40
Barr AR, Jefferies SP, Warner P, Moody DB, Chalmers KJ,
Langridge P (2000) Marker-assisted selection in theory
and practice. In: Proceedings of the 8th international
barley genetics symposium, vol I. Adelaide, Australia,
pp 167–178
Beaver JS, Porch TG, Zapata M (2008) Registration of
Verano white bean. J Plant Regist 2:187–189
Bouchez A, Hospital F, Causse M, Gallais A, Charcosset
A (2002) Marker-assisted introgression of favorable
alleles at quantitative trait loci between maize elite
lines. Genetics 162:1945–1959
Bustamam M, Tabien RE, Suwarno A, Abalos MC, Kadir
TS, Ona I, Bernardo M, Veracruz CM, Leung H (2002)
Asian rice biotechnology network: improving popular
cultivars through marker-assisted backcrossing by the
NARES. Poster presented at the international rice congress, 16–20 Sept 2002, Beijing
Castro AJ et al (2003) Mapping and pyramiding of qualitative and quantitative resistance to stripe rust in barley. Theor Appl Genet 107:922–930
Concibido VC, Diers BW, Arelli PR (2004) A decade of
QTL mapping for cyst nematode resistance in soybean. Crop Sci 44:1121–1131
Concibido VC et al (2003) Introgression of a quantitative
trait locus for yield from Glycine soja into commercial
soybean cultivars. Theor Appl Genet 106:575–582
Eagles HA, Bariana HS, Ogbonnaya FC, Rebetzke GJ,
Hollamby GJ, Henry RJ, Henschke PH, Carter M
(2001) Implementation of markers in Australian wheat
breeding. Aust J Agric Res 52:1349–1356
Fraley R (2006) Presentation at Monsanto European
investor day, 10 Nov 2006. Available at www.monsanto.com/investors/presentations.asp
Hardin B (2000) Rice breeding gets marker assists.
Available at www.ars.usda.gov/is/AR/archive/dec00/rice1200.pdf.
Verified 19 Nov 2012
Hayes PM, Corey AE, Mundt C, Toojinda T, Vivar H
(2003) Registration of Tango barley. Crop Sci
43:729–731
Helguera M, Khan IA, Kolmer J, Lijavetzky D, Zhong-Qi
L, Dubcovsky J (2003) PCR assays for the Lr37-Yr17-Sr38 cluster of rust resistance genes and their use
to develop isogenic hard red spring wheat lines. Crop
Sci 43:1839–1847
Helms TC, Nelson BD, Goos RJ (2008) Registration of
Sheyenne soybean. J Plant Regist 2:2020
Ho C, McCouch R, Smith E (2002) Improvement of
hybrid yield by advanced backcross QTL analysis in
elite maize. Theor Appl Genet 105:440–448
Jantaboon J, Siangliw M, Im-mark S, Jamboonsri W,
Vanavichit A, Toojinda T (2011) Ideotypes breeding
for submergence tolerance and cooking quality by
MAS in rice. Field Crops Res 123(3):206–213
Jefferies SP, King BJ, Barr AR, Warner P, Logue SJ,
Langridge P (2003) Marker-assisted backcross introgression of the Yd2 gene conferring resistance to barley yellow dwarf virus in barley. Plant Breed
122:52–56
Lecomte L, Duff P, Buret M, Servin B, Hospital F, Causse
M (2004) Marker-assisted introgression of five QTLs
controlling fruit quality traits into three tomato lines
revealed interactions between QTLs and genetic backgrounds. Theor Appl Genet 109:658–668
Liang F, Deng Q, Wang Y, Xiong Y, Jin D, Li J, Wang B
(2004) Molecular marker-assisted selection for yield-enhancing genes in the progeny of 9311 × O.
rufipogon using SSR. Euphytica 139:159–165
Mudge J, Cregan PB, Kenworthy JP, Kenworthy WJ, Orf
JH, Young ND (1997) Two microsatellite markers that
flank the major soybean cyst nematode resistance
locus. Crop Sci 37:1611–1615
Narayanan NN, Baisakh N, Vera Cruz CM, Gnanamanickam
SS, Datta K, Datta SK (2002) Molecular breeding for
the development of blast and bacterial blight resistance
in rice cv. IR50. Crop Sci 42:2072–2079
Navarro RL, Warrier GS, Maslog CC (2006) Genes are
gems: reporting agri-biotechnology: a sourcebook
for journalists. In: International crops and research
institute for the semi-arid tropics, Patancheru, Andhra
Pradesh, India
Nocente F, Gazza L, Pasquini M (2007) Evaluation of leaf
rust resistance genes Lr1, Lr9, Lr24, Lr47 and their
introgression into common wheat cultivars by marker-assisted selection. Euphytica 155(3):329–336
Ribaut JM, Ragot M (2006) Marker-assisted selection to
improve drought adaptation in maize: the backcross
approach, perspectives, limitations, and alternatives.
J Exp Bot 58:351–360
Shen L, Courtois B, McNally KL, Robin S, Li Z (2001)
Evaluation of near-isogenic lines of rice introgressed
with QTLs for root depth through marker-aided selection. Theor Appl Genet 103:75–83
Singh VK et al (2012) Incorporation of blast resistance
into PRR78, an elite Basmati rice restorer line,
through marker assisted backcross breeding. Field
Crops Res 128:8–16
Stuber CW (1982) Improvement of yield and ear number
resulting from selection at allozyme loci in a maize
population. Crop Sci 22:737
Tanksley SD, Medino-Filho DH, Rick CM (1981) The
effect of isozyme selection on metric characters in an
interspecific backcross of tomato: basis of an early
screening procedure. Theor Appl Genet 60:291–296
Xu K, Xu X, Fukao T, Canlas P, Maghirang-Rodriguez
R, Heuer S, Ismail AM, Bailey-Serres J, Ronald PC,
Mackill DJ (2006) Sub1A is an ethylene-response-
Further Readings
Anthony VM, Ferroni M (2012) Agricultural biotechnology and smallholder farmers in developing countries.
Curr Opin Biotechnol 23:278–285
Ashikari M, Sakakibara H, Lin S, Yamamoto T, Takashi T,
Nishimura A et al (2005) Cytokinin oxidase regulates
rice grain production. Science 309:741–745
Brumlop S, Finckh MR (2010) Applications and potentials
of marker assisted selection (MAS) in plant breeding.
Final report of the F+E project Applications and
Potentials of Smart Breeding (FKZ 350 889 0020) On
behalf of the Federal Agency for Nature Conservation
December 2010. http://www.bfn.de/0502_skripten.html
Hospital F (2009) Challenges for effective marker-assisted
selection in plants. Genetica 136:303–310
Ribaut JM, Hoisington D (1998) Marker assisted selection:
new tools and strategies. Trends Plant Sci 3(6):236–239
Zong G, Ahong W, Lu W, Guohua L, Minghong G, Tao S,
Bin H (2012) A pyramid breeding of eight grain-yield
related quantitative trait loci based on marker-assistant
and phenotype selection in rice (Oryza sativa L.).
J Genet Genomics 39(7):335–350
10
Molecular Techniques
To realise the value of rapidly accumulating
data, as well as to understand the functioning of
the cell at the organism level, there is a need
for high-throughput molecular techniques. The
studies that use such techniques are collectively
called functional genomics. The term functional genomics is defined as the development
and application of global or genome-wide experimental approaches to assess gene function by
using the information and components provided
by structural genomics. Several approaches have
been used to explore the probable function of
genes, as well as to monitor their expression
in relation to other genes, and they are
explained below.
Expression Profiling
A major part of functional genomics is the
analysis of gene expression. Having knowledge
of when and where a gene product, that is,
RNA and/or protein, is expressed can give vital
information about the particular gene in question.
The very first step in generating a genome-wide
expression profile is the preparation of expressed
sequence tags (EST) profiles. ESTs are DNA
N.M. Boopathi, Genetic Mapping and Marker Assisted Selection: Basics, Practice
and Benefits, DOI 10.1007/978-81-322-0958-4_10, Springer India 2013
Subtractive Hybridisation
Subtractive hybridisation is a popular technique
for gene discovery from non-model organisms
without an annotated genome sequence. It is a
valuable tool for identifying differentially regulated genes important for cellular growth and
differentiation. Over the last decade, numerous
subtractive hybridisation techniques have been
developed and used to isolate significant genes in
many systems. Suppression subtractive hybridisation (SSH; see below) is a simple, widely
used method for separating DNA molecules that
distinguish two closely related DNA samples.
Two of the main SSH applications are cDNA
subtraction and genomic DNA subtraction. SSH is
based primarily on a suppression polymerase
chain reaction (PCR) technique and combines
normalisation and subtraction in a single procedure. The normalisation step equalises the abundance of DNA fragments within the target
population, and the subtraction step excludes
sequences that are common to the populations
being compared. This dramatically increases the
probability of obtaining low-abundance differentially expressed cDNAs or genomic DNA
Microarray
Microarrays are also called DNA chips or
biochips. DNA chips consist of a silicon,
nylon or glass support on which DNA fragments are fabricated. The DNA fragments may be
obtained from cDNA clones, EST clones, genomic
clones or DNA amplified from open reading
frames. The size of a single DNA chip varies from
1 to 3.24 cm². Yet within this small area, nearly all
the genes of a crop plant can be displayed.
DNA chip technologies utilise microscopic
arrays (microarrays) of molecules immobilised
on solid surfaces for hybridisation analysis.
Advanced arraying technologies such as photolithography, micro-spotting and ink-jetting, coupled with sophisticated fluorescence detection
systems and bioinformatics, permit molecular
Oligonucleotide-Based Chips
This type of DNA chip contains high-density
microarrays of short oligonucleotides, which are
prepared by photolithography. Such arrays contain 100,000–400,000 oligonucleotides immobilised within an area of 1.6 cm². This allows the use
of targeted regions of genomic DNA for sequencing or for large-scale analysis of single nucleotide polymorphisms (SNPs).
DNA-Based Chips or cDNA Arrays
This type of DNA chip contains high-density
DNA microarrays, most often derived from
cDNA (hence, they are currently made by robotically spotting a large number of PCR-amplified
DNA fragments onto glass or nylon surfaces).
Hybridisation is carried out with fluorescently
labelled mRNA or its corresponding cDNA, and
the hybridised duplexes are identified by colour
fluorescence detection methods. These DNA
3. Functional Genomics
Microarrays for gene expression analysis provide
an integrated platform for functional genomics.
Samples of mRNA from a variety of cells and tissues are used for microarray analysis and
yield information about specific changes
in gene expression patterns. The mRNA samples
of interest are labelled and used for hybridisation-based microarray analysis, yielding quantitative
data on the expression of thousands of cellular
genes. Parallel measurement of transcript levels
for thousands of genes is one of the most widespread uses of DNA chip technology. Both oligonucleotide and cDNA microarrays are very useful
for estimating levels of transcripts.
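Relative transcript levels from such experiments are commonly summarised as log2 ratios between two samples. The following Python sketch shows the arithmetic only; the gene names, intensity values, the `floor` parameter and the function name are hypothetical, and real analyses add background correction and normalisation steps that are not shown here.

```python
import math

def log2_ratios(treated, control, floor=1.0):
    """Return log2(treated / control) per gene from two intensity dicts.

    Intensities below `floor` are raised to `floor` to avoid taking
    the logarithm of zero."""
    ratios = {}
    for gene in treated.keys() & control.keys():
        t = max(treated[gene], floor)
        c = max(control[gene], floor)
        ratios[gene] = math.log2(t / c)
    return ratios

# Hypothetical background-corrected intensities for three genes
treated = {"geneA": 800.0, "geneB": 100.0, "geneC": 250.0}
control = {"geneA": 200.0, "geneB": 400.0, "geneC": 250.0}

ratios = log2_ratios(treated, control)
for gene in sorted(ratios):
    print(gene, ratios[gene])
# geneA 2.0   (four-fold up)
# geneB -2.0  (four-fold down)
# geneC 0.0   (unchanged)
```

The log scale is a convenient choice because up- and down-regulation of the same magnitude give values of equal size and opposite sign.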
4. Reverse Genetics
DNA chips can also be used for characterisation
of mutant populations exposed to various selection pressures, to collect information about the
fitness value of a variety of alleles for each of the
large number of genes in a species. This is done
particularly in organisms for which the complete
genome sequence is already available, by
studying the impact of deletions/insertions and then analysing their fitness (such an
approach, in which a study starts with a DNA
sequence and concludes with the analysis of
phenotype, is described as reverse genetics).
This can be achieved if the mutants are first subjected to a selection pressure and then characterised. The approach can be illustrated using the example of
yeast, whose genome has been completely
sequenced and was shown to carry 6,000 open
reading frames (ORFs). Unique molecular
sequences or bar codes can be introduced in
each of these 6,000 ORFs in the yeast
genome. A mixture of yeast strains containing
individual bar codes for all 6,000 genes is then
subjected to a selection pressure. Samples of
cells are taken, and bar code sequences are
8. Proteomics
Proteomics includes the
study of protein–protein interactions. As in genomics, DNA chips
can also be used for this area of study. Protein
linkage maps can also be created using genomic
sequence information. Protein–protein interactions can be studied using the yeast two-hybrid
system. In this system, two fusion proteins are
used for the activation of transcription of a
reporter gene in yeast. The first fusion protein
454 Pyrosequencing
The 454 system was the first next-generation
sequencing platform available as a commercial
product. In this approach, libraries may be constructed by any method that gives rise to a mixture of short, adaptor-flanked fragments. Clonal
sequencing features are generated by emulsion
PCR, with amplicons captured to the surface of
28-µm beads. After breaking the emulsion,
beads are treated with denaturant to remove
untethered strands and then subjected to a
hybridisation-based enrichment for amplicon-bearing beads (i.e. those that were present in an
emulsion compartment supporting a productive
PCR reaction). A sequencing primer is hybridised to the universal adaptor at the appropriate
position and orientation, that is, immediately
adjacent to the start of unknown sequence.
Sequencing is performed by the pyrosequencing method. In brief, the amplicon-bearing beads
are pre-incubated with Bacillus stearothermophilus (Bst) polymerase and single-stranded binding
protein and then deposited onto a micro-fabricated array of picolitre-scale wells (with dimensions such that only one bead will fit per well) to
render this biochemistry compatible with array-based sequencing. Smaller beads are also added,
bearing immobilised enzymes that are also
required for pyrosequencing (e.g. ATP sulfurylase and luciferase). During sequencing, one
side of the semi-ordered array functions as a flow
cell for introducing and removing sequencing
reagents, whereas the other side is bonded to a
fibre-optic bundle for CCD (charge-coupled
device)-based signal detection. At each of several
hundred cycles, a single species of unlabelled
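The cyclic logic of pyrosequencing, in which one nucleotide species is supplied per cycle, can be sketched in Python as an idealised signal trace. This is a simplified model under stated assumptions: the flow order `TACG`, the noise-free signal equal to homopolymer length and the function names are illustrative choices, not specifications of the 454 instrument.

```python
def flowgram(synthesised, flow_order="TACG", n_flows=8):
    """Simulate idealised pyrosequencing signals.

    One nucleotide species is flowed per cycle; the signal equals the number
    of consecutive bases incorporated (homopolymer length), or 0 if the next
    base of the growing strand does not match the flowed nucleotide."""
    signals = []
    pos = 0
    for i in range(n_flows):
        base = flow_order[i % len(flow_order)]
        run = 0
        while pos < len(synthesised) and synthesised[pos] == base:
            run += 1
            pos += 1
        signals.append((base, run))
    return signals

def call_bases(signals):
    """Reconstruct the synthesised sequence from (base, signal) pairs."""
    return "".join(base * run for base, run in signals)

flows = flowgram("TTAGGC")
print(flows)
print(call_bases(flows))  # TTAGGC
```

In this model, a run of identical bases is read in a single cycle as a proportionally stronger signal, so the read is recovered from signal strength rather than from one base per cycle.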
AB SOLiD
This platform has its origins in the system
described by J. Shendure and colleagues in 2005
and in work by McKernan and colleagues at
Agencourt Personal Genomics (Beverly, MA,
USA), which was acquired by Applied Biosystems
(Foster City, CA, USA) in 2006. Libraries may
be constructed by any method that gives rise to a
mixture of short, adaptor-flanked fragments,
though much effort with this system has been put
into protocols for mate-paired tag libraries with
controllable and highly flexible distance distributions. Clonal sequencing features are generated
by emulsion PCR, with amplicons captured to the
surface of 1-µm paramagnetic beads. After breaking the emulsion, beads bearing amplification
products are selectively recovered and then immobilised to a solid planar substrate to generate a
dense, disordered array. Sequencing by synthesis
is driven by a DNA ligase, rather than a polymerase. A universal primer complementary to the
adaptor sequence is hybridised to the array of
amplicon-bearing beads. Each cycle of sequencing involves the ligation of a degenerate population of fluorescently labelled octamers. The octamer
mixture is structured, in that the identity of
specific position(s) within the octamer (e.g. base
5) correlates with the identity of the fluorescent
label. After ligation, images are acquired in four
channels, effectively collecting data for the same
base positions across all template-bearing beads.
Then, the octamer is chemically cleaved between
positions 5 and 6, removing the fluorescent label.
Progressive rounds of octamer ligation enable
sequencing of every 5th base (e.g. bases 5, 10,
15, 20). Upon completing several such cycles, the
extended primer is denatured to reset the system.
Subsequent iterations of this process can be
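The position arithmetic of the ligation rounds described above can be made concrete with a small Python sketch. The first call reproduces the series given in the text (bases 5, 10, 15, 20). The `primer_offset` parameter, which shifts the series after a reset, is a hypothetical illustration of how further rounds could fill in the remaining positions; it is an assumption, not a detail stated in the source.

```python
def interrogated_positions(n_ligations, step=5, primer_offset=0):
    """Template positions read in one primer round.

    Each ligation reads the base `step` positions beyond the previous one.
    `primer_offset` (hypothetical) shifts the whole series back by that many
    bases, modelling a reset with a primer annealed further upstream."""
    return [step * k - primer_offset for k in range(1, n_ligations + 1)]

print(interrogated_positions(4))                   # [5, 10, 15, 20]
print(interrogated_positions(4, primer_offset=1))  # [4, 9, 14, 19]

# Five primer rounds with offsets 0..4 together cover every position up to 20
covered = sorted(
    pos
    for offset in range(5)
    for pos in interrogated_positions(4, primer_offset=offset)
    if pos > 0
)
print(covered == list(range(1, 21)))  # True
```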
HeliScope
The Helicos sequencer, based on work by Quake's
group, also relies on cyclic interrogation of a dense
array of sequencing features. However, a unique
aspect of this platform is that no clonal amplification
is required. Instead, a highly sensitive fluorescence
Microchip-Based Electrophoretic
Sequencing
Significant progress has been made toward developing methods whereby conventional electrophoretic sequencing can be carried out on a
micro-fabricated device. The primary advantages
of this approach include faster processing times
and substantial reductions in reagent consumption. An ideal device for this purpose would integrate all aspects of sample processing, with
microfluidic transport of the reaction volume
between steps, for example, clonal amplification
by nanoliter-scale PCR from a single cell or a
single template molecule; template purification;
cycle sequencing reaction; isolation and concentration of extension fragments; and injection into
a microchannel for electrophoretic separation
(potentially parallelised, e.g. with 384 or more
channels concentrically arranged around a
rotating fluorescence scanner). Many of the key
challenges have already been overcome in proof-of-concept experiments. Although it is unclear at
present whether these efforts will
be able to keep pace with cyclic-array sequencing
and other strategies, it is worth bearing in mind
that the Sanger biochemistry coupled to electrophoretic separation remains by far the best option
for DNA sequencing in terms of read-length and
accuracy; we simply lack methods to parallelise
it to the extent possible with cyclic-array strategies. One could imagine that lab-on-a-chip
nucleic acid analysis could supplant conventional
DNA sequencing for low-scale applications and
may also prove useful in the context of point-of-care diagnostics.
Sequencing by Hybridisation
The basic concept of sequencing by hybridisation
is that the differential hybridisation of labelled
nucleic acid fragments to an array of oligonucleotide probes can be used to precisely identify
variant positions. Usually, the oligos tethered to
the array are designed as a tiling representation of
the reference sequence corresponding to the
genome of interest. In the approach taken
by Affymetrix (Santa Clara, CA, USA) and
Perlegen (Mountain View, CA, USA) (in performing extensive SNP discovery in, e.g. human,
mouse and yeast), each possible single-base substitution is represented on the array by an independent feature. Roche NimbleGen (Madison,
WI, USA), in performing sequencing by hybridisation of microbial genomes, takes a two-tier
approach, with an initial array directed at performing approximate localisation, and a second
custom array directed at pinpointing and
confirmation of variant positions. Although
microarrays are clearly useful and cost effective
for genomic resequencing as well as a range of
other genome-scale applications (see above), it is
unclear what will happen as next-generation
sequencing technologies begin to compete for
many of the same applications (e.g. resequencing, but also expression analysis, structural variation analysis, DNA-protein binding).
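The tiling design described above, with one feature per possible single-base substitution, can be sketched in Python as follows. This is a simplified illustration: the 5-base probe length, the toy reference sequence and the function name are assumptions, and the probes are written in the same orientation as the reference rather than as the complementary strand that a physical array would carry.

```python
def substitution_probes(reference, position, probe_length=5):
    """Return the probes interrogating `position` (0-based) of `reference`:
    one matching the reference and three carrying each alternative base
    at the central position of the probe."""
    half = probe_length // 2
    if position < half or position + half >= len(reference):
        raise ValueError("position too close to the end of the reference")
    window = reference[position - half: position + half + 1]
    probes = {}
    for base in "ACGT":
        probes[base] = window[:half] + base + window[half + 1:]
    return probes

reference = "GATTACAGGCT"
probes = substitution_probes(reference, position=5)
print(probes)
# {'A': 'TAAAG', 'C': 'TACAG', 'G': 'TAGAG', 'T': 'TATAG'}
```

A sample carrying the reference base hybridises best to the `C` probe in this toy case, whereas a variant sample would give its strongest signal on one of the other three features.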
In terms of sequencing, limitations of
microarrays include the following: (1) Sequences
that are repetitive or subject to cross hybridisation cannot easily be interrogated; (2) it remains
unclear how de novo sequencing can be achieved
with hybridisation-based strategies; and (3)
without very careful data analysis, false positives pose an important problem, and it is not
clear how to obtain the equivalent of redundant
coverage that is possible with conventional and
cyclic-array sequencing. Thus far, sequencing
by hybridisation has likely had its greatest impact
in the context of genome-wide association
studies, which rely on array-based interrogation
(i.e. genotyping by hybridisation) of a highly
defined set of discontinuous genomic coordinates. A different (and earlier) take on the idea of
sequencing by hybridisation involves serial or
For example, protein-encoding genes are characterised by an open reading frame, which includes
a start codon and a stop codon in the same reading frame.
Specific sequences mark the splice sites at the
beginning and end of introns; other specific
sequences are present in promoters immediately
upstream of start codons. Still other sequences
are associated with particular functions in certain
classes of proteins. Computer programs have
been developed that scan the DNA for these
sequences and identify genes on the basis of their
presence and position. Some of these programs
are capable of examining databases of EST and
protein sequences to see if there is evidence that
a potential gene is expressed.
It is important to recognise that the programs
that have been developed to identify genes on the
basis of DNA sequence are not perfect. Therefore,
the numbers of genes reported in most genome
projects are estimates. The presence of multiple
introns, alternative splicing, multiple copies of
some genes and much non-coding DNA between
genes makes accurate identification and counting
of genes difficult.
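The simplest signal mentioned above, an open reading frame bounded by a start codon and an in-frame stop codon, can be scanned for with a few lines of Python. This is a minimal sketch, not a description of any published gene-finding program: it searches the forward strand only, ignores introns, splice sites and promoters, and the function name and `min_codons` threshold are illustrative assumptions.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}

def find_orfs(dna, min_codons=3):
    """Return (start, end, frame) for ORFs on the forward strand.

    An ORF begins at an ATG and runs to the first in-frame stop codon;
    `end` is exclusive and includes the stop codon. `min_codons` counts
    codons from the ATG up to, but excluding, the stop codon."""
    dna = dna.upper()
    orfs = []
    for frame in range(3):
        start = None
        for i in range(frame, len(dna) - 2, 3):
            codon = dna[i:i + 3]
            if start is None and codon == "ATG":
                start = i
            elif start is not None and codon in STOP_CODONS:
                if (i - start) // 3 >= min_codons:
                    orfs.append((start, i + 3, frame))
                start = None
    return orfs

seq = "AAATGGCCTTTTAGCC"
for start, end, frame in find_orfs(seq):
    print(start, end, frame, seq[start:end])
# 2 14 2 ATGGCCTTTTAG
```

The limitations listed in the text apply directly: a scanner of this kind would miss a gene interrupted by introns and would report spurious short ORFs in non-coding DNA, which is one reason gene counts remain estimates.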
Homology Searches
One computational method (often the first
employed) for determining gene function is to
conduct a homology search, which relies on
cDNA-AFLP
For many years the isolation of genes for which
products and mutants were not known was only
possible by differential screening of cDNA libraries. The first in vitro technique for the determination of transcript patterns was differential display
reverse transcription PCR (DDRT-PCR). For the
first time it was possible to determine simultaneously a large part of the transcripts present in a
eukaryotic cell within a single experiment with
high sensitivity. The technique was applied
widely, and for several years no other method
was available by which comprehensive transcript
patterns of eukaryotic cells could be obtained.
Later, Fischer and his group combined DDRT-PCR and amplified fragment length polymorphism (AFLP), a method developed by Vos et al.
in 1995 for the characterisation of genomic DNA.
The new technique, termed restriction fragment
length polymorphism-coupled domain-directed
differential display (RC4D), provided a useful
tool to detect differentially expressed members of
individual gene families. The cDNA-AFLP technique is based on the selective PCR amplification
of adapter-ligated restriction fragments derived
three steps: (1) restriction of cDNA and ligation of oligonucleotide adapters, (2) selective
amplification of sets of restriction fragments
using PCR primers bearing selective nucleotides at the 3′ end and (3) gel analysis of the
amplified fragments. Restriction of plant
cDNA with a combination of two restriction
enzymes, a tetra cutter and a hexa cutter,
allows a significant fraction of the cDNA population to be cleaved and to be represented as
a discrete banding pattern on a sequencing
gel. In genomic AFLP with plant DNA, three
selective bases on the end of each primer are
required to give a useful banding pattern. The
lower complexity of cDNA allows the use of
two selective bases for each primer giving a
total of 256 possible primer combinations.
The largest cDNA-AFLP products visible on a
polyacrylamide sequencing gel are around
1,000 bp in size, the lower end of the gel representing approx. 100 bp. In this size window,
an average of 40 bands can be observed for
each primer combination, corresponding to a
total of approx. 10,000 bands.
2. cDNA-AFLP with One Restriction Enzyme
A systematic comparison of known potato
cDNA sequences showed that approx. 45%
are cleaved by the AseI/TaqI restriction
enzyme combination. Thus, if only
one pair of enzymes is applied, about half of
the transcripts present in a cell will not be
detected by the standard cDNA-AFLP technique. To obtain more comprehensive patterns, the cDNA-AFLP protocol has been modified:
it was shown that the rarely cutting enzyme can
be omitted and that meaningful banding patterns
can be produced using TaqI alone. Samples
derived from buds of red and white flowers of
the common morning glory (Ipomoea purpurea) were compared using 96 different primer
combinations, each of which gave approximately 50 bands, corresponding to a total of
approximately 5,000 bands.
3. iAFLP
iAFLP (introduced AFLP) is a quantitative
high-throughput expression profiling method
specifically designed to measure the concentrations of known transcripts in numerous
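The primer-combination and band-count figures quoted above for the two-enzyme and one-enzyme protocols follow from simple counting, which can be checked with a short Python sketch. The function names are hypothetical; the inputs (two or three selective bases per primer, 40 or 50 bands per combination, 96 primer combinations) are the values given in the text.

```python
def primer_combinations(selective_bases_per_primer, n_primers=2):
    """Number of distinct primer pairs when each selective position can be A, C, G or T."""
    return 4 ** (selective_bases_per_primer * n_primers)

def expected_bands(n_combinations, bands_per_combination):
    """Total number of bands if every primer combination gives the same average count."""
    return n_combinations * bands_per_combination

cdna_combinations = primer_combinations(2)      # two selective bases per primer (cDNA)
genomic_combinations = primer_combinations(3)   # three selective bases per primer (genomic DNA)

print(cdna_combinations, genomic_combinations)
print(expected_bands(cdna_combinations, 40))    # two-enzyme protocol, about 10,000 bands
print(expected_bands(96, 50))                   # TaqI-only protocol, about 5,000 bands
```

Each additional selective base divides the number of fragments amplified per reaction by roughly four, which is why the less complex cDNA template needs fewer selective bases than genomic DNA.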
Applications
cDNA-AFLP and its application to plants was
first described by Bachem et al. in 1996, who
analysed differential gene expression in a synchronised potato in vitro tuberisation system.
During screening with different primer combinations, two lipoxygenase cDNA fragments were
isolated on the basis of their differential expression during potato tuber formation. Both transcripts are highly tuber specific and are expressed
strongly in 15-d-old tubers, but not in stolons,
leaves or petioles and only at very low levels in
stems. The dramatic induction of a lipoxygenase
gene just after the start of tuberisation led the
authors to speculate that the expression of at least
one of these enzymes might directly be linked to
the tuber development process. Following this
initial report, a small number of papers have
described the use of cDNA-AFLP fingerprinting
in plant and animal systems. Habu et al. in 1997
compared mRNA samples obtained from the
flower buds of two lines of Ipomoea purpurea.
Fourteen cDNA fragments (approximately 0.3%)
amplified differently in the two samples. Two of
these were shown to have been derived from a
gene that was actively expressed in the buds of
red flowers but not in those of white flowers.
Sequence analysis showed that this cDNA carries
a sequence highly homologous to the chalcone
RFLP-Coupled Domain-Directed
Differential Display (RC4D)
Many genes and their protein products have a
modular structure where the presence of certain
domains (family-specific domains, FSDs) defines
membership in different gene families. This is
well characterised for the chlorophyll a/b binding
proteins and for many transcription factors.
Restriction fragment length polymorphism-coupled domain-directed differential display (RC4D,
which was first described by Fischer and his team
in 1995) is a method specifically designed to
analyse expression of multi-gene families at different developmental stages, in diverse tissues or
in different organisms. RC4D combines cDNA-AFLP technology with a gene family-specific
version of DDRT-PCR. In RC4D, instead of arbitrary decameric primers, longer primers directed
against an FSD are used, allowing cDNAs belonging to the same gene family to be selectively
amplified. As the amplification products are relatively uniform in length, restriction fragment
length polymorphism (RFLP) is introduced by
digestion with a frequently cutting restriction
enzyme. This reduces the amplicon size from
approximately 1 kbp to several hundred base
pairs, which is optimal for separation on acrylamide gels. Family members can thus easily be
distinguished by size. Briefly, the RC4D protocol is
as follows. cDNA is synthesised from
mRNA with an oligo(dT) primer bearing a PCR
downstream primer binding sequence at its 5′
end. PCR is performed with the downstream
primer and an upstream primer specific for a
family-specific domain (FSD). This results in a
mixture of truncated family member cDNAs. The
amplicon is digested with a frequently cutting
restriction enzyme, and double-stranded linkers
are ligated to the cohesive ends. PCR with a
linker primer and an FSD primer results in a
T-DNA Tag
The process of gene tagging using T-DNA as the
insert has been used effectively to isolate genes,
especially in Arabidopsis. T-DNA insertional
mutagenesis has also been used to produce 22,090
primary transgenic rice plants carrying approximately 25,700 tags. Another efficient T-DNA
tagging system for japonica rice has also been
described, in which over 1,000 T-DNA tags in the rice
genome have been characterised. This analysis clearly
revealed that insertion occurred preferentially
in gene-rich regions.
Transposon Tags
Transposons, first recognised by Barbara
McClintock in maize, have become a powerful
tool for gene isolation. The mutagenic potential
of mobile elements, their ability to tag the
mutated sequences and their widespread
distribution have all been exploited for the
cloning of genes. The application of transposon
tagging was initially restricted to plants, such as
maize (Zea mays) and snapdragon (Antirrhinum),
with active and well-characterised endogenous
transposons. Maize transposon systems have now also been used for mutagenesis in heterologous transgenic plant species that otherwise
lack an active endogenous transposon family.
For example, the Ac element was introduced
into rice, and plants in which it had transposed were identified by screening for hygromycin resistance, since the
autonomous Ac element had been cloned
between the promoter and the hph-coding
region (so that excision of Ac restores hph expression). The maize Ac-Ds system
has also been used effectively for gene tagging in
rice. Retrotransposons, transposable
elements that transpose via an RNA intermediate
and are structurally similar to integrated copies of
retroviruses, have also been shown to be efficient
gene tags as demonstrated by the introduction
of tobacco retrotransposon Tto1 into rice and
its autonomous transposition through reverse
transcription.
Classical genetic approaches to identify
genes, as mentioned earlier, are generally based
on the creation of mutations leading to a recognisable phenotype reflecting the gene function,
such as in gene tagging. However, this is not
always possible, since many genes show functional redundancy, and thus mutation in one gene
or locus could be compensated for by the functioning of one or more other family members.
Moreover, certain genes function at different
stages of development. Mutations in such genes
could cause early lethality or could be highly
pleiotropic. This can thus prevent the identification
of the role of the gene. Trapping techniques have
been developed keeping these limitations in
mind. Entrapment strategies rely on the use of
inserts, such as transposons or T-DNA, containing reporter gene constructs, whose expression is
dependent on cis-acting regulatory sequences at
the site of insertion. The inserts then allow for
the identification of genes, based on their expression pattern, even though they might not display
an obvious mutant phenotype. Three basic types
of gene traps are constructed using reporter
genes such as those encoding β-glucuronidase
(GUS) and green fluorescent protein (GFP):
enhancer trap, promoter trap and gene trap.
Another approach used to assess gene function
is activation tagging. This technique is based on
MicroRNAs
Post-transcriptional Gene Silencing
Epigenetic regulation of gene expression is a heritable change in gene expression that cannot be
explained by changes in gene sequence. It can
result in the repression or activation of gene
expression and is therefore referred to as gene
silencing or gene activation, respectively. Until
the end of the 1980s, only modifications of DNA
or protein that lead to transcriptional repression
or activation, or to the formation of prions, were
classified as epigenetic. During the 1990s, however, a number of gene-silencing phenomena that
occur at the post-transcriptional level were discovered in plants, fungi, animals and ciliates,
introducing the concept of post-transcriptional
gene silencing (PTGS) or RNA silencing. PTGS
results in the specific degradation of a population
of homologous RNAs. It was first observed after
introduction of an extra copy of an endogenous
gene (or of the corresponding cDNA under the
control of an exogenous promoter) into plants.
Because RNAs encoded by both transgenes and
homologous endogenous gene(s) were degraded,
the phenomenon was originally called co-suppression. A similar phenomenon in the fungus
Neurospora crassa was named quelling. Later,
several groups showed that PTGS can also affect
transgenes that are not homologous to endogenous genes, suggesting that this phenomenon is
not a simple regulatory mechanism that controls
the expression of endogenous genes. Fire et al. in
1998 identified a related mechanism, RNA interference (RNAi), in animals. RNAi results in the
specific degradation of endogenous RNA in the
presence of homologous dsRNA either locally
injected or transcribed from an inverted repeat
transgene. Injected dsRNA, as well as transgenes
expressing dsRNA, also triggers silencing of
homologous (trans)genes in plants. This strongly
suggests that a mechanistic link between PTGS,
Biochemical Techniques
Biochemistry is the study of the chemical processes that occur in living organisms, with the
ultimate aim of understanding the nature of life in
molecular terms. Several biochemical
techniques play a role in unravelling the
molecular basis of life. One- and two-dimensional electrophoresis are the most widely used
techniques in protein identification and characterisation. Mass spectrometry is mainly used to
study protein structure and function (proteomics) and to profile small metabolites (metabolomics). Many
biochemical techniques
have potential application in MAS; only a
few major ones are discussed below.
Plant Proteomics
Proteins are the workhorses of the cell and have
important functions in both normal and abnormal
states. In order to understand how proteins interact and regulate various cellular processes, it is
important to understand their expression behaviour under a wide range of experimental conditions. Unlike the genome which contains a fixed
number of genes, the levels of protein within the
cells are highly dynamic. Proteins are constantly
processed within the cell in response to external
stimuli and undergo a wide range of posttranslational modifications. As a result, it is hard to
accurately determine the exact number or quantities of proteins which are present within the biological systems. In addition, protein families are
extremely diverse and have considerable differences in their physical sizes, chemical and structural properties, affinity constants and relative
abundance within the cells. As a result, accurately
Why Proteomics?
Many types of information cannot be obtained
from the study of QTLs or genes alone. For
example, proteins (and, in turn, metabolites), not genes,
are responsible for the phenotypes of cells. It is
impossible to elucidate mechanisms of growth
and development, disease, aging and effects of
the environment solely by studying the genome.
Only through the study of proteins can protein
modifications be characterised and the targets of
drugs identified.
1. Annotation of the Genome
One of the first applications of proteomics will
be to identify the total number of genes in a
given genome. This functional annotation of a
genome is necessary because it is still difficult
to predict genes accurately from genomic data.
One problem is that the exon-intron structure
of most genes cannot be accurately predicted
by bioinformatics. To achieve this goal, genomic
information will have to be integrated with
data obtained from protein studies to confirm
the existence of a particular gene.
2. Protein Expression Studies
In recent years, the analysis of mRNA expression by various methods has become increasingly popular. These methods include SAGE
and DNA microarray technology (see above).
However, the analysis of mRNA is not a
direct reflection of the protein content in the
cell. Indeed, many studies have now
shown a poor correlation between mRNA
and protein expression levels. The formation
of mRNA is only the first step in a long
sequence of events resulting in the synthesis
of a protein. First, mRNA is subject to posttranscriptional control in the form of alternative splicing, polyadenylation and mRNA
editing. Many different protein isoforms can
be generated from a single gene at this step.
Second, mRNA then can be subject to regulation at the level of protein translation.
Proteins, having been formed, are subject to
posttranslational modification. It is estimated
that up to 200 different types of posttranslational protein modification exist. Proteins
Functional Proteomics
Functional proteomics is a broad term for
many specific, directed proteomics approaches.
In some cases, specific subproteomes are isolated by affinity chromatography for further
analysis. This could include the isolation of protein complexes or the use of protein ligands to
isolate specific types of proteins. This approach
allows a selected group of proteins to be studied
and characterised and can provide important
information about protein signalling, disease
mechanisms or protein-drug interactions.
Protein Analysis
Types of Proteomics
Protein Expression Proteomics
The quantitative study of protein expression
between samples that differ by some variable is
known as expression proteomics. In this approach,
protein expression of the entire proteome or of
subproteomes between samples can be compared.
Information from this approach can identify
novel proteins in signal transduction or identify
disease-specific proteins.
Structural Proteomics
Proteomics studies whose goal is to map out the
structure of protein complexes or the proteins
present in a specific cellular organelle are known
as cell map or structural proteomics. Structural
proteomics attempts to identify all the proteins
within a protein complex or organelle, determine where they are located and characterise all
protein-protein interactions. An example of
structural proteomics is the analysis of the
nuclear pore complex. Isolation of specific subcellular organelles or protein complexes by
purification can greatly simplify the proteomic
analysis. This information will help piece together
the overall architecture of cells and explain how
expression of certain proteins gives a cell its
unique characteristics.
Alternatives to Electrophoresis
in Proteomics
The limitations of 2-DE have inspired a number
of approaches to bypass protein gel electrophoresis. One approach is to convert an entire protein
Mass Spectrometry
MS enables protein structural information, such as
peptide masses or amino acid sequences, to be
obtained. This information can be used to identify
the protein by searching nucleotide and protein
databases. It also can be used to determine the type
and location of protein modifications. The harvesting of protein information by MS can be divided
into three stages: (1) sample preparation, (2) sample ionisation and (3) mass analysis.
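The idea of identifying a protein from its peptide masses can be sketched in Python as follows. This is a minimal, hedged illustration and not the procedure of any particular search engine: the toy database, its two sequences, the function names and the 0.01 Da tolerance are all assumptions made for the example. Peptides are generated with the commonly used trypsin rule (cleave after K or R unless the next residue is P), and masses are monoisotopic.

```python
# Monoisotopic residue masses (Da) of the 20 standard amino acids.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056  # one water per peptide accounts for the free N- and C-termini

def tryptic_digest(protein):
    """Split a protein after K or R, except when the next residue is P."""
    peptides, start = [], 0
    for i, residue in enumerate(protein):
        next_residue = protein[i + 1] if i + 1 < len(protein) else ""
        if residue in "KR" and next_residue != "P":
            peptides.append(protein[start:i + 1])
            start = i + 1
    if start < len(protein):
        peptides.append(protein[start:])  # C-terminal peptide without K/R
    return peptides

def peptide_mass(peptide):
    """Neutral monoisotopic mass of an unmodified peptide."""
    return sum(RESIDUE_MASS[residue] for residue in peptide) + WATER

def fingerprint_score(observed_masses, protein, tol=0.01):
    """Count observed masses that match a theoretical tryptic peptide of the protein."""
    theoretical = [peptide_mass(p) for p in tryptic_digest(protein)]
    return sum(any(abs(m - t) <= tol for t in theoretical) for m in observed_masses)

# Hypothetical two-entry database and a simulated set of measured masses.
toy_database = {"protein_A": "MKWVTFRPLLK", "protein_B": "AAKGGRSSK"}
observed = [peptide_mass(p) for p in tryptic_digest("AAKGGRSSK")]
best_hit = max(toy_database, key=lambda name: fingerprint_score(observed, toy_database[name]))
print(best_hit)
```

Real searches must also allow for missed cleavages, modified residues and measurement error, all of which this sketch ignores.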
Sample Preparation
In most proteomics workflows, a protein is resolved from
a mixture by using a 1- or 2-D polyacrylamide
gel. The challenge is to extract the protein or its
constituent peptides from the gel, purify the sample and analyse it by MS. The extraction of whole
proteins from gels is inefficient; however, if a
protein is in-gel digested with a protease, many
of the peptides can be extracted from the gel. A
method for in-gel protein digestion was developed and is now commonly applied to both 1- and 2-D gels. In-gel digestion is more efficient at
sample recovery than other common methods
such as electroblotting. In addition, the conversion of a protein into its constituent peptides provides more information than can be obtained
from the whole protein itself. For many applications, the peptides recovered following in-gel
digestion need to be purified to remove gel contaminants. Common impurities from electrophoresis such as salts, buffers and detergents can
interfere with MS. In addition, peptide samples
Sample Ionisation
For biological samples to be analysed by MS, the
molecules must be charged and dry. This is
accomplished by converting them to desolvated
ions. The two most common methods for this are
electrospray ionisation (ESI) and matrix-assisted
laser desorption/ionisation (MALDI). In both
methods, peptides are converted to ions by the
addition or loss of one or more protons. ESI and
MALDI are soft ionisation methods that allow
the formation of ions without significant loss of
sample integrity. This is important because it
enables accurate mass information to be obtained
about proteins and peptides in their native states.
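Because ions are formed by gaining or losing protons, the mass-to-charge ratio that the instrument records follows directly from the neutral mass and the number of charges. The Python sketch below illustrates this relationship; the function name mz and its arguments are hypothetical, and the example mass of 1,000 Da is arbitrary.

```python
PROTON_MASS = 1.007276  # mass of a proton in Da

def mz(neutral_mass, charge, positive=True):
    """m/z of an ion formed by adding (positive mode) or removing (negative mode) protons."""
    if charge < 1:
        raise ValueError("charge must be a positive integer")
    sign = 1 if positive else -1
    return (neutral_mass + sign * charge * PROTON_MASS) / charge

# A peptide of 1,000 Da observed in different charge states.
for z in (1, 2, 3):
    print(z, round(mz(1000.0, z), 4))
```

Multiply charged ions, which are typical of ESI, therefore appear at lower m/z values than the singly charged ion of the same peptide.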
(a) Electrospray Ionisation: In ESI, a liquid sample flows from a microcapillary tube into the
orifice of the mass spectrometer, where a
potential difference between the capillary and
the inlet to the mass spectrometer results in the
generation of a fine mist of charged droplets.
As the solvent evaporates, the sizes of the
droplets decrease, resulting in the formation of
desolvated ions. A significant improvement in
ESI technology occurred with the development
of nanospray ionisation. In nanospray ionisation, the microcapillary tube has a spraying
orifice of 1–2 µm and flow rates as low as
5–10 nl/min. The low flow rates possible with
nanospray ionisation reduce the amount of
sample consumed and increase the time available for analysis. For ESI, there are several
ways to deliver the sample to the mass spectrometer. The simplest method is to load
individual microcapillary tubes with sample.
Because a new microcapillary tube is used for
each sample, cross-contamination is avoided.
In ESI, peptides require some form of
purification after in-gel digestion, and this can be
accomplished directly in the microcapillary
Mass Analysis
Mass analysis follows the conversion of proteins
or peptides to molecular ions. This is accomplished by the mass analysers in a mass spectrometer, which resolve the molecular ions on the
basis of their mass and charge in a vacuum.
(a) Quadrupole Mass Analysers: One of the most
common mass analysers is the quadrupole
mass analyser. Here, ions are transmitted
through an electric field created by an array
of four parallel metal rods, the quadrupole.
A quadrupole can act to transmit all ions or as
a mass filter to allow the transmission of ions
of a certain mass-to-charge (m/z) ratio. If
multiple quadrupoles are combined, they can
be used to obtain information about the
amino acid sequence of a peptide. For a more
Peptide Fragmentation
As peptide ions are introduced into the collision
chamber, they interact with the collision gas (usually nitrogen or argon) and undergo fragmentation primarily along the peptide backbone. Since
peptides can undergo multiple types of fragmentation, nomenclature has been created to indicate
which types of ions have been generated. If, after
peptide bond cleavage, the charge is maintained
on the N-terminus of the ion, it is designated a
Phosphoprotein Enrichment
Proteomics Approach to Protein
Phosphorylation
Posttranslational modification of proteins is a
fundamental regulatory mechanism, and characterisation of protein modifications is paramount
for understanding protein function. MS is one of
the most powerful tools for the analysis of protein modifications because virtually any type of
protein modification can be identified. Although
we focus here on protein phosphorylation, the
analysis of other types of protein modification by
MS can also be done.
Protein phosphorylation is one of the most
common of all protein modifications and has been
found in nearly all cellular processes. MS can be
used to identify novel phosphoproteins, measure
changes in the phosphorylation state of proteins
in response to an effector and determine phosphorylation sites in proteins. Identification of
phosphorylation sites can provide information
about the mechanism of enzyme regulation and
the protein kinases and phosphatases involved.
A proteomics approach to protein phosphorylation has the advantage that instead of studying
changes in the phosphorylation of a single
protein in response to some perturbation, one
can study all the phosphoproteins in a cell (the
phosphoproteome) at the same time. A common
approach to studying protein phosphorylation
events is the use of in vivo labelling of phosphoproteins with inorganic 32P. The phosphoproteomes of cells that differ in some way (e.g.
normal vs. water stressed) can be analysed by
growing cells in inorganic 32P and creating cell
lysates. Changes in the phosphorylation state of
etry (CEMS) and Fourier transform ion cyclotron resonance mass spectrometry (FT-ICR-MS)
systems has been demonstrated for metabolite
profiling. The first of these, CEMS, is a highly
sensitive methodology that can detect low-abundance metabolites and that provides good analyte
separation, whereas the second, FT-ICR-MS,
relies solely on very high-resolution mass analysis, which potentially enables the measurement
of the empirical formula for thousands of
metabolites; however, it is somewhat limited by
the lack of chromatographic separation. NMR
approaches, which rely on the detection of magnetic nuclei of atoms after application of a constant magnetic field, are the main alternative to
MS-based approaches for metabolite profiling.
These are well-developed and well-validated
methods, and the computer software associated
with NMR instrumentation is, consequently,
also advanced. Furthermore, despite limitations
in its sensitivity and, therefore, in metabolite coverage, NMR retains an advantage over MS-based
approaches for certain biological questions. For
example, it can be used non-invasively (i.e. on
living cells). Because the pH of the vacuole is
different from that found elsewhere in the cell,
NMR can also provide subcellular information, and it
is easier to derive atomic information for flux
modelling from NMR than from MS-based
approaches.
Physiological Techniques
Several physiological criteria (including the physiological traits that determine yield under
normal and unfavourable environments and the
genetic basis of such traits) need to
be evaluated before starting a molecular breeding programme. The use of physiological traits
(such as tillering,
xylem vessel diameter, leaf dimensions, stomatal
or cuticular water loss and harvest index) as indirect selection indices for yield in breeding
programmes has been discussed elsewhere. As in
the previous sections, only a few physiological techniques are explained below, although a large array of
techniques is available to increase the efficiency
of QTL mapping and MAS.
Nitrogenous compounds: proteins, betaine, glutamate, aspartate, glycine, choline, putrescine
Organic acids: oxalate, malate
Genomics-Assisted Breeding
A number of resources for major crop species
including detailed, high-density genetic maps,
cytogenetic stocks, contig-based physical maps
and deep coverage and large-insert libraries are
now available to the public. These tools have
facilitated the isolation of genes via map-based
cloning, the localisation of quantitative trait loci
(QTLs) and the sequencing and annotation of
large genomic DNA fragments in several plant
species. Complete genome sequences of
plants such as Arabidopsis and rice have become
available through public databases. Further,
resources from whole-genome or gene space sequencing projects for several plant species such as maize
(http://www.maizegenome.org/), sorghum, wheat
(http://www.wheatgenome.org/), tomato
(http://sgn.cornell.edu/help/about/tomato_sequencing.html),
tobacco (http://www.intl-pag.org/13/abstracts/PAG13_P027.html),
poplar (http://genome.jgi-psf.org/Poptr1/),
Medicago (http://www.medicago.org/genome/)
and lotus (http://www.kazusa.or.jp/lotus/)
are now available. Transcriptome sampling,
a widely used approach complementary to genome sequencing, has produced
large collections of expressed sequence tags
(ESTs) for almost all the important plant species
(http://www.ncbi.nlm.nih.gov/dbEST/dbEST_summary.html).
Comparative sequence analysis
can be used in some cases to facilitate isolation
of genes in species lacking ESTs. However,
EST resources have some limitations, such as
unidentified contaminants, chimeric sequences,
Functional Markers
During the past decades, molecular mapping has
identified chromosome regions carrying important genes in crop plants using SSR, RFLP, AFLP,
RAPD, DArT and other markers. However, these
usually neutral genetic markers can be some
Comparative Genomics
The number of sequenced plant genomes and
associated genomic resources is growing rapidly
with the advent of both an increased focus on
plant genomics from funding agencies and the
Bibliography
Literature Cited
Bachem CWB, van der Hoeven RS, de Bruijn SM,
Vreugdenhil D, Zabeau M, Visser RGF (1996)
Visualization of differential gene expression using a
novel method of RNA fingerprinting based on AFLP:
analysis of gene expression during potato tuber development. Plant J 9:745–753
Edman P (1949) A method for the determination of amino
acid sequence in peptides. Arch Biochem 22(3):475
Fire A, Xu S, Montgomery MK, Kostas SA, Driver SE,
Mello CC (1998) Potent and specific genetic interference by double-stranded RNA in Caenorhabditis elegans. Nature 391:806–811
Fischer A, Saedler H, Theissen G (1995) Restriction fragment length polymorphism-coupled domain-directed
differential display: a highly efficient technique for
expression analysis of multigene families. Proc Natl
Acad Sci USA 92:5331–5335
Habu Y, Fukuda-Tanaka S, Hisatomi Y, Iida S (1997)
Amplified restriction fragment length polymorphism-based mRNA fingerprinting using a single restriction
enzyme that recognizes a 4-bp sequence. Biochem
Biophys Res Commun 234:516–521
Ji H, Hodges E et al (2007) Genome-wide in situ exon capture
for selective resequencing. Nat Genet 39:1522–1527
Liu Y, He Z, Appels R, Xia X (2012) Functional markers
in wheat: current status and future prospects. Theor
Appl Genet 125:1–10
Shendure J et al (2005) Accurate multiplex polony sequencing
of an evolved bacterial genome. Science 309:1728–1732
Vos P, Hogers R, Bleeker M, Reijans M, van de Lee T,
Hornes M, Frijters A, Pot J, Peleman J, Kuiper M,
Zabeau M (1995) AFLP: a new technique for DNA fingerprinting. Nucleic Acids Res 23:4407–4414
Velculescu VE, Zhang L, Vogelstein B, Kinzler KW (1995)
Serial analysis of gene expression. Science 270:484–487
Further Readings
Buzdin A, Lukyanov S (eds) (2007) Nucleic acids hybridization. Springer, New York
Rhee S, Dickerson J, Xu D (2007) Bioinformatics and its
applications in plant biology. Annu Rev Plant Biol
57:335–360
Shendure J, Ji H (2008) Next-generation DNA
sequencing. Nat Biotechnol 26(10):1135–1145
Tyagi AK, Khurana JP, Khurana P, Raghuvanshi S, Gaur
A, Kapur A, Gupta V, Kumar D, Ravi V, Vij S, Khurana
P, Sharma S (2004) Structural and functional analysis
of rice genome. J Genet 83:79–99
Varshney RK, Graner A, Sorrells ME (2005) Genomics-assisted breeding for crop improvement. Trends Plant
Sci 10(12):621–630
Yamamoto M et al (2001) Use of serial analysis of gene
expression (SAGE) technology. J Immun Method
250:45–66
Ye SQ et al (2000) MiniSAGE: gene expression profiling
using serial analysis of gene expression from 1 µg
total RNA. Anal Biochem 287:144–152
11
Rice
Rice (Oryza sativa L.) is an intimate part of the
culture, food habits and economy of many societies
and is one of the most important crops for mankind. It is the staple food of more than three
billion people and accounts for 50–80% of
their daily calorie intake. To meet the growing
demand for food and to sustain food security for
people in low-income countries, rice production
has to be raised by another 70% over the next
three decades. Assuming these countries can
maintain their rice-growing area at current levels,
this means raising the rice yield above its current level.
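To put the 70% target in perspective, it can be converted into an average compound annual growth rate. The short Python sketch below does this under the simplifying assumptions that growth is steady and that "three decades" means exactly 30 years; the function name is hypothetical.

```python
def annual_growth_rate(total_increase, years):
    """Constant yearly growth rate that compounds to the given total increase."""
    return (1.0 + total_increase) ** (1.0 / years) - 1.0

rate = annual_growth_rate(0.70, 30)  # 70% more production over 30 years
print(f"{rate:.2%} per year")
```

Under these assumptions the required growth is a little under 2% per year, sustained for the whole period.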
For the irrigated ecosystem, it will be difficult
to raise the rice yield above the current levels of
5–6 t/ha. The potential for increasing yield in the
rainfed ecosystem is vast: the current yield is
only about 2.0 t/ha (compared with attainable
yields of 5.0 t/ha), and nearly 40% of the total rice area is
grown under rainfed conditions. Future
increases in rice production will therefore rely on rainfed
ecosystems. Hence, this section describes the
importance of MAS in the genetic improvement of
rice under water-limited environments. As with
this complex drought-tolerance trait, MAS can
also be applied to genetically improve other
complex characteristics such as pest and disease
resistance, nutrient improvement and other quality
and agronomic traits.
N.M. Boopathi, Genetic Mapping and Marker Assisted Selection: Basics, Practice
and Benefits, DOI 10.1007/978-81-322-0958-4_11, Springer India 2013
Phenology
If a pattern of drought occurrence can be identified,
the plant can escape drought by having the most
sensitive phenological stages coinciding with the
periods of lower risks of drought stress either
through manipulation of the plant duration or
through manipulation of the cropping calendars.
For example, in a terminal stress situation, a
common phenomenon in South Asia, breeding
Root System
The possession of a deep and thick root system
which allows access to water deep in the soil
profile is considered crucially important in determining drought resistance. The trait may be less
important in rainfed lowland rice, where hardpans
may severely restrict root growth. Here, the
ability to penetrate a hard layer is considered
important. This trait may also be useful in upland
rice where high penetration resistance may limit
rooting depth and where soils will harden as they
dry. The penetration of roots through uniform
hard layers is probably achieved through the
possession of large root diameter which resists
buckling, but when the impedance is due to a
coarse textured sandy or stony horizon, thin roots
would penetrate more easily. The investment of
carbon in a deep root system may have a yield
implication because of loss of carbon allocation
to the shoot. The rapid development of deep or
thick root systems may, therefore, be of limited
value if terminal drought occurs early in the crop
cycle, but it is certainly important for intermittent
and later terminal drought situations. It is also
important to note that root growth is influenced
by the environment. Chemical or physical adverse
conditions such as low water potential or high/
low soil temperature directly inhibit root growth.
Biological factors in the rooting environment
such as root-feeding nematodes, termites, mites
and aphids can severely reduce root proliferation
or rooting depth and thereby affect drought resistance. The shoot environment can also indirectly
influence root growth either via carbon supply
or signalling process (e.g. light interception,
Osmotic Adjustment
Osmotic adjustment (OA) is increasingly recognised in several crop plants as an effective component of drought resistance, which has a positive
direct or indirect effect on plant productivity
under drought stress. Generally, when cells are
subjected to slow dehydration, compatible
solutes are accumulated in the cytosol resulting
in the maintenance of cell water content against
the reduction in apoplastic water potential. The
compatible solutes (various sugars, organic
acids, amino acids, sugar alcohols or ions, most
commonly K+) differ with plant species and
genera. The main solutes that are responsible
for OA in rice under water-deficit conditions
have not been elucidated. Rice does not accumulate
glycine betaine because of a deficiency in choline
monooxygenase and betaine aldehyde dehydrogenase, the key enzymes involved in glycine
betaine synthesis. Rice accumulates proline, but
Dehydration Tolerance
Dehydration tolerance (the ability of leaves to
tolerate desiccation-level water stress) helps
plant organs to survive short-term water deficits.
The lowest leaf water potential that leaves reach
just prior to death (lethal leaf water potential) has
been used to determine dehydration tolerance.
During terminal stress, dehydration tolerance
may allow plants to maintain metabolic activity
for a longer time and to translocate more stored
assimilates to the grain. Plants with the ability
to adjust osmotically or tolerate dehydration
may delay leaf rolling, delay stomatal closure
and maintain leaf expansion with little cost, which
should promote resistance particularly in the
Shoot-Related Drought-Resistance Traits
Leaf Rolling
Several mechanisms of drought resistance are
associated with the shoots of rice. Leaf rolling
(drought avoidance) reduces the water loss in
addition to reducing the leaf area exposed to heat
and light radiation. Varieties differ in their ability
to roll leaves under similar water deficit. There is
some evidence that enhanced ability to roll leaves
confers a yield advantage under drought conditions.
However, most breeders consider the triggering
of leaf rolling as an indication of a plant suffering
and select against its early manifestation.
Green Leaf Area
It has been suggested that plants which are able
to retain green leaf area are better able to recover
after drought and give good yield. Leaf drying,
often used in field scoring, is the reverse side
of the stay-green ability and has been shown
to be correlated with leaf relative water content.
However, it has proved difficult to separate the
green leaf retention from the possible underlying
mechanisms of drought resistance since the
process of drought recovery in terms of mechanisms, importance or genetic variation is poorly
understood.
Stomatal Closure and Canopy Temperature
Another mechanism of drought avoidance in the
rice shoot is fast stomatal closure which acts to
reduce water losses. Varietal differences in the
sensitivity of stomatal conductance to leaf water
Epicuticular Wax
It has been repeatedly shown that total crop dry
matter production is linearly and positively
related to crop transpiration. This relationship
is partly derived from the fact that the control of
both transpiration and CO2 exchange is dependent
on stomatal activity. However, loss of water can
also occur through non-stomatal pathways for
which no return in CO2 fixation is expected. Nonstomatal resistance to water loss from leaves
can also be considered a drought-avoidance
mechanism. An important non-stomatal pathway
is the leaf cuticle. Research suggests that rice has
a low cuticular resistance to water loss compared
with other grasses but variation between varieties
exists, and this may have potential in breeding
for improvement in drought resistance. The
fact that traditional upland rice cultivars have
relatively higher epicuticular wax supports the
hypothesis that high epicuticular wax is an important
drought-resistance attribute in rice. The specific
effects of the amount, the composition and the
form of cuticular wax in rice were explored, but
the quantification of these factors with respect
to rice performance under drought stress is still
Other Traits
The value of improving the use of absorbed
light, resistance to photoinhibition and capacity
for non-photochemical quenching to improve
drought resistance of rice has been described.
In addition, a genetic basis for difference in
resistance to photoinhibition in rice has been
demonstrated. These traits are physiologically,
biochemically and genetically complex in themselves and interact with each other. Since abscisic
acid (ABA) has been shown to be involved in
regulating stomatal conductance, OA and root
conductivity, interest has been shown in measuring
ABA contents in order to establish relationships
with drought resistance. Varietal differences
in leaf ABA content and sensitivity to applied
ABA also exist in rice.
In summary, a utilisable secondary trait in
breeding for drought resistance in rice should be
(1) genetically associated with grain yield under
drought, (2) highly heritable, (3) stable and feasible
to measure and (4) not associated with yield loss
under ideal growing conditions. However, despite
the several traits described above,
they are rarely selected for in traditional
rice improvement programs because phenotypic
selection for these traits involves complex,
difficult and labour-intensive protocols; because of the tremendous diversity of environments and water
availability; and because large genotype × environment interactions complicate selection.
Knowledge from physiological studies indicated
that the ability of the root systems in exploiting
deep soil moisture and the capacity for OA
during water stress are considered as major
drought-resistance traits in rice. They can also be
negatively correlated due to tight genetic linkage
of some of the controlling genes as was shown
for OA and root morphology. Therefore, the
impact of one trait in isolation may be difficult to
establish. One promising approach is to map genetic
loci (quantitative trait loci, QTL) influencing
Table 11.1 Details of mapping population, linkage map characteristics and QTL identified for drought-resistant traits in rice from selected publications
Column headings: Parents; Population (a); Linkage map coverage (cM); Traits; No. of QTL; QTL identified across trials/experiments; QTL identified across population; Maximum phenotypic variance (%); References
Parents and population: Co39/Moroberekan, 281 F7 RILs (203); Co39/Moroberekan, 281 F7 RILs (202); Co39/Moroberekan, 281 F7 RILs (52)
Markers: 127 (RFLP); 127 (RFLP)
Traits: Root thickness; Root–shoot ratio; Root dry weight per tiller; Deep root weight; Maximum root depth; Drought avoidance (leaf rolling); Number of penetrating roots; Total number of roots; Root penetration index; Tiller number; Dehydration tolerance; Osmotic adjustment; Relative water content
Values, in extracted order: 4, 19, 6, 10, 5, 1, 2, 8, 4, 18; 18, 16, 14, 14, 36, 32, 35, 19, 13, 18.5, 35, 35; 56, 38, 11
References: Champoux et al. (1995)
Recent Advances in MAS in Major Crops
Table 11.1 (continued)
Parents: IR62266/IR60080; IR64/Azucena; IR1552/Azucena
Population and map entries, in extracted order: 150; BC3F3 (142); 135 DH (90, 84, 56 & 109); 2,457; 1,370
Traits: Days to flowering; Plant height; Grain yield; Harvest index; Days to maturity; Root thickness; Root volume; Root dry weight; Maximum root length; Seminal root length; Relative seminal root length; Adventitious root number; Relative adventitious root number; Lateral root length; Relative lateral root length; Lateral root number; Relative lateral root number; Osmotic adjustment
Values, in extracted order: 7, 1, 4, 1, 2, 1, 2, 2, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1; 12, 19; 24.6, 20.0, 15.7, 19.7, 20.4, 26.9, 29.1, 30.7, 12.9, 25.0, 11.7, 12.3, 14.4, 11.9, 15.0, 18.2, 13.9, 13.4
References: Venuprasad et al. (2002)
DH doubled haploids, RIL recombinant inbred lines, BC backcross progenies, RFLP restriction fragment length polymorphism, RAPD random amplified polymorphic DNA, SSR simple sequence repeats, cDNA complementary DNA, AFLP amplified fragment length polymorphism
(a) Subset of population used for phenotyping is indicated in parentheses
Cotton
Cotton (Gossypium spp.) is a commercial natural fibre crop of global importance that generates high employment at various
stages. Though synthetic/man-made fibres have
made inroads, cotton retains a prime position in Indian agriculture and has been in
cultivation in India for more than 5,000 years.
Globally, India ranks first in cotton area but
occupies second position in production, next
to China. Cotton contributes significantly
to the Indian economy by earning more than 30%
of foreign exchange.
India has the distinction of growing all
four cultivated cotton species, namely, Gossypium
arboreum, G. herbaceum, G. barbadense and
G. hirsutum. Among the four species, the tetraploid
(or allopolyploid) species G. hirsutum L. and G.
barbadense L. account for 90 and 8% of the
world cotton production, respectively. Though
India is a major cultivating and consuming
country, commercial cotton lint produced in India
falls within a narrow fibre quality spectrum, and hence
several thousand bales of cotton lint that suit modern
textile industries are being imported. Thus, it is
imperative to improve the fibre quality of the
cotton cultivars.
Conventional breeding methods have contributed much to the development of high-yielding
cotton cultivars. But, the efficiency of fibre
Table 11.2 Selected examples in QTL mapping for agronomic, yield and fibre quality traits in cotton
Column headings: Population type; Species involved; Traits; Chromosomes; Maximum phenotypic variance observed (%); References
Population type and species involved:
F2: G. hirsutum × G. barbadense
RILs: G. hirsutum × G. hirsutum
F2: G. hirsutum × G. hirsutum
RILs: G. hirsutum × G. hirsutum
BC, F2: G. hirsutum × G. barbadense
BC3F2: G. hirsutum × G. tomentosum
RIL: G. hirsutum × G. hirsutum
Traits: Uniformity ratio; Fibre elongation; Micronaire; Lint percentage; Boll size; Lint percentage; Reniform nematode resistance; Fibre fineness; Fibre strength; Fibre length; Earliness; Micronaire; 2.5% span length; Elongation percentage; Bundle strength; Fibre length; Fibre thickness; Fibre elongation
Chromosomes: Chr.14; Chr.7, Chr.13, Chr.18, Chr.24, Chr.25; Chr.4, Chr.7, Chr.14, Chr.18, Chr.23, Chr.25; Chr.3, Chr.4, Chr.5, Chr.7, Chr.14, Chr.16, Chr.19, Chr.25; Chr.4, Chr.7, Chr.13, Chr.14, Chr.25; Chr.4, Chr.7, Chr.13, Chr.14, Chr.15, Chr.18, Chr.25
Values, in extracted order: 13.4, 11.5, 19.1, 11.9, 27.8, 20.6, 87.1, 35, 19, 15; 13.3, 9.7, 12.0, 14.7, 12.6, 14.0, 12.3, 8.1, 13.3, 38.6, 9.7, 13.7
References: Wu et al. (2009); Jiang et al. (1998)
Complexities in Integration of Functional Genomics with QTL
Fibre gene function is highly conserved in the
genomes of wild and cultivated species, as well
as diploid and tetraploid species, despite millions
of years of evolutionary history. The phenotypic
variation in fibre properties therefore is more likely
one of quantitative differences in gene expression
as opposed to differences in the genotype at the
DNA level. Hence, further studies are required to
understand the number of copies of the genes,
their regulation and specific function in fibre
development. Though systematic transcriptomic
approaches can be combined with QTL analyses
(discussed below), these studies do not address
the occurrence of alternative splicing or the
posttranslational modifications of the proteins.
In addition, proteins can move in and out of other
macromolecular complexes, thus modifying
their functionality. This level of complexity cannot
be tackled using transcriptomics alone, and
hence it is vital to include proteomics in MAS.
On the other hand, biochemical functions of only
a small proportion of the identified proteins have
been demonstrated and/or determined based on
the assumptions that proteins sharing conserved
domains have the same activity. Hence, the leftover
Map-Based Cloning
As QTL mapping results accumulate over the
next years, attention will turn to cloning QTL and
then to using them. This requires higher resolution of QTL mapping, combined with a dense
marker map. A centimorgan (cM), corresponding
to a recombination frequency of 1%, can span 10–1,000 kbp
and can vary across species or even within a
chromosome of a given species. This region
may contain both desirable and undesirable
genes, and hence to avoid the linkage drag of
undesirable traits, it is important to establish
the causal relationship between the QTL and
phenotype using positional or map-based cloning. The physical size of a cM in cotton is not
prohibitive to map-based cloning, but the lengthy
genetic map will require a large number of markers
in order to be sufficiently close to most genes for
chromosome walking. A newer high-throughput
marker type, the SNP, is gaining importance in this
context, but the huge initial investment needed to generate SNPs necessitates simple, innovative and economical
marker techniques. It is also important to note
that instead of using anonymous DNA markers,
development and use of gene-specific functional
markers such as SRAP, TRAP and PAAP (see
chapter 3) may increase the efficiency of map-based cloning.
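The relationship between recombination frequency, genetic distance and physical distance can be illustrated with a short sketch. The Haldane and Kosambi mapping functions used below are standard textbook formulas rather than anything specific to the studies cited in this chapter, and the kbp-per-cM ratio in the usage lines is a hypothetical value chosen only for illustration, since the true ratio varies between species and chromosome regions.

```python
import math


def haldane_cm(r: float) -> float:
    """Map distance in cM from recombination fraction r (Haldane, no interference)."""
    if not 0.0 <= r < 0.5:
        raise ValueError("recombination fraction must be in [0, 0.5)")
    return -50.0 * math.log(1.0 - 2.0 * r)


def kosambi_cm(r: float) -> float:
    """Map distance in cM from recombination fraction r (Kosambi, with interference)."""
    if not 0.0 <= r < 0.5:
        raise ValueError("recombination fraction must be in [0, 0.5)")
    return 25.0 * math.log((1.0 + 2.0 * r) / (1.0 - 2.0 * r))


def physical_span_kbp(distance_cm: float, kbp_per_cm: float) -> float:
    """Approximate physical span, given an assumed kbp-per-cM ratio."""
    return distance_cm * kbp_per_cm


# At 1% recombination both functions give very nearly 1 cM.
d_haldane = haldane_cm(0.01)
d_kosambi = kosambi_cm(0.01)
# Hypothetical ratio of 400 kbp per cM, for illustration only.
span = physical_span_kbp(d_kosambi, 400.0)
print(f"{d_haldane:.3f} cM (Haldane), {d_kosambi:.3f} cM (Kosambi), about {span:.0f} kbp")
```

For tightly linked markers the two functions agree closely; they diverge as the recombination fraction grows, which is why the choice of mapping function matters mainly for sparse maps.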
Further, map-based cloning in polyploids such
as cotton introduces a new technical challenge
not encountered in diploid (or highly diploidised)
organisms, for example, that virtually all single-copy DNA probes occur at two or more unlinked
Improved Databases
There is a great need to expand bioinformatic infrastructure for managing, curating and annotating
Mungbean
Pulses are important protein resources that help
meet the nutritional requirements of poor people
living in developing countries. Among them,
mungbean (Vigna radiata (L.) Wilczek) is one of
the most widely cultivated species throughout
the southern half of Asia, and it is especially
widely grown in rainfed areas. It has
a short growth duration and low water requirement and is adapted to soils that are deficient in several nutrients or poor
in fertility. It is popularly grown as a component
in various cropping systems because of its ability
to fix nitrogen in association with soil bacteria,
Tomato
Tomato (Lycopersicon esculentum Mill.) is considered to be one of the most economically
important crops in the world.
Tomatoes are juicy berry fruits of the nightshade
family (Solanaceae) and originally came from
Central and South America. They are nutritious
vegetables that provide good quantities of vitamins A and C as well as essential minerals and
other nutrients. Furthermore, fresh and processed
tomatoes are the richest sources of the dietary
antioxidant lycopene, which arguably protects
cells from oxidants that have been linked to cancer. Tomato is also a source of other compounds
with antioxidant activities, including chlorogenic
acid, plastoquinones, rutin, tocopherol and
xanthophylls.
Economically speaking, tomatoes are of great
value because of their high yields.
Tomatoes are also one of the main
ingredients in hundreds of dishes and products
that are sold in supermarkets throughout the
developing and developed world, so the demand
for tomatoes is extremely high. Tomato production
is ranked first in India, where many small
business owners and farmers are predominantly
engaged in producing tomatoes. They favour
tomatoes because of their high market value,
which makes up a very
large part of their income.
Tomatoes are also a popular choice among people
who wish to grow fruits and vegetables in their
own gardens. Not only can they be used raw in
salads, but they are also an essential part of
many recipes as well as many products such as
tomato ketchup and chutney. They can also be
grown both indoors in greenhouses and outdoors, although tomatoes that are grown outside
tend to have higher nutrient contents than those
grown in greenhouses. Tomatoes have many
advantages over growing other types of vegeta-
Hot Pepper
Hot pepper (Capsicum annuum) is an important
horticultural crop, not only because of its economic importance but also because of the nutritional
and medicinal value of its fruit, which is an
excellent source of natural colours and antioxidants. A wide spectrum of antioxidant vitamins,
carotenoids, capsaicinoids and phenolic compounds is present in hot pepper fruits. The intake
of these compounds in food is an important
health-protecting factor that helps prevent widespread
human diseases. Acreage under hot peppers is
increasing because of a shift in production from
other crop-based farming to nontraditional crop
production, which in turn is due to a decline in
income from regular cropping programs. During
the last decade, the area under protected cultivation (poly/plastic tunnels) of vegetables like
hot pepper, tomato and cucumber has been increasing
steadily. Hot pepper is one of the potential crops
to be grown in poly/plastic tunnels.
20.0–48.2% of phenotypic variation. The isolate-specific QTLs explained 6.0–17.4% of phenotypic variation. The result confirms a
gene-for-gene relationship between C. annuum
and P. capsici for root rot resistance (Truong
et al. 2012). QTLs for phytophthora root rot
resistance were previously identified on chromosome 11 in other studies. Thus, the results
indicate that at least a few specific gene functions are important components of root rot resistance to different P. capsici races/isolates in the
YCM334 × Tean population. Identification of
isolate-specific resistance QTLs in P. capsici–C.
annuum interactions will help breeders in selecting appropriate resistant lines for future hybridisation. Breeders may need to breed for resistance
against specific isolates from different regions
and then pyramid a number of specific genes
into a cultivar to confer resistance. One approach
for further studies could be to develop near-isogenic lines carrying different combinations of
QTLs and to challenge these lines with
different pathogen isolates.
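The practical cost of pyramiding rises quickly with the number of genes. The sketch below is a back-of-the-envelope illustration, assuming an F2 population, unlinked loci with Mendelian segregation and markers that tag each resistance gene perfectly; none of these assumptions comes from the study cited above, and the function names are hypothetical.

```python
import math


def all_loci_homozygous_fraction(n_loci: int) -> float:
    """Expected fraction of F2 plants homozygous for the favourable allele
    at every one of n unlinked loci (1/4 per locus)."""
    if n_loci < 1:
        raise ValueError("n_loci must be at least 1")
    return 0.25 ** n_loci


def min_population_size(n_loci: int, confidence: float = 0.95) -> int:
    """Smallest F2 population expected to contain at least one such plant
    with the given probability."""
    if not 0.0 < confidence < 1.0:
        raise ValueError("confidence must be between 0 and 1")
    p = all_loci_homozygous_fraction(n_loci)
    return math.ceil(math.log(1.0 - confidence) / math.log(1.0 - p))


for n in (1, 2, 3, 4):
    print(n, "loci:", min_population_size(n), "F2 plants")
```

Under these assumptions, the required population grows roughly fourfold with each added gene, which is one reason why pyramiding is usually carried out over several generations with marker screening at each step.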
Pungency in peppers is due to capsaicinoids, the molecules
that cause a pungent, burning sensation when hot
peppers are consumed and that are produced exclusively
in the genus Capsicum. This organoleptic quality
is due to the activation of the TRPV1 (VR1)
receptor. The primary capsaicinoids are capsaicin,
dihydrocapsaicin and nordihydrocapsaicin.
The presence of capsaicinoids makes pungent
peppers valuable as a spice. In contrast, the
absence of capsaicinoids is important when non-pungent peppers are grown as a vegetable crop.
The major gene Pun1 is required for the production
of capsaicinoids. Three distinct mutant alleles
of Pun1 have been found in three cultivated
Capsicum species, one of which has been widely
utilised by breeders. A robust collection of
molecular markers that can differentiate four
Pun1 alleles was identified.
Those markers were tested on a diverse panel of
pepper lines and in an F2 population segregating
for pungency (Wyatt et al. 2012). These markers
will be useful for pepper breeding, germplasm
characterisation and seed purity testing. Those
Bibliography
Literature Cited
Ali ML, Pathan MS, Zhang J, Bai G, Sarkarung S, Nguyen
HT (2000) Mapping QTLs for root traits in a recombinant inbred population from two indica ecotypes in
rice. Theor Appl Genet 101:756–766
Boopathi NM, Senthil A, Chandrikala R, Singh A,
Shanmugasundaram P, Sadasivam S, Babu RC (2002)
Mapping quantitative trait loci and marker assisted
selection for the improvement of drought tolerance in
rice. Madras Agric J 89(10–12):553–562
Champoux MC, Wang G, Sarkarung S, Mackill DJ,
O'Toole JC, Huang N, McCouch SR (1995) Locating
genes associated with root morphology and drought
avoidance in rice via linkage to molecular markers.
Theor Appl Genet 90:961–981
Chen H, Qian N, Guo W, Song Q, Li B, Deng F, Dong C,
Zhang T (2010) Using three selected overlapping RILs
to fine-map the yield component QTL on Chro.D8 in
Upland cotton. Euphytica 176:321–329
Foolad MR, Panthee DR (2012) Marker-assisted selection
in tomato breeding. Crit Rev Plant Sci 31(2):93–123
Foolad MR, Merk HL, Ashrafi H (2008) Genetics, genomics
and breeding of late blight and early blight resistance
in tomato. Crit Rev Plant Sci 27:75–107
Gomez S, Boopathi NM, Kumar SS, Ramasubramanian T,
Chengsong Z, Jeyaprakash P, Senthil A, Babu RC
(2010) Molecular mapping and location of QTL for
drought resistance traits in indica rice (Oryza sativa
L.) lines adapted to target environments. Acta Physiol
Plant 32(2):355–364
Gutierrez OA, Robinson AF, Jenkins JN, McCarty JC,
Wubben MJ, Callahan FE, Nichols RL (2011)
Identification of QTL regions and SSR markers associated with resistance to reniform nematode in Gossypium
barbadense L. accession GB713. Theor Appl Genet
122:271–280
Humphry ME, Konduri V, Lambrides CJ, Magner T,
McIntyre CL, Aitken EAB, Liu CJ (2002) Development
of a mungbean (Vigna radiata) RFLP linkage map and
its comparison with lablab (Lablab purpureus) reveals
a high level of synteny between the two genomes.
Theor Appl Genet 105:160–166
Isemura T, Kaga A, Tabata S, Somta P, Srinives P et al
(2012) Construction of a genetic linkage map and
genetic analysis of domestication related traits in
mungbean (Vigna radiata). PLoS One 7(8):e41304.
doi:10.1371/journal.pone.0041304
Jenkins JN, Wu J, Guo Y, McCarty JC (2010) Use of fiber
and fuzz mutants to detect QTL for yield components,
seed, and fiber traits of upland cotton. Euphytica
172:21–34
Jiang CX, Wright RJ, El-Zik KM, Paterson AH (1998)
Polyploid formation created unique avenues for
response to selection in Gossypium (cotton). Proc Natl
Acad Sci USA 95(8):4419–4424
Kamoshita A, Babu RC, Boopathi NM, Fukai S (2008)
Phenotypic and genotypic analysis of drought
resistance traits for development of rice cultivars
adapted to rainfed environments. Field Crops Res
109(1–3):1–23
Lambrides CJ, Lawn RJ, Godwin ID, Manners J, Imrie
BC (2000) Two genetic linkage maps of mungbean
using RFLP and RAPD markers. Aust J Agric Res
51:415–425
Lilley JM, Ludlow MM, McCouch SR, O'Toole JC (1996)
Locating QTL for osmotic adjustment and dehydration
tolerance in rice. J Exp Bot 47:1427–1436
McCouch SR, Kochert G, Yu ZH, Wang ZY, Khush GS,
Coffman WR, Tanksley SD (1988) Molecular mapping
of rice chromosomes. Theor Appl Genet 76:815–829
Menancio-Hautea D, Kumar L, Danesh D, Young ND
(1993) A genome map for mungbean [Vigna radiata
(L.) Wilczek] based on DNA genetic markers (2n = 2x
= 22). In: O'Brien JS (ed) Genetic maps 1992. A compilation of linkage and restriction maps of genetically
studied organisms. Cold Spring Harbor Laboratory
Press, Cold Spring Harbor, pp 6.259–6.261
Panthee DR, Foolad MR (2012) A reexamination of
molecular markers for use in marker-assisted breeding
in tomato. Euphytica 184:165–179
Ray JD, Yu LX, McCouch SR, Champoux MC, Wang G,
Nguyen HT (1996) Mapping quantitative trait loci
associated with root penetration ability in rice (Oryza
sativa L.). Theor Appl Genet 92:627–636
Reinisch AJ, Dong J, Brubaker CL, Stelly DM, Wendel
JF, Paterson AH (1994) A detailed RFLP map of cotton, Gossypium hirsutum × Gossypium barbadense:
chromosome organization and evolution in a disomic
polyploid genome. Genetics 138:829–847
Robin S, Pathan MS, Courtois B, Lafitte R, Carandang S,
Lanceras S, Amante M, Nguyen HT, Li Z (2003)
Mapping osmotic adjustment in an advanced backcross
inbred population of rice. Theor Appl Genet
107:1288–1296
Shen L, Courtois B, McNally KL, Robin S, Li Z (2001)
Evaluation of near-isogenic lines of rice introgressed
with QTLs for root depth through marker-aided selection. Theor Appl Genet 103:75–83
Sun FD, Zhang JH, Wang SF, Gong WK, Shi YZ, Liu AY,
Li JW, Gong JW, Shang HH, Yuan YL (2012) QTL
mapping for fiber quality traits across multiple generations and environments in upland cotton. Mol Breed
30:569–582
Tanksley SD, Ganal MW, Prince JP, Devicente MC,
Bonierbale MW, Broun P, Fulton TM, Giovannoni JJ,
Grandillo S, Martin GB et al (1992) High-density
molecular linkage maps of the tomato and potato
genomes. Genetics 132:1141–1160
Truong HTH et al (2012) Identification of isolate-specific
resistance QTLs to phytophthora root rot using an
intraspecific recombinant inbred line population of
pepper (Capsicum annuum). Plant Pathol 61(1):48–56
Venuprasad R, Shashidhar HE, Hittalmani S, Hemamalini
GS (2002) Tagging quantitative trait loci associated
with grain yield and root morphological traits in rice
under contrasting moisture regimes. Euphytica
128:293–300
Wu J, Gutierrez OA, Jenkins JN, McCarty JC, Zhu J
(2009) Quantitative analysis and QTL mapping for
agronomic and fibre traits in an RI population of upland
cotton. Euphytica 165:231–245
Wyatt LE et al (2012) Development and application of a
suite of non-pungency markers for the Pun1 gene in
pepper (Capsicum spp.). Mol Breed. doi:10.1007/s11032-012-9716-9
Zhang Z, Rong J, Waghmare VN, Chee PW, May OL,
Wright RJ, Gannaway JR, Paterson AH (2011) QTL
alleles for improved fiber quality from a wild
Hawaiian cotton, Gossypium tomentosum. Theor Appl
Genet 123:1075–1088
Zheng BS, Yang L, Zhang WP, Mao CZ, Wu YR, Yi KK,
Liu FY, Wu P (2003) Mapping QTLs and candidate
genes for rice root traits under different water-supply
conditions and comparative analysis across three populations. Theor Appl Genet 107:1505–1515
Further Reading
Boopathi NM, Thiyagu K, Urbi B, Santhoshkumar M,
Gopikrishnan A, Aravind S, Swapnashri G, Ravikesavan
R (2011) Marker-assisted breeding as next-generation
strategy for genetic improvement of productivity and
quality: can it be realized in cotton? Int J Plant Genom
2011. doi:10.1155/2011/670104
12
N.M. Boopathi, Genetic Mapping and Marker Assisted Selection: Basics, Practice
and Benefits, DOI 10.1007/978-81-322-0958-4_12, Springer India 2013
One of the yet unrealised promises of molecular markers is their utility for improvement of
complex quantitative traits, which are often
controlled by more than one gene and exhibit low
heritability and often strong G × E interactions.
The failure in using molecular markers for complex traits is due to various reasons, including
QTLs being unreliable or population or environment specific, QTLs not strong enough in terms
of linkage to warrant their use for marker-assisted
breeding, lack of marker validation or marker
polymorphism in breeding populations and problems associated with linkage drag. However, it
should be possible to use markers for improving
complex traits assuming that additional necessary efforts are made to develop reliable markers,
including minimising the environmental effects
and maximising the relationship between genotype and phenotype (e.g. by repeating experiments in multiple environments), breaking up
complex traits into their individual components
and identifying QTL-linked markers for such
components, and identifying QTLs using actual
breeding populations. Obviously, these are not
easy challenges, but they are doable.
Thus, future progress in MAS will greatly
depend on improved genetics. However, the agronomical context, as well as socio-economic factors and policy, must be taken into account; they
influence to a large extent whether farmers adopt
improved varieties and whether they can minimise the gap between yield potential and on-farm
yield. This integration of quantitative knowledge
arising from diverse but complementary disciplines will allow researchers to more fully understand genes associated with complex traits in
crop plants and more precisely forecast the penalty of modulating expression levels of those
genes.
Large-scale genome sequencing and associated bioinformatics are becoming widely accepted
research tools for accelerating the analysis of
plant genome structure and function. Second-generation DNA sequences from crop plants can
provide an opportunity to use genomic information to clone genes and develop SNP markers in
plants. Rapid progress is now being achieved in
assembling the DNA sequences from individual
Evaluation of the extent of linkage disequilibrium in exotic and domesticated germplasm is yet
another requirement. Phenotypic evaluation of
multiple populations per species should be conducted so that the locations of quantitative trait
loci for important agronomic traits can be
identified by genetic and association mapping.
The accumulation of mapping information will
facilitate the exploration of syntenic regions
across orphan crops. These genetic tools will also
help in construction of physical maps of chromosomes in orphan crops. Construction of physical
maps will allow better understanding of such a
complex genome and facilitate cloning and
manipulation of traits with economic interests.
This will also help to better understand the secondary metabolism involved in interactions
between neglected crops and pathogens, symbiotic organisms, predators and pollinators and will
lead to varieties with enhanced yield potential,
nutritional benefits, resistance to pests and diseases and tolerance of adverse environmental
conditions.
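The extent of linkage disequilibrium between two biallelic markers is commonly summarised by the coefficient D and the squared correlation r². The sketch below uses the standard population-genetic definitions; the function name and the frequencies in the usage lines are hypothetical and serve only as an illustration.

```python
def linkage_disequilibrium(p_ab: float, p_a: float, p_b: float) -> tuple[float, float]:
    """Return (D, r2) for two biallelic loci.

    p_ab: frequency of the haplotype carrying allele A and allele B
    p_a:  frequency of allele A at the first locus
    p_b:  frequency of allele B at the second locus
    """
    d = p_ab - p_a * p_b
    denom = p_a * (1.0 - p_a) * p_b * (1.0 - p_b)
    if denom == 0.0:
        raise ValueError("both loci must be polymorphic")
    return d, (d * d) / denom


# Hypothetical frequencies: complete association versus independence
print(linkage_disequilibrium(0.5, 0.5, 0.5))   # A always travels with B
print(linkage_disequilibrium(0.25, 0.5, 0.5))  # alleles combine at random
```

Values of r² close to 1 indicate that one marker predicts the other almost perfectly, so fewer markers are needed for association mapping; values near 0 indicate that a much denser marker set is required.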
Using molecular marker technology, it is now
feasible to analyse quantitative traits such as
salt tolerance and identify the chromosomal
regions (QTLs) associated with such characters.
Identifying such regions will significantly help to
increase the selection efficiency in the breeding
programmes. Molecular marker-assisted selection is considered to be faster, more efficient and
probably more cost effective than conventional
screening particularly for abiotic stresses where
expression of the trait is subject to significant
environmental effects. It will also help narrow
down the possible candidate genes and ultimately
will lead to map-based cloning of the major genes
controlling the trait of interest and opening a new
avenue for genetic manipulations using the real
candidate genes, since it has been shown that several such underutilised crops are adapted well to
the unfavourable environmental conditions. With
the recent advances in DNA sequencing and single nucleotide polymorphism (SNP) genotyping,
new approaches to QTL mapping and quantitative trait nucleotide (QTN) identification are now
available, and this could be applied to orphan
crops for identification of phenotype-related SNPs.
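QTL studies typically report the percentage of phenotypic variance explained by a marker. A minimal sketch of how this figure can be obtained from a single-marker analysis (a one-way analysis of variance across marker genotype classes) is given below. The data in the usage lines are invented, the function name is hypothetical, and real analyses would normally rely on dedicated QTL software with interval mapping and significance testing.

```python
from statistics import mean


def variance_explained(genotypes: list[str], phenotypes: list[float]) -> float:
    """Proportion of phenotypic variance explained by marker genotype class
    (between-class sum of squares divided by total sum of squares)."""
    if len(genotypes) != len(phenotypes) or not phenotypes:
        raise ValueError("genotypes and phenotypes must be non-empty and of equal length")
    classes: dict[str, list[float]] = {}
    for g, y in zip(genotypes, phenotypes):
        classes.setdefault(g, []).append(y)
    grand_mean = mean(phenotypes)
    ss_total = sum((y - grand_mean) ** 2 for y in phenotypes)
    if ss_total == 0.0:
        raise ValueError("phenotype shows no variation")
    ss_between = sum(len(v) * (mean(v) - grand_mean) ** 2 for v in classes.values())
    return ss_between / ss_total


# Invented example: four inbred lines scored for one marker (parental alleles A and B)
marker = ["A", "A", "B", "B"]
trait = [1.0, 3.0, 5.0, 7.0]
print(f"{variance_explained(marker, trait):.0%} of phenotypic variance explained")
```

Because the estimate depends on the population and environment in which it was measured, the same marker can explain very different proportions of variance in different trials.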
typic and genotypic information through common or mutually compatible crop information
systems.
However, amidst the challenges there are also
actual and potential opportunities. Several of the
constraints listed above, in particular access to
marker technologies and limited data management systems, can be overcome through the establishment of crosscutting technology and service
platforms, and several international initiatives are
supporting the development of such platforms in
close collaboration with partners from developing
countries. To partially offset the undesirable trend
of losing the champions, novel international initiatives such as the Alliance for a Green Revolution
in Africa (AGRA) support high-quality education
in the South, and although there is still a long way
to go, governmental and institutional commitment
is increasing for the adoption of biotechnologies
in developing countries (Delannay et al. 2012).
Bibliography
Literature Cited
Delannay X, McLaren G, Ribaut JM (2012) Fostering
molecular breeding in developing countries. Mol
Breed 29:857–873
Further Readings
Ali HQ et al (2012) An overview of genomics assisted
improvement of drought tolerance in maize (Zea mays
L.): QTL approaches. Afr J Biotechnol 11(65):
12839–12848
Fauquet CM, Taylor NJ, Tohme J (2012) The global cassava partnership for the 21st century (GCP21). Trop
Plant Biol 5:4–8
Foolad MR, Panthee DR (2012) Marker-assisted selection
in tomato breeding. Crit Rev Plant Sci 31(2):93–123
Fridman E, Zamir D (2012) Next-generation education in
crop genetics. Curr Opin Plant Biol 15:218–223
Isemura T, Kaga A, Tabata S, Somta P, Srinives P et al
(2012) Construction of a genetic linkage map and
genetic analysis of domestication related traits in
Mungbean (Vigna radiata). PLoS One 7(8):e41304.
doi:10.1371/journal.pone.0041304
Khan M (2012) Current status of genomic based approaches
to enhance drought tolerance in rice (Oryza sativa L.):
an over view. Mol Plant Breed 3(1):1–10.
doi:10.5376/mpb.2012.03.00
Liu Y, He Z, Appels R, Xia X (2012) Functional markers
in wheat: current status and future prospects. Theor
Appl Genet 125:1–10
Nakaya A, Isobe SN (2012) Will genomic selection
be a practical method for plant breeding? Ann Bot
110(6):1303–1316. doi:10.1093/aob/mcs109
Panthee DR, Foolad MR (2012) A re-examination of
molecular markers for use in marker-assisted breeding
in tomato. Euphytica 184:165–179
Sharma HC et al (2002) Applications of biotechnology for
crop improvement: prospects and constraints. Plant
Sci 163:381–395
Varshney RK, Graner A, Sorrells ME (2005) Genomics-assisted breeding for crop improvement. Trends Plant
Sci 10(12):621–630
Xu Y et al (2012a) Whole-genome strategies for
marker-assisted plant breeding. Mol Breed
29:833–854
Xu Y, Li Z-K, Thomson MJ (2012b) Molecular breeding
in plants: moving into the mainstream. Mol Breed
29:831–832
N.M. Boopathi, Genetic Mapping and Marker Assisted Selection: Basics, Practice
and Benefits, DOI 10.1007/978-81-322-0958-4, Springer India 2013