Journal of Experimental Botany, Vol. 63, No. 11, pp. 4045–4060, 2012
doi:10.1093/jxb/ers105 Advance Access publication 17 April, 2012

REVIEW PAPER

Association mapping in forest trees and fruit crops
M. Awais Khan* and Schuyler S. Korban*
Department of Natural Resources & Environmental Sciences, University of Illinois, Urbana, IL 61801 USA
* To whom correspondence should be addressed. E-mail: awais@illinois.edu; korban@illinois.edu

Abstract
Association mapping (AM), also known as linkage disequilibrium (LD) mapping, is a viable approach to overcome limitations of pedigree-based quantitative trait loci (QTL) mapping. In AM, genotypic and phenotypic correlations are investigated in unrelated individuals. Unlike QTL mapping, AM takes advantage of both LD and historical recombination present within the gene pool of an organism, thus utilizing a broader reference population. In plants, AM has been used in model species with available genomic resources. Pursuing AM in tree species requires both genotyping and phenotyping of large populations with unique architectures. Recently, genome sequences and genomic resources for forest and fruit crops have become available. Due to abundance of single nucleotide polymorphisms (SNPs) within a genome, along with availability of high-throughput resequencing methods, SNPs can be effectively used for genotyping trees. In addition to DNA polymorphisms, copy number variations (CNVs) in the form of deletions, duplications, and insertions also play major roles in control of expression of phenotypic traits. Thus, CNVs could provide yet another valuable resource, beyond those of microsatellite and SNP variations, for pursuing genomic studies. As genome-wide SNP data are generated from high-throughput sequencing efforts, these could be readily reanalysed to identify CNVs, and subsequently used for AM studies. However, forest and fruit crops possess unique architectural and biological features that ought to be taken into consideration when collecting genotyping and phenotyping data, as these will also dictate which AM strategies should be pursued. These unique features as well as their impact on undertaking AM studies are outlined and discussed.
Key words: Association mapping, linkage disequilibrium, perennial plants, quantitative trait loci.

© The Author [2012]. Published by Oxford University Press [on behalf of the Society for Experimental Biology]. All rights reserved.
For Permissions, please e-mail: journals.permissions@oup.com

Introduction
Increasing the efficiency of selection by maximizing the use of desired genetic variation is one of the critical objectives of any breeding programme. Most agronomically important traits are complex, are controlled by multiple genes, and are quantitative in nature. The principle of mapping a quantitative trait locus (QTL) was first described in the early 20th century by Sax (1923) wherein seed size in bean, a quantitative trait, was associated with seed coat colour, a morphological marker. However, QTL identification did not receive serious attention until the introduction of polymerase chain reaction (PCR)-based molecular markers in the late 1980s. Molecular markers offer great opportunities for dissecting complex traits using QTL mapping. Systematic identification of QTLs in plants was first described by Steven Tanksley's group while constructing a restriction fragment length polymorphism-based genetic map for tomato fruit quality traits using an interspecific backcross (Paterson et al., 1988). Since then, QTL mapping has been widely used in plants for genetic dissection of biomass, yield, and disease resistance traits. Although there are many published reports on QTLs, only a few QTLs have been used in breeding programmes (Bernardo, 2008), and these are in fact major genes rather than QTLs. Usually, QTL intervals are quite long, ~5–10 cM (wherein 1 cM is equal to 3000 kb of DNA), and contain many genes. Therefore, transferring multiple minor QTLs across genotypes via marker-assisted breeding (MAB) can lead to genetic drag due to presence of undesirable traits in these regions. Often, prior to their utility in positional cloning or MAS in crop improvement efforts, individual genes in QTL regions must be identified via fine-mapping, a time-consuming, laborious, and costly effort. This is due to the necessity of making large numbers of crosses to elicit sufficient numbers of meiotic events. Additionally, QTL identification is based on bi-parental
crosses, and quite often QTLs are specific to the bi-parental population used in identifying these QTLs. Thus far, most QTLs identified in either plant or mammalian populations are not useful in a wide range of genetic backgrounds (Sorkheh et al., 2008).
Association mapping (AM), also known as linkage disequilibrium (LD) mapping, has been proposed as an alternative approach to overcome limitations of pedigree-based QTL mapping. In AM, genotype and phenotype correlations are investigated in unrelated individuals. Unlike QTL mapping, AM takes advantage of LD as well as historical recombinations present within the gene pool of an organism, thus utilizing a broader reference population (Breseghello and Sorrels, 2006a; Ersoz and Buckler, 2007; Myles et al., 2009). If two alleles from separate loci occur together more often than otherwise predicted, on the basis of their individual frequencies, i.e. non-random association of alleles at separate loci, they are deemed to be in LD. Only those molecular markers that are tightly linked to the trait and located within the extent of LD decay will demonstrate significant marker-trait association. If markers are not tightly linked to a trait, they will be separated by recombination during meiosis throughout the evolutionary history of the crop. Accumulating meiotic events in a population will increase the statistical power and mapping resolution for detecting associations. However, it should be noted that the rate of LD decay should be sufficient to statistically identify associations, but not too high as it will make it difficult to narrow down the target genomic region.
AM requires availability of large numbers of polymorphic markers and is more complex than QTL mapping, as historical factors such as population admixture, selection, and genetic drift can bias the detected association. Moreover, the population genetic structure as well as effects due to non-random mating (relatedness) must be accounted for in the analysis to avoid false positive (spurious) associations (Fig. 1; Pritchard et al., 2000; Bradbury et al., 2007; Zhu et al., 2008; Zhang et al., 2010). Population structure influences both the power and precision of detecting associations. However, it can be overcome with good sampling and by using appropriate algorithms to detect groupings in a population and accounting for these in an association mapping analysis (Zhu et al., 2008). Early on, LD mapping had been used in human studies to understand the genetic control of disease. Nowadays, it has rapidly gained interest among plant scientists for studying biomass traits, yield, and disease resistance, among others. Reviews on the concept, methodology, prospects, and status of LD and AM studies in plants have already been published (Neale and Savolainen, 2004; Gupta et al., 2005; Breseghello and Sorrels, 2006a; Ersoz and Buckler, 2007; Abdurakhmonov and Abdukarimov, 2008; Sorkheh et al., 2008; Zhu et al., 2008; Myles et al., 2009; Neale and Kremer, 2011).
AM in plants was initially undertaken in model species with available genomic resources, such as those of Arabidopsis and maize. The first candidate gene-based AM in plants was conducted to study flowering time in maize inbred lines, wherein 92 maize recombinant inbred lines (RILs) and the candidate gene Dwarf8, influencing quantitative variation of flowering time and plant height, were used (Thornsberry et al., 2001). Multiple polymorphisms associated with flowering time were identified, including a deletion in a key domain of the coding region of Dwarf8. As for the genetic control of flowering in maize, extensive studies have been conducted in Arabidopsis using AM. The first study in Arabidopsis to elucidate the genetic control of flowering using LD mapping identified two haplotype groups in the genomic region of the photoperiod receptor CRYPTOCHROME2 associated with flowering time variation (Olsen et al., 2004). Later, Aranzana et al. (2005) demonstrated that identification of known major genes involved in flowering (FRI) and pathogen resistance (Rpm1, Rps2, and Rps5) could have been accomplished using genome-wide AM, even in small and structured samples. Recently, Su et al. (2011) pursued association analysis of drought tolerance in maize with genes coding for key enzymes in the abscisic acid biosynthesis pathway, nced and rab28.
To further facilitate identification of loci controlling quantitative traits with increased power and mapping resolution through joint linkage-association analysis in maize and Arabidopsis, a nested association mapping (NAM) population and multiparent advanced generation inter-cross (MAGIC) lines, respectively, were recently constructed (Kover et al., 2009; McMullen et al., 2009). NAM was developed by crossing 25 diverse maize inbred lines to line B73, a reference inbred line previously used for constructing a physical map and for genome sequencing. These 25 diverse inbred lines maximized the genetic diversity of NAM as resulting RIL families represented diverse origins and types of maize. In contrast, MAGIC lines corresponded to a set of 527 RILs generated by intercrossing a heterogeneous stock of 19 accessions of Arabidopsis thaliana. These represented genetically diverse and highly recombinant inbred lines of A. thaliana that were suitable for high-resolution mapping of segregating traits. Buckler et al. (2009) studied variation in flowering time using the NAM population, a set of 5000 maize RILs. Although they did not identify any major gene effects, they noted additive QTLs of numerous small genetic effects that were shared among families, but with few genetic or environmental interactions. In contrast to rice and Arabidopsis, wherein flowering time was controlled by a few genes with large effects, epistasis, and environmental interactions, it was proposed that flowering time in maize was controlled by a simple additive model (Buckler et al., 2009). Using a candidate gene-based AM approach and utilizing 275 accessions of A. thaliana, Ehrenreich et al. (2009) identified 210 genes that were significantly associated with flowering time, with CO arguably possessing the most promising associations. Later, MAGIC lines were used to test a subset of these genes to identify associations linked to flowering time. It was confirmed that approximately one-third of significant polymorphisms that had been associated with

Fig. 1. A schematic representation of a protocol to conduct an association mapping study. This protocol can be used for candidate
gene-, genome-wide-, and copy number variation-based association mapping, with some modifications. The overall process of
association mapping involves genotyping, phenotyping, identification, and interpretation of results.
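
Two quantities recur in a study of the kind outlined in Fig. 1: the LD between pairs of markers, which governs how densely markers must be placed, and the strength of a marker-trait association. The sketch below is a minimal Python illustration of both, not the method of any study cited in this review. The function names `ld_stats` and `marker_trait_r2`, the 0/1 allele coding, and the 0/1/2 dosage coding are assumptions made for this example. The association measure is a naive squared correlation with no correction for population structure or relatedness, which a real analysis would need to include (for instance through a mixed linear model).

```python
def ld_stats(haplotypes):
    """Return (D, r2) for two biallelic loci.

    `haplotypes` is a list of (a, b) tuples, one per sampled haplotype,
    with each allele coded 0 or 1 (hypothetical coding for this sketch).
    D = p_AB - p_A * p_B is the excess co-occurrence of the two '1' alleles
    over what their individual frequencies predict.
    """
    n = len(haplotypes)
    p_a = sum(a for a, _ in haplotypes) / n
    p_b = sum(b for _, b in haplotypes) / n
    p_ab = sum(1 for a, b in haplotypes if a == 1 and b == 1) / n
    d = p_ab - p_a * p_b
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    # r2 is undefined when either locus is monomorphic; report 0.0 in that case.
    r2 = d * d / denom if denom > 0 else 0.0
    return d, r2


def marker_trait_r2(genotypes, phenotypes):
    """Squared Pearson correlation between allele dosage (0/1/2) and a trait.

    Naive single-marker measure: population structure and kinship are
    NOT modelled here.
    """
    n = len(genotypes)
    mean_g = sum(genotypes) / n
    mean_p = sum(phenotypes) / n
    s_gg = sum((g - mean_g) ** 2 for g in genotypes)
    s_pp = sum((p - mean_p) ** 2 for p in phenotypes)
    s_gp = sum((g - mean_g) * (p - mean_p)
               for g, p in zip(genotypes, phenotypes))
    if s_gg == 0 or s_pp == 0:
        return 0.0
    return s_gp * s_gp / (s_gg * s_pp)


# Two loci whose alleles always travel together (complete LD).
linked = [(1, 1)] * 5 + [(0, 0)] * 5
# Two loci whose alleles are independent (linkage equilibrium).
unlinked = [(1, 1), (1, 0), (0, 1), (0, 0)]

print(ld_stats(linked))
print(ld_stats(unlinked))
print(marker_trait_r2([0, 1, 2, 0, 1, 2], [1.0, 2.0, 3.0, 1.0, 2.0, 3.0]))
```

Computing r2 for many marker pairs and relating it to the physical distance between them is one common way to describe the extent of LD decay, and therefore the marker density an AM study is likely to need.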

flowering time in these accessions were now replicated in both populations, including polymorphisms at the CO, FLC, VIN3, PHYD, GA1, and FRI loci. Both NAM and MAGIC lines demonstrated the power and prospects of a combined linkage and association mapping strategy for identifying polymorphism-trait associations in plants. In addition to maize and Arabidopsis, candidate-gene AM has also been used in wheat, rice, barley, and sorghum to study yield, flowering, and disease resistance traits (Breseghello and Sorrels, 2006b; Ravel et al., 2006; Agrama et al., 2007; Tommasini et al., 2007; Murray et al., 2009; Stracke et al., 2009; Cockram et al., 2010; Maccaferri et al., 2011). These studies demonstrated feasibility of conducting genome-wide AM in plants and emphasized the need of having appropriate controls to detect false positives as well as availability of appropriate sample size.
So far, LD-based studies have primarily focused on plants with short lifecycles, but with only few examples in plants having long juvenility periods such as those of forest trees and grapes. AM is particularly suited for forest trees and perennial horticultural crops to overcome their characteristic pedigree-based mapping limitations. This is due to their long generation time, laborious and time-consuming trait evaluation, slow maturation, and, at times, their polyploidy. In this review, we will focus on the importance, current status, and potential of AM and LD studies in forest trees and fruit crops. In addition, we will provide insights into the role of these studies in the genetic improvement of these long-lived perennial plants.

Superiority and significance of association mapping in forest trees and fruit crops
Most economic traits of horticultural and forest trees, such as wood properties, disease resistance, fruit quality, and biomass traits, are controlled by multiple genes exhibiting quantitative distribution of phenotypes. Thus, quantitative genetic strategies have been used to identify genes controlling these traits (Neale, 2007). However, the most widely used approach to study quantitative traits in plants, i.e. QTL mapping, has not been promising in pursuing genetic studies in tree species. Fruit and forest trees are characterized by long juvenility periods; moreover, establishing and maintaining bi-parental crosses and progenies of trees in multiple locations for QTL mapping are often difficult and costly (Rikkerink et al., 2007). For example, apple trees require periods of 5–7 years to overcome juvenility, as well as considerable economic input before seedlings are reproductive and amenable for phenotypic evaluation. Commonly, an F1 progeny resulting from full-sib crosses between two outbred parents is used for QTL mapping in perennial forest trees and fruit crops. Recombination in an F1 cross is attributed to a single meiotic event, resulting in low resolution of QTLs. Often, the region of the QTL spans ~5–10 cM, with the likelihood of presence of hundreds of genes within this region. For example in conifers, a QTL interval spans ~15 cM (Brown et al., 2003; González-Martínez et al., 2007). This suggests that the majority of identified QTLs, particularly those with minor to moderate
effects, are of limited utility in tree genetic improvement efforts. Utilizing QTLs in MAB without fine-mapping is not advisable due to genetic drag effects and/or rapid decay of LD in most fruit and forest trees. However, fine-mapping of the QTL region to locate a target gene(s) is both time-consuming and laborious (Neale and Savolainen, 2004). Moreover, it should also be noted that most QTLs are limited to either a single or a few genetic backgrounds, and will not be useful across a wide gene pool. For instance, it is likely that disease resistance could break down before genes conferring resistance are identified and fine-mapped via QTL mapping. For all these reasons, QTL mapping seems laborious and time-consuming in trees and fruit crops, particularly when the trait is controlled by multiple genes with small effect. Therefore, it is highly desirable to have access to genomic tools and genetic approaches that could accelerate detection of functionally important genetic regions in trees (Savolainen and Pyhäjärvi, 2007).
AM serves as a viable alternative approach that can overcome the limitations of pedigree-based mapping in perennial plants. In contrast to QTL mapping, AM requires large numbers of molecular markers and utilizes recombination events present within the gene pool of an organism. It is only those markers that are tightly linked to the locus of interest and located within the extent of the LD decay of the genome that will show significant associations. AM could also serve as a powerful tool in dissecting quantitative traits into their individual gene components in fruit and forest trees due to their large architectural tree frames, random mating, unstructured populations, and rapid decay of LD (Komulainen et al., 2003; Neale and Savolainen, 2004; Rikkerink et al., 2007).

Current status of LD studies in forest trees and fruit crops
Forest trees
Forest trees are not only important due to their economic value for wood biomass and fuel, but they are also critical for maintaining plant and animal biodiversity and tempering environmental effects, as well as their natural resource and aesthetic values. Threats to forest trees from biotic and abiotic factors are rapidly increasing (Neale, 2007). Although there is wide variation among different forest trees, generally they are out-crossing, long-lived, and at early stages of domestication (Savolainen and Pyhäjärvi, 2007). Despite similarities among forest trees, there are large differences between genetic and lifecycle characteristics of forest trees that render some better suited for genetic studies than others. For instance, two of the most important forest trees, poplar and pine, are different in many aspects. Poplar is a dioecious angiosperm with 29 species, 38 chromosomes (2n), and a genome size of 450 Mb; moreover, LD decays at a rate of <500 bp, and has a nucleotide diversity of 0.5–1% (Neale and Ingvarsson, 2008). In contrast, pine is a monoecious gymnosperm with ~100 species, 24 chromosomes (2n), and a genome size of 10–40 Gb; moreover, pine LD decays at a rate of <500 to 2000 bp, and has 1–2% nucleotide diversity (Neale and Ingvarsson, 2008). Since the introduction of QTL mapping methods, forest geneticists have been investigating quantitative traits of both economic (growth, wood properties, and disease resistance) and adaptive (cold tolerance and phenology) values. According to Neale and Ingvarsson (2008), AM approaches in forest trees offer advantages over pedigree-based genetic tests due to the availability of large size random-mating populations with minimal population structure, presence of adequate levels of nucleotide diversity, rapid decay of LD, direct determination of haploids by sequencing of haploid seed megagametophyte tissues, and reliable phenotypic evaluations in clonally propagated plants.
Availability of genomic resources is one of the key limitations for conducting an AM study in a crop. Genome sizes of forest trees vary considerably, impacting availability of genomic resources, and influencing AM strategies. For example, sequencing of the genome of Populus trichocarpa, which is four times the size of the Arabidopsis genome, has been long completed (Tuskan et al., 2006). Sequencing of the pine genome is still ongoing as the size of this genome is ~100-fold larger than that of the Arabidopsis genome (Pavy et al., 2005). On average, there is a slightly lower nucleotide diversity in poplar compared to that of pine, but LD decay is almost equal (Ingvarsson, 2005). LD declines rapidly in forest trees, within ~1 kb, compared to that of self-pollinated plant species, thus requiring availability of large numbers of molecular markers for AM. In addition to the genome sequence of poplar, large numbers of expressed sequence tag (EST) sequences are also available for multiple forest trees. Thus, there are considerable genomic resources and wide genetic diversity in forest trees that would serve well in pursuing AM strategies. Therefore, due to expanding genomic resources and limitations of QTL methods in forest trees, AM studies have been carried out in perennial forest trees.
As for other crops, conducting AM in tree species requires both genotyping and phenotyping of a large population (Fig. 1). SNPs are currently the markers of choice for genotyping trees due to their abundance within a genome and the availability of high-throughput resequencing and genotyping methods. For those conifers with limited EST databases, SNPs are identified by resequencing candidate genes in a small (10–100) panel of individual trees, using haploid seed megagametophytes, known as the SNP discovery panel (Neale, 2007). For those forest trees with abundant EST data, SNPs are discovered in silico using bioinformatics tools. Identified SNPs are then genotyped on a larger population using high-throughput genotyping methods, such as the Illumina GoldenGate assay (Eckert et al., 2009). For accurate phenotypic evaluation, individuals of a population are grown in replicated trials by vegetative propagation or subjected to family-based testing. Moreover, to accurately estimate the effects of the SNP genotype by minimizing the effects of population genetic
structure and genetic relatedness, a large population is usually used. So far, candidate gene-based AM studies and population genetic neutrality tests have frequently been used to study wood-related economic and adaptive traits, and to investigate gene behaviour under natural selection conditions in forest trees (Neale, 2007; Eckert et al., 2009).

Conifers (pine and spruce species)
As mentioned above, SNPs are generally identified by resequencing ESTs using a small SNP discovery panel and are then genotyped on large samples of unrelated trees by high-throughput genotyping assays (Pavy et al., 2008; Eckert et al., 2009). Sufficient numbers of EST sequences are available for Picea and Pinus (Pavy et al., 2005; Rungis et al., 2005) to pursue SNP discovery and conduct LD studies in conifers. The NCBI database, as of 6 February 2012, has 328791, 247119, 36384, 34524, 341325, and 296253 EST sequences for Pinus taeda (loblolly pine), Pinus contorta (lodgepole pine), Pinus banksiana (jack pine), Pinus pinaster (maritime pine), Picea glauca (white spruce), and Picea sitchensis (Sitka spruce), respectively.
González-Martínez et al. (2007) conducted the first multigene association study in forest trees. A population of 422–435 unrelated loblolly pine (P. taeda L.) trees, in a clonally replicated trial, was used to conduct an association analysis study of 58 SNPs, from 20 wood- and drought-related candidate genes and wood property traits. These traits included earlywood and latewood specific gravity, percentage latewood, earlywood microfibril angle, and wood chemistry (lignin and cellulose contents). They used mixed linear models to perform AM analysis, where population structure and relatedness were accounted for. The strongest association was observed between allelic variations in tubulin, a gene involved in the formation of cortical microtubules, and the earlywood microfibril angle. It was suggested that due to rapid LD decay in conifers, SNPs revealing genetic associations were likely to be located in close proximity to causative polymorphisms (González-Martínez et al., 2007). Furthermore, a strong association between a SNP within the candidate gene 4-coumarate CoA ligase (4cl) and percentage latewood was also detected in this study, thus confirming previous findings based on co-location of a QTL for percentage latewood. In another study, González-Martínez et al. (2008) took advantage of both pedigreed crosses and genetic diversity to develop the first family-based AM approach in plants, known as a quantitative transmission disequilibrium test. A population of loblolly pine, consisting of 961 clones from 61 families, was evaluated at two sites to test for genetic associations between 46 SNPs (identified from 41 biotic and abiotic stress-inducible genes) and carbon isotope discrimination, a measure of water use efficiency. Several candidate gene associations were detected together with two very promising associations, a cell structure stabilizing dehydrin gene (dhn-1) and a cell wall reinforcement protein gene (lp5) (González-Martínez et al., 2008). This study demonstrated successful candidate gene association mapping in trees using a complex family structure.
The inherent broad phenotypic adaptation to diverse environments that is present in natural populations of forest trees is key to successful establishment of any tree species either as domesticated or undomesticated (Petit and Hampe, 2006). Therefore, understanding the genetic control of adaptability is critical in dealing with current problems in forests. Eckert et al. (2010) have used a genome-wide dataset of SNPs genotyped across 3059 functional genes to study patterns of population structure and to identify genetic loci linked to response to aridity across a natural range of loblolly pine. After accounting for confounding effects of shared ancestry with correlations between genetic and environmental variations, five loci correlating with aridity have been identified. These loci are primarily involved in abiotic stress responses to temperature and drought.
Successful establishment of forest trees depends not only on their adaptability to constantly changing environments, but also on their responses to emerging pathogens. Pitch canker, a disease caused by the necrotrophic pathogen Fusarium circinatum, is an important fungal disease of loblolly pine. Quesada et al. (2010) have identified genes that are associated with resistance to pitch canker in loblolly pine using a set of 498 largely unrelated clonally propagated genotypes. Upon inoculation of these genotypes with F. circinatum and measuring lengths of necrotic tissues after 4, 8, and 12 weeks, significant associations have been detected in 10 out of 3938 SNPs tested. These 10 SNPs have exhibited small effects, thus suggesting that resistance is quantitative, and that whole-genome scans must be conducted to identify remaining variations for this trait.
Recently, Dillon et al. (2010) have evaluated the utility of LD mapping to detect associations between SNPs and wood quality in a natural population of Pinus radiata. After accounting for both population structure and experimental error, a total of 10 significant associations (P < 0.05, q < 0.1) have been detected with one or more traits, out of 149 loci investigated. Significant associations have been re-examined in an Australian land race, and those associations previously observed in the discovery population have further validated associations of two genes with wood density. Decreased wood density is associated with a minor allele, thus suggesting that these SNPs may be under weak negative purifying selection for wood density. These findings clearly demonstrate the utility of LD mapping in detecting associations, even when the power of detecting SNPs with small effects is anticipated to be low (Dillon et al., 2010).
Various neutrality tests have also been used to identify genes and genomic regions that have been targets of natural selection in forest trees. Due to long generation times, molecular evolution at neutral sites in forest trees is very slow, and, in contrast to their slow rate of neutral evolution, large tree populations respond quickly to natural selection. Therefore, detecting traces of selection may be easier in trees than in many other species due to low demographic effects
6 of 16| Khan
4050 and
| Khan Korban
and Korban

in large randomly mating tree populations (Savolainen and Pyhäjärvi, 2007). According to Neale and Ingvarsson (2008), of ~290 tree genes subjected to neutrality tests, 20% have exhibited departures from neutrality. In one of the early studies describing nucleotide diversity, Heuertz et al. (2006) conducted LD analyses and neutrality tests in spruce (Picea) and surveyed DNA polymorphisms at 22 loci in ~47 haplotypes from seven populations. Their results revealed that the overall nucleotide variation is limited, being lower than that observed in most plant species. LD is also restricted and does not extend beyond a few hundred base pairs. Moreover, neutrality tests revealed an excess of both rare and high-frequency-derived variants and pointed to a severe bottleneck. It was concluded that demographic departures from equilibrium expectations and population structure must be accounted for when detecting selection at candidate genes and in association mapping studies, respectively. Recently, neutrality tests have been used to study the effects of natural selection on 41 candidate genes from loblolly pine; these genes have been selected primarily from host-pathogen interactions together with 15 drought-tolerance and 13 wood-quality genes identified in previous studies (Ersoz et al., 2010). Patterns of both directional selection and selective sweep consistent with the arms-race model of disease response evolution have been detected, as well as patterns consistent with diversifying selection.

To further elucidate the genetic control of adaptive traits in Douglas fir, Krutovsky and Neale (2005) studied LD and haplotype and nucleotide frequencies, and performed neutrality tests in cold-hardiness and wood quality-related candidate genes. Coastal Douglas fir (Pseudotsuga menziesii Franco), an important tree of western North American forests, has evolved complex adaptive mechanisms. In the study, genes were selected on the basis of their functions in other plants and their co-location with cold-hardiness-related QTLs. On average, the frequency of SNPs was one SNP per 46 bp across coding and noncoding regions. LD within genes decayed relatively slowly, but steadily, while a neutrality test suggested directional selection (Krutovsky and Neale, 2005).

Poplar

Thus far, poplar (Populus trichocarpa) is the only forest tree with a complete genome sequence, thus allowing resequencing of different genotypes to identify SNPs for genetic studies. Ingvarsson et al. (2006) first reported on genetic dissection of complex adaptive traits using candidate genes in poplar. They identified SNPs in the phytochrome gene (phyB2) that co-located to a previously reported QTL for timing of bud set, and these were genotyped in 16 Populus tremula populations collected along a latitudinal gradient in Sweden. It was found that there was a significant clinal × latitude variation for bud set. A sliding-window scan of phyB2 identified four SNPs with significant clinal variations, suggesting an adaptive response of phyB2 to local photoperiodic conditions. Additionally, Hall et al. (2007) also found several phenological traits of P. tremula with strong genetic differentiation and clinal variation across the latitudinal gradient. It was suggested that genetic differentiation at candidate loci was better described by FST at neutral loci rather than by QST at quantitative traits. Later, Ingvarsson et al. (2008) reported that polymorphism varied substantially across the phyB2 region when surveyed within an 80-kb region surrounding the phytochrome locus, but there were no signs of deviations from neutral expectations. Moreover, using 41 SNPs in a mapping population, they identified two non-synonymous SNPs in the phyB2 gene associated with variations in timing of bud set. These SNPs explained between 1.5 and 5% of the observed phenotypic variation in bud set. It was proposed that due to low LD in this region, these SNPs were strong candidates that were causally linked to variation in bud set.

Eucalyptus

Eucalyptus (Eucalyptus nitens) is a widely adapted forest tree, and has been the focus of studies on adaptation as a key element in genetic conservation of forests. Thumma et al. (2005) used a candidate-gene-based LD mapping approach to identify alleles associated with microfibril angle, a wood quality trait affecting stiffness and strength of wood in eucalyptus. SNPs detected in the cinnamoyl CoA reductase gene, a key lignin gene, were used to genotype 290 unrelated trees from a natural population of E. nitens. Two haplotypes significantly associated with microfibril angle were identified, and the association was confirmed in two full-sib families of E. nitens and Eucalyptus globulus. In a recent study, Thumma et al. (2009) demonstrated the potential of revealing functional polymorphisms underlying quantitative traits by integrating both QTL and association mapping methods. First, a marker from the COBRA-like gene, whose Arabidopsis homologue has been implicated in cellulose deposition, was found to be strongly associated with a QTL for cellulose content in a full-sib family. By genotyping SNPs and a simple sequence repeat (SSR) marker in an association population, LD analysis revealed that LD declines within the length of the COBRA-like gene. Subsequent association mapping analysis contributed to fine-resolution mapping of the effect of this gene to a SNP marker.

Grapes

The cultivated grapevine (Vitis vinifera L.) is one of the first economically important fruit crops whose genome was sequenced. Having access to the genome sequence has facilitated the identification and development of genetic markers and has also enabled LD studies in grapes. The cultivated grapevine is primarily a self-pollinated perennial fruit crop. Most of today's cultivated grapes have been derived from controlled crosses among a few select cultivars, and the elite selections have been maintained by vegetative propagation. There are only a few studies
describing LD patterns across the grapevine genome, as well as AM studies undertaken (Myles et al., 2011).

Characterization of LD in the wild French grapevine, V. vinifera L. subsp. sylvestris, was first reported by Barnaud et al. (2010). LD patterns and extent were assessed using un-phased SSRs and reconstructed haplotypic data in a sample of 85 plants from southern France by performing independence tests and multiallelic r² analysis. It was found that the LD decayed rapidly, with r² values decreasing to 0.1 within 2.7 cM for genotypic data and within 1.4 cM for haplotypic data. However, when LD was compared to previous findings from a study on cultivated grapevine subsp. sativa, LD was extended further, by 12-fold, in cultivated compared with wild grapevine. In a recent study, Myles et al. (2011) also reported that there was rapid decay of LD in V. vinifera that appeared unchanged between the wild ancestor and the domesticated grape. These differences in LD patterns between cultivated and wild grapevine could primarily be due to domestication bottlenecks and vegetative propagation. It was suggested that the LD in wild grapevine seemed to extend slightly further than in wild relatives of other crops, thus rendering it amenable for future AM studies. Integrating both QTL and AM strategies has proven highly effective in various studies. For example, Fournier-Level et al. (2010) combined QTL and AM to study the genetic patterns of anthocyanin content, a determinant of berry colour, in grapes. A strong QTL, accounting for 62% of the variation in anthocyanin content, was mapped in an F1 pseudo-testcross. Markers were identified in four Myb-type genes, selected based on metabolic profiles, within the QTL interval, and these were tested in a core collection of natural grape germplasm consisting of 141 individuals. Using Bayesian reconstruction of effective population size dynamics, the presence of extended LD in these genes was revealed. Furthermore, a multivariate regression analysis identified five polymorphisms in VvMybA genes that accounted for 84% of the observed variation. It was concluded that the observed variation in anthocyanin content in grape was mainly accounted for by a single cluster of three VvMybA genes.

Fruit development is an important economic trait of grapevine, but little is known about the genetic and molecular controls of fleshy fruit development. Houel et al. (2010) identified markers from a gene (flb) proposed to be linked to a fleshless berry mutation by resequencing genomic regions of two grape cultivars. They also sequenced these gene fragments in a highly diverse set of both cultivated and wild V. vinifera genotypes to identify possible signatures of domestication in the cultivated V. vinifera. In this study, SNPs significantly associated with berry weight variation were identified in the flb region. Moreover, eight gene fragments with significant deviations from neutrality of the Tajima's D parameter were identified in the cultivated pool along with putative signatures of selection. Sweet- and floral-flavoured Muscat cultivars are highly regarded as table grapes and for wine making. Using QTL analysis, Muscat flavour determination has been investigated and a major QTL on chromosome 5 linked to the Muscat flavour has been found to co-localize with a VvDXS gene encoding 1-deoxy-D-xylulose 5-phosphate synthase (Emanuelli et al., 2010). Upon resequencing of the VvDXS gene in an ad hoc association population of 148 grape cultivars, consisting of muscat-flavoured, aromatic, and neutral accessions, as well as muscat-like aromatic mutants and non-aromatic offspring of Muscats, three SNPs in moderate LD that are significantly associated with muscat-flavoured cultivars were identified, after accounting for population structure. The authors also identified a putative causal SNP responsible for a predicted non-neutral substitution (Emanuelli et al., 2010). Although LD rapidly decays in grapevine, LD could be extended in those genomic regions selected during domestication. Therefore, conducting a candidate gene-based study for traits important during domestication is feasible, although the availability of high densities of markers across the genome is required for genome-wide AM (Myles et al., 2011).

Rosaceous fruit crops

The Rosaceae is the third most economically important plant family (Dirlewanger et al., 2002; Shulaev et al., 2008). It includes such economically important crops as apple (Malus), pear (Pyrus), stone fruit (Prunus), strawberry (Fragaria), and rose (Rosa). Polyploidy is common in rosaceous plants; moreover, genome sizes vary from moderate to small as in apple (742 Mb) and strawberry (240 Mb), respectively (Velasco et al., 2010; Shulaev et al., 2011). Rosaceous plants are often vegetatively propagated to maintain both additive and non-additive genetic effects in phenotypes of superior genotypes. According to Rikkerink et al. (2007), trait evaluation in most fruit breeding programmes depends on a two-step strategy. In the first step, a large number of individuals in non-replicated trials are evaluated to select a small number of individuals with desirable traits. In the next step, these selected plants are asexually propagated in a replicated trial for reliable trait evaluation. For most fruit crops, as in forest trees, genetic analysis is complicated due to presence of high levels of heterozygosity, long generation time, long juvenility period, time-consuming trait evaluation, slow physiological maturation, and polyploidy, with few exceptions (Shulaev et al., 2008). Costs of growing and maintaining perennial plants until they reach maturity to collect phenotypic data are high; thus, efforts to conduct early selection at the seedling stage are highly desirable. In addition, fruit evaluation requires quick assessment and special care due to perishability (Rikkerink et al., 2007). In general, there are many similarities between forest trees and fruit crops, and some of the genomic tools developed and knowledge gained for one group could be transferred to the other group. However, before implementing such approaches across these groups, we must acknowledge and understand the fundamental differences. Fruit crops are mainly maintained through vegetative propagation in a domesticated state while forest trees are found in both partially undomesticated and domesticated states (Neale and Ingvarsson, 2008).
Therefore, LD patterns between forest trees and fruit crops could be quite different. For vegetatively propagated and domesticated trees, incidence of recombination is unlikely, and this can result in extended LD compared to undomesticated trees.

There are LD data for only a few plants and only for specific loci in Rosaceae; therefore, it is rather difficult to propose general conclusions about LD decay. Recently, Aranzana et al. (2010) studied extent, distribution, and structure of genetic variation in North American and European commercial peach cultivars, as well as in some old peach cultivars. They genotyped 50 SSRs that are evenly distributed across all linkage groups in 224 peach cultivars. The population structure analysis divided the sample into three main groups, based mainly on their fruit characteristics, including melting-flesh peach, melting-flesh nectarine, and non-melting subpopulations. LD analysis revealed that LD extends up to 13-15 cM in peach. The extended LD could be attributed to the self-pollination habit of peach cultivars used in this study and a bottleneck that must have occurred early on in modern breeding practices. It was concluded that a high LD suggests that whole-genome scanning approaches are well suited for genetic studies of important traits in peach.

The mechanism of inbreeding control could also have significant influence on LD, as it is assumed that self-incompatible rosaceous fruits will have rapid LD decay compared to self-compatible fruits. Furthermore, deliberate breeding can also influence LD; however, long generation times and recent breeding efforts in fruit trees may have reduced these effects (Myles et al., 2011). Compared to pedigree-based approaches, AM is more suited for rosaceous plants due to their above-described biological features. Depending on LD, different AM strategies could be implemented. Genome size and LD patterns are critical in determining the number of markers required for conducting whole-genome scans, as well as economic feasibility and overall success of AM studies (Myles et al., 2009). Developing sufficient numbers of markers will greatly depend on available genomic resources for a given crop. Candidate gene AM may be the method of choice compared to genome-wide mapping for crops characterized by fast LD decay and with limited available genomic resources. Which strategy is preferable for identifying genes of interest will also depend on the economic value of the target crop. AM is preferred for those horticultural fruit crops of high economic value, such as apple, peach, pear, and strawberry. For apples, Cevik et al. (2009) used a pedigree-based QTL mapping approach and identified markers (MdMADS2.1, MdMADS2.2, and MdMADS14) from two candidate orthologous FRUITFULL-like genes linked to fruit flesh firmness. Subsequently, they used an association analysis to collect further evidence for association between MdMADS2.1 and flesh firmness in a population of 168 apple accessions. The association mapping population allowed for confirmation that MdMADS2.1 and fruit flesh firmness are significantly associated. Additionally, recently developed high-throughput SNP genotyping assays, a 1536 SNP GoldenGate assay for apple (Khan et al., 2012) and an 8000 SNP Infinium assay for apple, peach, and cherry (Chagné et al., 2012), will provide more opportunities for candidate gene-based association mapping. Although the Infinium assay has 8000 SNPs, this is not sufficient for genome-wide AM due to rapid LD decay in apple and in other rosaceous fruit crops. However, it can be used for whole-genome scanning of marker-trait associations utilizing a joint linkage-association approach that makes use of both pedigree and diversity analyses.

Presence of co-linearity between some rosaceous species should also be taken into consideration when drawing conclusions based on the economic value of a crop. Co-linearity among species has been deemed acceptable by the Rosaceae research community as a unitary system for genetic analysis to overcome economic costs (Rikkerink et al., 2007). However, transferability of AM results across species may not always be feasible as there are also major biological differences, even within a single species, thus rendering synteny across species a minor factor. It should also be noted that, unlike forest trees, where population structure is minimal, domesticated fruit trees can have substantial population structure that may confound AM inferences. Therefore, a large set of accessions/genotypes is required for reliable identification of associations by minimizing spurious associations due to population structure.

Future of association mapping in forest trees and fruit crops in the era of next-generation sequencing

In perennial trees, traditional full-sib crosses, i.e. F1 crosses between two outbred plants, are used to identify associations between traits and genetic loci. A maximum of four alleles would segregate in such a cross. However, in AM, a large number of alleles present within the gene pool of a species are tested against the phenotype to detect significant associations. As discussed earlier, patterns of LD and availability of genomic resources for a species will dictate which AM strategy is more appropriate for a given crop. Although the utility of AM in crop improvement is well established, its adoption has been rather slow, mainly due to limited numbers of genomic resources available for most agriculturally important crops, with only a few exceptions. Development of genetic markers is crucial for AM studies. A whole genome or selected segments of a genome of a panel of organisms are sequenced to identify differences across the genome(s) of the panel(s). Subsequently, identified polymorphisms are genotyped across a larger and more diverse yet unrelated population.

Due to high costs of sequencing and genotyping, marker development is considered very costly and thus only justifiable in crops of high commercial value. Recently, there has been a new revolution in technological advances that impact economic aspects of whole-genome sequencing efforts. Next-generation sequencing technologies are highly efficient
and costs of sequencing have sharply declined. Currently, there are many ambitious ongoing whole-genome sequencing projects for a wide range of organisms. The Genome 10K Consortium of Scientists (G10KCOS) has undertaken a sequencing project (G10K) supported by zoos, museums, universities, and research centres to assemble a genomic zoo, the genomes of 10,000 vertebrate species (for mammals, birds, amphibians, reptiles, and fishes), approximately one for every vertebrate genus. In November 2010, the Beijing Genomics Institute (BGI) and the G10KCOS together planned to sequence genomes of the first 101 vertebrate species within 2 years. The 1000 Plants & Animals reference genome project, launched in January 2010, aims to sequence genomes of 1000 economically and scientifically important plant/animal species within 2 years. Moreover, the possibility of sequencing pooled DNA will also change how AM is pursued. In organisms where a reference genome sequence exists (Table 1), genotyping-by-resequencing (GBS, Fig. 2) of an association population will be pursued. Resequencing of whole genomes using next-generation sequencing will allow rapid and efficient identification of SNPs and insertions/deletions between genomes. Recently, resequencing of the rice genome identified a total of 132,462 SNPs, 16,448 insertions, and 19,318 deletions between the Omachi (landrace of japonica rice) and Nipponbare genomes (Arai-Kichise et al., 2011). Validation of a set of selected SNPs yielded a success rate of 95 and 88% for the Omachi and Nipponbare genomes, respectively. Huang et al. (2009) utilized a whole-genome resequencing approach to genotype recombinant inbred lines in rice and reported that this is a faster approach for collecting data. More importantly, it provides accurate determination of recombination breakpoints compared to those revealed in a genetic map constructed using PCR-based markers.

Recently, it has been demonstrated that it is not necessary to resequence a genome at very high coverage to identify SNPs and copy number variations (CNVs), as resequenced samples will be aligned using the already sequenced genome as a reference (Alkan et al., 2009). Improvements in sequencing throughput and paired-end sequencing, together with reduced genome coverage, will enable multiplexing of barcoded samples within the same sequencing lane, effectively reducing sequencing costs (Fig. 2). DNA of two or more genotypes with different indices could be mixed in equal molar concentrations and loaded onto a single sequencing lane to perform paired-end sequencing. After trimming barcodes, sequences could be aligned to the reference genome to identify polymorphisms across the

Table 1. Association mapping-relevant biological characteristics of forest trees and fruit crops whose genomes have been sequenced

| Sequenced genomes | Family | Ploidy | Chromosomes (n) | Pollination | Genome size (Mb) | LD extent | Protein-coding genes | Reference |
|---|---|---|---|---|---|---|---|---|
| Strawberry (Fragaria vesca) | Rosaceae | F. vesca is a diploid, but other species include tetraploid, pentaploid, hexaploid, octoploid (F. × ananassa), and mixoploid | 7 | Self-pollination | 240 | | 34,809 | Shulaev et al. (2011) |
| Cacao (Theobroma cacao) | Sterculiaceae | Diploid | 10 | Cross-pollinated | 430 | | 28,798 | Argout et al. (2011) |
| Apple (Malus × domestica) | Rosaceae | Predominantly diploid, along with some triploids and tetraploids | 17 | Cross-pollination | 742 | | 57,386 | Velasco et al. (2010) |
| Peach (Prunus persica) | Rosaceae | Diploid | 8 | Self-pollination | 227 | 13-15 cM | 28,702 | International Peach Genome Initiative (IPGI) |
| Papaya (Carica papaya) | Caricaceae | Diploid | 9 | Polygamous | 372 | | 28,629 | Ming et al. (2008) |
| Grapevine (Vitis vinifera) | Vitaceae | Predominantly diploid, but tri- and tetraploids also exist | 19 | Self-pollination | 487 | 2.7 cM | 26,346 | Velasco et al. (2007) |
| Poplar (Populus trichocarpa) | Salicaceae | Diploid | 19 | Cross-pollination | 485 | <500 bp | 45,555 | Tuskan et al. (2006) |
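
As a rough illustration of how the genome size and LD extent columns of Table 1 bear on marker requirements, the sketch below divides genome size by LD extent to obtain a lower bound on the number of evenly spaced markers for a genome-wide scan. The one-marker-per-LD-block rule and the function name `min_markers` are illustrative assumptions, not a method taken from the studies cited. Only poplar is used, because it is the only entry in Table 1 whose LD extent is given in base pairs.

```python
import math


def min_markers(genome_size_bp: int, ld_extent_bp: int) -> int:
    """Lower bound on marker number, assuming one marker per LD block.

    This is a simplifying assumption for illustration: real marker
    requirements also depend on how LD varies along the genome.
    """
    if genome_size_bp <= 0 or ld_extent_bp <= 0:
        raise ValueError("genome size and LD extent must be positive")
    return math.ceil(genome_size_bp / ld_extent_bp)


# Poplar values from Table 1: 485 Mb genome, LD extent below 500 bp.
poplar_markers = min_markers(485_000_000, 500)
print(f"Poplar: at least {poplar_markers:,} markers")
```

Because LD in poplar is reported as shorter than 500 bp, the printed figure of 970,000 is a floor and the practical requirement would likely be higher.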

Fig. 2. Genotyping strategies for candidate gene and genome-wide association mapping (GWAS). The strategy for candidate gene
genotyping is based on Sanger sequencing using ABI technology, while that for GWAS is the genotyping-by-sequencing (GBS) adapted
from Elshire et al. (2011) for Illumina GA sequencing technology. The GBS strategy could be used for both single-nucleotide
polymorphism (SNP) genotyping and copy number variation (CNV) identification. LD, linkage disequilibrium.

population. For example, the HiSeq2000 platform is capable of generating approximately 120 Gbp per flowcell (~30 Gb per sequencing lane), and this throughput is expected to increase further in the near future. An estimated cost of resequencing an apple population, with a total genome size of ~742 Mb (Velasco et al., 2010), would add up to ~$25,000 for 14 individuals per flow cell, with an average cost of ~$1800 per genotype, assuming that ~20× genome coverage would be sufficient to identify SNPs and CNVs by resequencing, which can be achieved by multiplexing of two genotypes per lane.

Genome-wide association mapping

Success of whole-genome AM depends on density and robustness of available markers as well as accuracy of phenotypic evaluation. To achieve appropriate marker density for a target plant species, it is important to measure genome-wide LD and have access to high-throughput genotyping methods. For those forest and fruit crops where whole-genome sequences are available (Table 1), resequencing of the AM panel is feasible using next-generation high-throughput sequencing technologies, thus enabling rapid and cost-effective genome-wide identification of polymorphisms. A genome-wide association mapping analysis has been undertaken for 14 agronomic traits in a population of rice landraces using a genotyping-by-resequencing approach (Huang et al., 2010). Resequencing of 517 rice landraces contributed to the identification of ~3.6 million SNPs and aided in the construction of a high-density haplotype map. On average, identified loci accounted for ~36% of the phenotypic variance, and six loci, closely located to previously identified genes, were identified (Huang et al., 2010). This demonstrated that integrating high-throughput genome resequencing and genome-wide AM can serve as a powerful complementary strategy to bi-parental QTL mapping for genetic dissection of complex traits in plants. The next-generation high-throughput sequencing technologies combined with multiplexing of DNA samples will significantly contribute to reduced costs (Fig. 2). As GBS data for large numbers of germplasm accessions become available, these would be useful in studying various traits over many years. However, the panel of genotypes for sequencing should be carefully selected, keeping in mind their diversity and their future utility in breeding efforts. For example, in apples, an appropriate germplasm set for AM studies would be the USDA Malus core collection. This Malus core collection includes diverse apple germplasm along with a large number of accessions from the cultivated apple, M. × domestica; it has been distributed to multiple sites throughout the USA, has been evaluated for multiple traits, and continues to undergo evaluation.
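
The apple resequencing estimate quoted above can be checked with simple arithmetic. The sketch below uses only figures stated in the text (about 30 Gb per sequencing lane, two genotypes per lane, a 742 Mb genome, and about $25,000 for 14 individuals). The function names are illustrative, and an equal read yield for each multiplexed genotype is an assumption.

```python
def coverage_per_genotype(lane_yield_gb: float,
                          genotypes_per_lane: int,
                          genome_size_gb: float) -> float:
    """Expected fold coverage per genotype, assuming equal yield per sample."""
    return lane_yield_gb / genotypes_per_lane / genome_size_gb


def cost_per_genotype(total_cost: float, n_genotypes: int) -> float:
    """Average sequencing cost per genotype."""
    return total_cost / n_genotypes


# Figures quoted in the text for apple on a HiSeq2000 lane.
cov = coverage_per_genotype(lane_yield_gb=30.0,
                            genotypes_per_lane=2,
                            genome_size_gb=0.742)
cost = cost_per_genotype(total_cost=25_000, n_genotypes=14)
print(f"coverage ~ {cov:.1f}x, cost ~ ${cost:.0f} per genotype")
```

Under these assumptions the result is about 20.2× coverage and about $1786 per genotype, in line with the rounded values of ~20× and ~$1800 given in the text.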

Sequencing of reduced representation genomic libraries markers are needed to have a reasonable coverage of
(Altshuler et al., 2000; Baird et al., 2008; Elshire et al., 2011) Arabidopsis, grapevine, and maize genomes, respectively
could be used for identification and genotyping of SNPs for (Kim et al., 2007; Myles et al., 2009). AM studies based on
genome-wide AM in crops with extended LD and complex candidate genes require markers only for regions of interest.
genomes (Fig. 2). The GBS method described by Elshire The candidate gene-based AM has a higher likelihood of
et al. (2011) is a robust, simple, quick, and highly specific success when the target trait is well-characterized at the
approach for species with high diversity. Reduced represen- biochemical and/or physiological level, with known meta-
tation genomic libraries reduce genome complexity and bolic pathways, to permit selection of candidate genes
allow for multiplexing of large numbers of indexed samples (Pflieger et al., 2001). Candidate regions to be included in
within a single sequencing lane. Genome complexity re- AM analysis are also selected based on results from previous
duction strategies are usually applied to complex genomes QTL mapping and by assessing comparative studies in
such as maize, wheat, barley, and soybean to generate related crops. Selecting candidate genes from closely related
simpler genomic DNA libraries for high-throughput se- species by synteny requires prior knowledge about the degree
quencing. These reduced representation genomic libraries of conservation of gene families. For example, there is a high
are constructed using methylation-sensitive restriction synteny between the genomes of apple and pear, as revealed
enzymes that avoid repetitive regions of genomes and can by the apple genome sequence (Velasco et al., 2010) and by
target lower copy regions with 2- to 3-fold higher efficiency studies demonstrating high transferability of markers from
(Elshire et al., 2011). Reduction of genome complexity by apple to pear (Celton et al., 2009; Gasic et al., 2009).
avoiding repetitive regions markedly simplifies computa- Therefore, significant associations identified by genome-wide
tionally challenging alignment problems in species of high AM in apples (genome has been sequenced) could be readily
levels of genetic diversity. Different restriction enzymes tested and validated in pear (genome has not yet been
could be used to digest genomic DNA of target species and sequenced) through candidate-gene AM.
to investigate polymorphism patterns within a species. One plausible method for identification and genotyping
Selection of a single or multiple restriction enzymes for of SNPs in candidate genes could be through GBS of an
a reduced representation library relies on genomic features AM population (Fig. 2). However, this method will require
of a given organism and its intended applications (Baird access to high genome coverage for reliable detection of
et al., 2008; Wu et al., 2010). For example, a high- polymorphisms for plant species where a reference genome
throughput sequencing approach of pooled DNA fragments is not available. For plant species lacking a whole genome
of a reduced representation library of two parental lines of sequence, a two-step strategy may be preferable. This will
a soybean mapping population has been used for identify- entail identifying SNPs in candidate genes using high-
ing SNPs (Wu et al., 2010). As the frequency of predicted throughput sequencing of a small set of a population, and
putative SNPs of short sequence reads has been detected at subsequently genotyping-identified SNPs in a large associa-
a low sequencing depth, a total of 39,022 putative SNPs have been identified, and validation of these SNPs has been predicted at low and high stringencies of 72% and 85%, respectively. It has been suggested that this approach is more efficient for targeting multiple QTL regions in the same genetic population, and it can be used for fine-mapping multiple QTL regions in other crops. Previously, Hyten et al. (2010) discovered between 7108 and 25,047 predicted SNPs using high-throughput resequencing of a reduced representation library in soybean. These SNPs have subsequently been used to develop a high-resolution genetic map using RILs for assembly of whole-genome shotgun sequences of soybean.

Candidate gene-based association mapping

Genome-wide AM has been useful in detecting candidate genes underlying Mendelian traits in maize, rice, and Arabidopsis, owing to relatively simple genetics and strong selection imposed on the traits, as well as the availability of large amounts of genomic resources. However, for species with rapid LD decay and fewer genomic resources, genome-wide AM is generally not suitable due to the large numbers of markers required to cover the entire genome. Thus, a candidate gene-based AM strategy may be the best option. Roughly, ~140,000, ~2 million, and ~10–15 million

tion population using SNP genotyping assays; e.g., Illumina GoldenGate.

Potential of CNV-based association mapping

In addition to DNA polymorphism, there is growing evidence for the presence of structural variations among genomes of different individuals within a species, resulting in phenotypic variations (Swanson-Wagner et al., 2010). Variations in genome structure consist of chromosomal rearrangements (inversions and translocations), segmental and gene duplications, and CNVs in the form of deletions, duplications, and insertions. These structural variations may account for phenotypic variation not captured by DNA marker-based genetic studies. Several studies have demonstrated that variations in genome structure contribute extensively to genetic variability, influencing phenotypic variation in plants, livestock, and humans (Springer et al., 2009; Liu et al., 2010; Stranger et al., 2011). Chromosomal rearrangements and CNVs in DNA contribute greatly to total structural variations within a genome. CNVs are segments of the genome that are >1 kb in length, have >90% sequence identity, and vary in copy number when compared to a reference genome. CNVs affect more nucleotides per genome than sequence variations found in SNPs (Zhang et al., 2009).
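As an illustration of the CNV definition given above (segments longer than 1 kb that vary in copy number relative to a reference), the following is a minimal Python sketch of a read-depth scan. It is not a method described in this review: the function name `call_cnv_segments`, the 500 bp window size, and the gain/loss ratio thresholds are all hypothetical choices made for illustration, and the >90% sequence identity criterion is not modelled at all.

```python
from statistics import median


def call_cnv_segments(depths, window_bp=500, min_len_bp=1000,
                      gain_ratio=1.5, loss_ratio=0.5):
    """Flag runs of windows whose depth ratio suggests a copy-number change.

    depths     -- mean read depth per fixed-size window along one chromosome
    window_bp  -- window size in base pairs (hypothetical default)
    min_len_bp -- minimum segment length; 1 kb follows the definition in the text
    gain_ratio, loss_ratio -- illustrative thresholds, not published values
    """
    baseline = median(depths)
    if baseline <= 0:
        raise ValueError("median depth must be positive")

    def state(depth):
        # Compare each window with the chromosome-wide median depth.
        ratio = depth / baseline
        if ratio >= gain_ratio:
            return "gain"
        if ratio <= loss_ratio:
            return "loss"
        return "normal"

    segments = []
    run_state, run_start = "normal", 0
    # A trailing sentinel closes any run that reaches the last window.
    for i, s in enumerate([state(d) for d in depths] + ["end"]):
        if s != run_state:
            length = (i - run_start) * window_bp
            if run_state in ("gain", "loss") and length >= min_len_bp:
                segments.append((run_start * window_bp, i * window_bp, run_state))
            run_state, run_start = s, i
    return segments


if __name__ == "__main__":
    # Toy depths: three windows at about twice the median, one isolated low window.
    toy = [30, 31, 29, 30, 60, 62, 61, 30, 29, 2, 30, 30]
    print(call_cnv_segments(toy))
```

In this toy input the three consecutive high-depth windows span 1.5 kb and are reported as a gain, whereas the single low-depth window spans only 500 bp and is dropped by the 1 kb length filter. Real read-depth callers also correct for GC content, mappability, and heterozygosity, which this sketch ignores.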
Studies in domesticated livestock and in humans suggest that CNVs contribute to important production and disease traits, either through dosage effects or via non-additive effects of allele combinations (Springer et al., 2009; Henshall et al., 2010). In addition to dosage effects, CNVs could influence gene function by positional effects and by unmasking mutant alleles of genes when the functional copy is deleted (Stankiewicz and Lupski, 2010). In contrast to currently known sequence variations, including SNPs and SSRs, levels of structural variations, especially of CNVs that likely contribute to extensive phenotypic diversity and plasticity of crops, have not been given much attention in plants. According to Springer et al. (2009), CNVs could provide yet another valuable resource beyond traditional microsatellite and SNP variations for pursuing genomic studies. Hence, investigating the role, structure, and significance of CNVs in higher plants would lead to new critical and fundamental information and knowledge on plant phenotypic outcomes. As genome-wide SNP data are generated, they can be readily reanalysed to identify CNVs, and subsequently used for AM studies as well. In recent studies in cattle and humans, statistical methodologies used for SNP association studies have also been used to identify CNVs associated with traits of interest. As CNVs often include entire genes and their regulatory regions, they are likely to play major roles in control of expression of phenotypic traits (Kato et al., 2008). Just as SNPs are used in AM studies, CNVs could also prove useful in studying phenotypic traits. Therefore, it is prudent to capitalize on CNVs both as markers for mapping efforts and as potential functional variants, i.e. analogous to functional nucleotide polymorphisms.

Often, there are two concerns in utilizing CNVs in AM studies: (1) the extent to which CNVs are inherited rather than arising as new mutations in each generation, and (2) the extent to which inherited CNVs arise from common polymorphisms compared to rare variants. In humans, studies have revealed that CNVs follow Mendelian inheritance, and ~80% of copy number differences between any two individuals appear to arise from common copy number polymorphisms (CNPs) with segregating allelic frequencies of >5%, and >90% of these CNPs are reported to arise from those segregating at an allele frequency of above 1%. Another concern over rare variants is the difficulty of identifying them, although with high-throughput sequencing, this is no longer a major concern.

When choosing perennial forest and fruit trees for CNV studies, the presence of high levels of heterozygosity reduces the power of detecting CNVs and is a valid concern. This is similar to studies on humans, wherein heterozygosity is also high, and it could be argued that the availability of higher quality reference genome(s) has contributed to successful read-depth-based CNV identification. As reference genome sequences for perennial trees are not as well refined as in human or other model species, reliable identification of CNVs requires higher genome sequence coverage. Therefore, enhancing the quality of the reference genome sequence and increasing the resequencing coverage depth could help to avoid problems related to the failure of the read-mapping algorithm to align reads to the reference sequence. However, an alternative approach would be to perform low-coverage sequencing on large numbers of accessions from an association panel and then identify the most diverse accessions, which could then be subjected to deeper genome coverage sequencing.

Genome-wide AM studies of both common and rare CNVs, as well as common SNPs, can now be pursued (McCarroll et al., 2008). Thus, CNVs can be reliably identified using the GBS method as described by Elshire et al. (2011) with high coverage depth (Fig. 2) and subsequently used as functional variations and/or as neutral markers in association mapping studies.

Conclusions

Recently, increasing numbers of genomic resources have become available for forest trees and fruit crops. Whole genome sequences of strawberry (Shulaev et al., 2011), cacao (Argout et al., 2011), apple (Velasco et al., 2010), peach (Sosinski et al., 2010), papaya (Ming et al., 2008), grape (Velasco et al., 2007), and poplar (Tuskan et al., 2006) have recently become available. Moreover, complete genome sequences of two citrus species (Gmitter, 2010), sweet orange and Clementine mandarin, will soon become available, along with genome sequences of raspberry (Price et al., 2011), coffee (Wincker et al., 2011), Chinese chestnut (Barakat et al., 2010), and blueberry (Brown et al., 2010), among others. Although many of these are merely draft sequences, high-throughput sequencing integrated with barcoded multiplexing of samples can drastically reduce sequencing and genotyping costs. All of these recent developments will aid not only in expanding genetic and genomic resources, but also in refining genome sequences of these fruit species. Ultimately, these genomes can be used as references to identify SNPs and CNVs, and will significantly limit flaws of SNP- and CNV-based genome-wide and candidate gene-based AM studies, especially in economically important perennial species.

The presence of considerable levels of synteny among many rosaceous species suggests that even for those plant species with fewer genomic resources, candidate-gene AM coupled with QTL mapping studies and comparative mapping would be feasible and highly valuable. Knowledge acquired in one species can then be extended to others. For example, establishing apple as a model by increasing and refining its genome sequence and identifying regions associated with important traits could even be used for other related species of Rosaceae and the sub-tribe Pyreae, particularly in pear, by conducting comparative genomics. This can lead to a better understanding of genome structure and polymorphism, and will also provide large resources of molecular markers for germplasm evaluation, breeding, and pursuing positional cloning efforts.
Acknowledgements

This work was partially funded by the USDA-NIFA-SCRI (grant AG 2009-51181-06023) and the University of Illinois Office of Research (projects 65-325 and 875-922). We thank Dr. Kenneth Olsen and his research group (Washington University) for critical reading of the manuscript and their helpful suggestions.

References

Abdurakhmonov IY, Abdukarimov A. 2008. Application of association mapping to understanding the genetic diversity of plant germplasm resources. International Journal of Plant Genomics 2008, 574927.

Agrama HA, Eizenga GC, Yan W. 2007. Association mapping of yield and its components in rice cultivars. Molecular Breeding 19, 341–356.

Alkan C, Kidd JM, Marques-Bonet T, et al. 2009. Personalized copy number and segmental duplication maps using next-generation sequencing. Nature Genetics 41, 1061–1067.

Altshuler D, Pollara VJ, Cowles CR, Van Etten WJ, Baldwin J, Linton L, Lander ES. 2000. An SNP map of the human genome generated by reduced representation shotgun sequencing. Nature 407, 513–516.

Aranzana MJ, Abbassi EK, Howad W, Arús P. 2010. Genetic variation, population structure and linkage disequilibrium in peach commercial varieties. BMC Genetics 11, 69.

Aranzana MJ, Kim S, Zhao KY, et al. 2005. Genome-wide association mapping in Arabidopsis identifies previously known flowering time and pathogen resistance genes. PLoS Genetics 1, 531–539.

Arai-Kichise Y, Shiwa Y, Nagasaki H, Ebana K, Yoshikawa H, Yano M, Wakasa K. 2011. Discovery of genome-wide DNA polymorphisms in a landrace cultivar of japonica rice by whole-genome sequencing. Plant and Cell Physiology 52, 274–282.

Argout X, Salse J, Aury J-M, et al. 2011. The genome of Theobroma cacao. Nature Genetics 43, 101–108.

Baird NA, Etter PD, Atwood TS, et al. 2008. Rapid SNP discovery and genetic mapping using sequenced RAD markers. PLoS One 3, e3376.

Barakat A, Addo-Quaye C, Ficklin S, Saski C, Staton M, Hebard F, Miller W, Schuster S, Carlson JE. 2010. Sequencing the genome of Chinese chestnut (Castanea mollissima) for ecosystem restoration. Abstracts of the American Society of Plant Biologists Annual Meetings, 31 July to 4 August 2010, Montreal, Canada. Abstract no. P06009.

Barnaud A, Laucou V, This P, Lacombe T, Doligez A. 2010. Linkage disequilibrium in wild French grapevine, Vitis vinifera L. subsp. silvestris. Heredity 104, 431–437.

Bernardo R. 2008. Molecular markers and selection for complex traits in plants: learning from the last 20 years. Crop Science 48, 1649–1664.

Bradbury PJ, Zhang Z, Kroon DE, Casstevens TM, Ramdoss Y, Buckler ES. 2007. TASSEL: software for association mapping of complex traits in diverse samples. Bioinformatics 23, 2633–2635.

Breseghello F, Sorrells ME. 2006a. Association analysis as a strategy for improvement of quantitative traits in plants. Crop Science 46, 1323–1330.

Breseghello F, Sorrells ME. 2006b. Association mapping of kernel size and milling quality in wheat (Triticum aestivum L.) cultivars. Genetics 172, 1165–1177.

Brown A, Colman S, Lommel S, Rowland L, Diener S, Windham E, Burke M. 2010. Construction of a blueberry (Vaccinium corymbosum) draft genomic sequence using multiple platforms. Abstracts of the Plant and Animal Genome Conference XVIII, 9–13 January 2010, San Diego, California. Abstract no. P-105.

Brown GR, Bassoni DL, Gill GP, Fontana JR, Wheeler NC, Megraw RA, Davis MF, Sewell MM, Tuskan GA, Neale DB. 2003. Identification of quantitative trait loci influencing wood property traits in loblolly pine. III. QTL verification and candidate gene mapping. Genetics 164, 1537–1546.

Buckler ES, Holland JB, Bradbury PJ, et al. 2009. The genetic architecture of maize flowering time. Science 325, 714–718.

Celton J-M, Chagné D, Tustin S, Terakami S, Nishitani C, Yamamoto T, Gardiner S. 2009. Update on comparative genome mapping between Malus and Pyrus. BMC Research Notes 2, 182–188.

Cevik V, Ryder CD, Popovich A, Manning K, King GJ, Seymour G. 2009. A FRUITFULL-like gene is associated with genetic variation for fruit flesh firmness in apple (Malus domestica Borkh.). Tree Genetics and Genomes 6, 271–279.

Chagné D, Crowhurst RN, Troggio M, et al. 2012. Genome-wide SNP detection, validation, and development of an 8K SNP array for apple. PLoS One 7, e31745.

Cockram J, White J, Zuluaga DL, et al. 2010. Genome-wide association mapping to candidate polymorphism resolution in the unsequenced barley genome. Proceedings of the National Academy of Sciences, USA 107, 21611–21616.

Dillon SK, Nolan M, Li W, Bell C, Wu HX, Southerton SG. 2010. Allelic variation in cell wall candidate genes affecting solid wood properties in natural populations and land races of Pinus radiata. Genetics 185, 1477–1487.

Dirlewanger E, Cosson P, Tavaud M, Aranzana MJ, Poizat C, Zanetto A, Arús P, Laigret F. 2002. Development of microsatellite markers in peach [Prunus persica (L.) Batsch] and their use in genetic diversity analysis in peach and sweet cherry (Prunus avium L.). Theoretical and Applied Genetics 105, 127–138.

Eckert AJ, Bower AD, Wegrzyn JL, Pande B, Jermstad KD, Krutovsky KV, Clair JBS, Neale DB. 2009. Association genetics of coastal Douglas-fir (Pseudotsuga menziesii var. menziesii, Pinaceae). I. Cold-hardiness related traits. Genetics 182, 1289–1302.

Eckert AJ, van Heerwaarden J, Wegrzyn JL, Nelson CD, Ross-Ibarra J, González-Martínez SC, Neale DB. 2010. Patterns of population structure and environmental associations to aridity across the range of loblolly pine (Pinus taeda L., Pinaceae). Genetics 185, 969–982.

Ehrenreich IM, Hanzawa Y, Chou L, Roe JL, Kover PX, Purugganan MD. 2009. Candidate gene association mapping of Arabidopsis flowering time. Genetics 183, 325–335.
Elshire RJ, Glaubitz JC, Sun Q, et al. 2011. A robust, simple genotyping-by-sequencing (GBS) approach for high diversity species. PLoS One 6, e19379.

Emanuelli F, Battilana J, Costantini L, Cunff LL, Boursiquot J-M, This P, Grando MS. 2010. A candidate gene association study on muscat flavor in grapevine (Vitis vinifera L.). BMC Plant Biology 10, 241.

Ersoz ES, Wright MH, González-Martínez SC, Langley CH, Neale DB. 2010. Evolution of disease response genes in loblolly pine: insights from candidate genes. PLoS One 5, e14234.

Ersoz ES, Yu J, Buckler ES. 2007. Applications of linkage disequilibrium and association mapping in crop plants. In: RK Varshney, R Tuberosa, eds, Genomics-assisted crop improvement: vol. 1 Genomics approaches and platforms. The Netherlands: Springer, pp. 97–119.

Fournier-Level A, Lacombe T, Le Cunff L, Boursiquot JM, This P. 2010. Evolution of the VvMybA gene family, the major determinant of berry colour in cultivated grapevine (Vitis vinifera L.). Heredity 104, 351–362.

Gasic K, Han Y, Kertbundit S, Shulaev V, Iezzoni A, Stover E, Bell R, Wisniewski M, Korban S. 2009. Characteristics and transferability of new apple EST-derived SSRs to other Rosaceae species. Molecular Breeding 23, 397–411.

Gmitter FG. 2010. The haploid mandarin and diploid sweet orange genome sequences. Abstracts of the Plant and Animal Genome Conference XVIII, 9–13 January 2010, San Diego, California. Abstract no. w-146.

González-Martínez SC, Huber DA, Ersoz E, Davis JM, Neale DB. 2008. Association genetics in Pinus taeda L. II. Carbon isotope discrimination. Heredity 101, 19–26.

González-Martínez SC, Wheeler NC, Ersoz E, Nelson CD, Neale DB. 2007. Association genetics in Pinus taeda L. I. Wood property traits. Genetics 175, 399–409.

Gupta PK, Rustgi S, Kulwal PL. 2005. Linkage disequilibrium and association studies in higher plants: present status and future prospects. Plant Molecular Biology 57, 461–485.

Hall D, Luquez V, Garcia MV, St Onge KR, Jansson S, Ingvarsson PK. 2007. Adaptive population differentiation in phenology across a latitudinal gradient in European aspen (Populus tremula L.): a comparison of neutral markers, candidate genes and phenotypic traits. Evolution 61, 2849–2860.

Henshall JM, Whan VA, Norris BJ. 2010. Reconstructing CNV genotypes using segregation analysis: combining pedigree information with CNV assay. Genetics, Selection, Evolution 42, 34.

Heuertz M, De Paoli E, Källman T, Larsson H, Jurman I, Morgante M, Lascoux M, Gyllenstrand N. 2006. Multilocus patterns of nucleotide diversity, linkage disequilibrium and demographic history of Norway spruce [Picea abies (L.) Karst]. Genetics 174, 2095–2105.

Houel C, Bounon R, Chaïb J, et al. 2010. Patterns of sequence polymorphism in the fleshless berry locus in cultivated and wild Vitis vinifera accessions. BMC Plant Biology 10, 284.

Huang X, Feng Q, Qian Q, et al. 2009. High-throughput genotyping by whole-genome resequencing. Genome Research 19, 1068–1076.

Huang X, Wei X, Sang T, et al. 2010. Genome-wide association studies of 14 agronomic traits in rice landraces. Nature Genetics 42, 961–967.

Hyten DL, Cannon SB, Song Q, Weeks N, Fickus EW, Shoemaker RC, Specht JE, Farmer AD, May GD, Cregan PB. 2010. High-throughput SNP discovery through deep resequencing of a reduced representation library to anchor and orient scaffolds in the soybean whole genome sequence. BMC Genomics 11, 38.

Ingvarsson PK. 2005. Nucleotide polymorphism and linkage disequilibrium within and among natural populations of European aspen (Populus tremula L., Salicaceae). Genetics 169, 945–953.

Ingvarsson PK, Garcia MV, Hall D, Luquez V, Jansson S. 2006. Clinal variation in phyB2, a candidate gene for day-length-induced growth cessation and bud set, across a latitudinal gradient in European aspen (Populus tremula). Genetics 172, 1845–1853.

Ingvarsson PK, Garcia MV, Luquez V, Hall D, Jansson S. 2008. Nucleotide polymorphism and phenotypic associations within and around the phytochrome B2 locus in European aspen (Populus tremula, Salicaceae). Genetics 178, 2217–2226.

Kato M, Nakamura Y, Tsunoda T. 2008. MOCSphaser: a haplotype inference tool from a mixture of copy number variation and single nucleotide polymorphism data. Bioinformatics 24, 1645–1646.

Khan MA, Han Y, Korban SS. 2012. A high-throughput apple SNP genotyping assay using the GoldenGate platform. Gene 494, 196–201.

Kim S, Plagnol V, Hu TT, Toomajian C, Clark RM, Ossowski S, Ecker JR, Weigel D, Nordborg M. 2007. Recombination and linkage disequilibrium in Arabidopsis thaliana. Nature Genetics 39, 1151–1155.

Komulainen P, Brown GR, Mikkonen M, Karhu A, Garcia-Gil MR, O'Malley D, Lee B, Neale DB, Savolainen O. 2003. Comparing EST-based genetic maps between Pinus sylvestris and P. taeda. Theoretical and Applied Genetics 107, 667–678.

Kover PX, Valdar W, Trakalo J, Scarcelli N, Ehrenreich IM, Purugganan MD, Durrant C, Mott R. 2009. A multiparent advanced generation inter-cross to fine-map quantitative traits in Arabidopsis thaliana. PLoS Genetics 5, e1000551.

Krutovsky KV, Neale DB. 2005. Nucleotide diversity and linkage disequilibrium in cold-hardiness- and wood quality-related candidate genes in Douglas fir. Genetics 171, 2029–2041.

Liu GE, Hou Y, Zhu B, et al. 2010. Analysis of copy number variations among diverse cattle breeds. Genome Research 20, 693–703.

Maccaferri M, Sanguineti MC, Demontis A, et al. 2011. Association mapping in durum wheat grown across a broad range of water regimes. Journal of Experimental Botany 62, 409–438.

McCarroll SA, Kuruvilla FG, Korn JM, et al. 2008. Integrated detection and population-genetic analysis of SNPs and copy number variation. Nature Genetics 40, 1166–1174.

McMullen MD, Kresovich S, Villeda HS, et al. 2009. Genetic properties of the maize nested association mapping population. Science 325, 737–740.

Ming R, Hou S, Feng Y, et al. 2008. The draft genome of the transgenic tropical fruit tree papaya (Carica papaya L.). Nature 452, 991–996.
Murray SC, Rooney WL, Hamblin MT, Mitchell SE, Kresovich S. 2009. Sweet sorghum genetic diversity and association mapping for brix and height. The Plant Genome 2, 48–62.

Myles S, Boyko AR, Owens CL, et al. 2011. Genetic structure and domestication history of the grape. Proceedings of the National Academy of Sciences, USA 108, 3457–3458.

Myles S, Peiffer J, Brown PJ, Ersoz ES, Zhang Z, Costich DE, Buckler ES. 2009. Association mapping: critical considerations shift from genotyping to experimental design. The Plant Cell 21, 2194–2202.

Neale DB. 2007. Genomics to tree breeding and forest health. Current Opinion in Genetics & Development 17, 539–544.

Neale DB, Ingvarsson PK. 2008. Population, quantitative and comparative genomics of adaptation in forest trees. Current Opinion in Plant Biology 11, 149–155.

Neale DB, Kremer A. 2011. Forest tree genomics: growing resources and applications. Nature Reviews Genetics 12, 111–122.

Neale DB, Savolainen O. 2004. Association genetics of complex traits in conifers. Trends in Plant Science 9, 325–330.

Olsen KM, Halldorsdottir SS, Stinchcombe JR, et al. 2004. Linkage disequilibrium mapping of Arabidopsis CRY2 flowering time alleles. Genetics 167, 1361–1369.

Paterson AH, Lander ES, Hewitt JD, Peterson S, Lincoln SE, Tanksley SD. 1988. Resolution of quantitative traits into Mendelian factors by using a complete linkage map of restriction fragment length polymorphisms. Nature 335, 721–726.

Pavy N, Paule S, Parsons L, et al. 2005. Generation, annotation, analysis and database integration of 16,500 white spruce EST clusters. BMC Genomics 6, 144.

Pavy N, Pegas B, Beauseigle S, et al. 2008. Enhancing genetic mapping of complex genomes through the design of highly-multiplexed SNP arrays: application to the large and unsequenced genomes of white spruce and black spruce. BMC Genomics 9, 21.

Petit RJ, Hampe A. 2006. Some evolutionary consequences of being a tree. Annual Review of Ecology, Evolution, and Systematics 37, 187–214.

Pflieger S, Lefebvre V, Causse M. 2001. The candidate gene approach in plant genetics: a review. Molecular Breeding 7, 275–291.

Price JC, Ward JA, Clement MJ, Weber CA, Lewers KS, Hagen W, Haynes B, Swanson J-D, Udall JA. 2011. Whole genome sequencing of the highly heterozygous diploid red raspberry, Rubus idaeus cv Heritage. Abstracts of the Plant and Animal Genome XIX Conference. 15–19 January 2012, San Diego, California. Abstract no. w-251.

Pritchard JK, Stephens M, Donnelly P. 2000. Inference of population structure using multilocus genotype data. Genetics 155, 945–959.

Quesada T, Gopal V, Cumbie WP, Eckert AJ, Wegrzyn JL, Neale DB, Goldfarb B, Huber DA, Casella G, Davis JM. 2010. Association mapping of quantitative disease resistance in a natural population of loblolly pine (Pinus taeda L.). Genetics 186, 677–686.

Ravel C, Praud S, Murigneux A, Linossier L, Dardevet M, Balfourier F, Dufour P, Brunel D, Charmet G. 2006. Identification of Glu-B1-1 as a candidate gene for the quantity of high-molecular-weight glutenin in bread wheat (Triticum aestivum L.) by means of an association study. Theoretical and Applied Genetics 112, 738–743.

Rikkerink EHA, Oraguzie NC, Gardiner SE. 2007. Prospects of association mapping in perennial horticultural crops. In: NC Oraguzie, EHA Rikkerink, SE Gardiner, HN Silva, eds, Association mapping in plants. New York: Springer, pp. 249–269.

Rungis D, Hamberger B, Bérubé Y, Wilkin J, Bohlmann J, Ritland K. 2005. Efficient genetic mapping of single nucleotide polymorphisms based upon DNA mismatch digestion. Molecular Breeding 16, 261–270.

Savolainen O, Pyhäjärvi T. 2007. Genomic diversity in forest trees. Current Opinion in Plant Biology 10, 162–167.

Sax K. 1923. The association of size differences with seed-coat pattern and pigmentation in Phaseolus vulgaris. Genetics 8, 552–560.

Shulaev V, Korban SS, Sosinski B, et al. 2008. Multiple models for Rosaceae genomics. Plant Physiology 147, 985–1003.

Shulaev V, Sargent DJ, Crowhurst RN, et al. 2011. The genome of woodland strawberry (Fragaria vesca). Nature Genetics 43, 109–116.

Sorkheh K, Malysheva-Otto LV, Wirthensohn MG, Tarkesh-Esfahani S, Martínez-Gómez P. 2008. Linkage disequilibrium, genetic association mapping and gene localization in crop plants. Genetics and Molecular Biology 31, 805–814.

Sosinski B, Verde I, Morgante M, Rokhsar D. 2010. The international peach genome initiative. A first draft of the peach genome sequence and its use for genetic diversity analysis in peach. Abstracts of the 5th International Rosaceae Genomics Conference. November 2010, Stellenbosch, South Africa. O46.

Springer NM, Ying K, Fu Y, et al. 2009. Maize inbreds exhibit high levels of copy number variation (CNV) and presence/absence variation (PAV) in genome content. PLoS Genetics 5, e1000734.

Stankiewicz P, Lupski JR. 2010. Structural variation in the human genome and its role in disease. Annual Review of Medicine 61, 437–455.

Stracke S, Haseneyer G, Veyrieras JB, Geiger HH, Sauer S, Graner A, Piepho HP. 2009. Association mapping reveals gene action and interactions in the determination of flowering time in barley. Theoretical and Applied Genetics 118, 259–273.

Stranger BE, Stahl EA, Raj T. 2011. Progress and promise of genome-wide association studies for human complex trait genetics. Genetics 187, 367–383.

Su Z, Li X, Hao Z, et al. 2011. Association analysis of the nced and rab28 genes with phenotypic traits under water stress in maize. Plant Molecular Biology Reporter 29, 714–722.

Swanson-Wagner RA, Eichten SR, Kumari S, Tiffin P, Stein JC, Ware D, Springer NM. 2010. Pervasive gene content variation and copy number variation in maize and its undomesticated progenitor. Genome Research 20, 1689–1699.

Thornsberry JM, Goodman MM, Doebley J, Kresovich S, Nielsen D, Buckler ES. 2001. Dwarf8 polymorphisms associate with variation in flowering time. Nature Genetics 28, 286–289.

Thumma BR, Matheson BA, Zhang D, Meeske C, Meder R, Downes GM, Southerton SG. 2009. Identification of a cis-acting regulatory polymorphism in a eucalypt cobra-like gene affecting cellulose content. Genetics 183, 1153–1164.

Thumma BR, Nolan MF, Evans R, Moran GF. 2005. Polymorphisms in cinnamoyl CoA reductase (CCR) are associated with
variation in microfibril angle in Eucalyptus spp. Genetics 171, 1257–1265.

Tommasini L, Schnurbusch T, Fossati D, Mascher F, Keller B. 2007. Association mapping of Stagonospora nodorum blotch resistance in modern European winter wheat varieties. Theoretical and Applied Genetics 115, 697–708.

Tuskan GA, DiFazio S, Jansson S, et al. 2006. The genome of black cottonwood, Populus trichocarpa (Torr. & Gray). Science 313, 1596–1604.

Velasco R, Zharkikh A, Affourtit J, et al. 2010. The genome of the domesticated apple (Malus × domestica Borkh.). Nature Genetics 42, 833–839.

Velasco R, Zharkikh A, Troggio M, et al. 2007. A high quality draft consensus sequence of the genome of a heterozygous grapevine variety. PLoS One 2, e1326.

Wincker P, Albert VA, Andrade AA, et al. 2011. Sequencing the coffee genome. Abstracts of the Plant and Animal Genome Conference XIX, 15–19 January 2011, San Diego, California. Abstract no. w-152.

Wu X, Ren C, Joshi T, Vuong T, Xu D, Nguyen H. 2010. SNP discovery by high-throughput sequencing in soybean. BMC Genomics 11, 469.

Zhang Z, Ersoz E, Lai CQ, et al. 2010. Mixed linear model approach adapted for genome-wide association studies. Nature Genetics 42, 355–360.

Zhang F, Gu W, Hurles ME, Lupski JR. 2009. Copy number variation in human health, disease, and evolution. Annual Review of Genomics and Human Genetics 10, 451–481.

Zhu C, Gore M, Buckler ES, Yu J. 2008. Status and prospects of association mapping in plants. The Plant Genome 1, 5–20.
