Marcos Malosetti & Fred van Eeuwijk

Introduction to mixed model QTL mapping using GenStat


13-15th December 2010, ESALQ Piracicaba, São Paulo, Brazil
QTL mapping
Objective of QTL mapping studies:
describe phenotypes in relation to underlying genetic
factors (called QTL)
QTL = Quantitative Trait Locus
Finding statistical association between
information at the DNA level (molecular markers)
and phenotypic variation
Linkage analysis: the common approach (conventional QTL mapping)
LD mapping or association mapping: a more recent approach
Linkage and linkage disequilibrium
Statistical association between markers and
phenotypes found when there is linkage
(disequilibrium) between markers and QTLs.
Linkage disequilibrium (LD): non-random
association of alleles at two loci
not necessarily on the same chromosome
Linkage: non-random association of alleles at two
loci due to limited recombination between the loci
Linkage necessarily involves loci on the same
chromosome
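The definition of LD as a non-random association of alleles can be illustrated numerically. The sketch below is not from the slides: it computes the usual disequilibrium coefficient D and the squared correlation r² for two biallelic loci from a list of haplotypes. The function name and the example haplotypes are hypothetical.

```python
def ld_statistics(haplotypes):
    """Return (D, r2) for two biallelic loci.

    haplotypes: list of (a, b) pairs, one pair per haplotype,
    with the allele at each locus coded 0 or 1.
    """
    n = len(haplotypes)
    p_a = sum(a for a, _ in haplotypes) / n   # frequency of allele 1 at locus A
    p_b = sum(b for _, b in haplotypes) / n   # frequency of allele 1 at locus B
    p_ab = sum(1 for a, b in haplotypes if a == 1 and b == 1) / n
    d = p_ab - p_a * p_b                      # deviation from random association
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    r2 = d * d / denom if denom > 0 else float("nan")
    return d, r2


# Hypothetical samples: alleles always co-occur vs. alleles combine at random
complete_ld = [(1, 1)] * 5 + [(0, 0)] * 5
equilibrium = [(1, 1), (1, 0), (0, 1), (0, 0)] * 3

d_full, r2_full = ld_statistics(complete_ld)
d_none, r2_none = ld_statistics(equilibrium)
```

Note that the calculation never uses the map positions of the loci, which is why LD can be observed between loci on different chromosomes.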
Conventional QTL mapping versus LD mapping
Both linkage analysis and LD mapping rely on linkage disequilibrium to detect QTLs.
Designed crosses: LD between marker and QTL is only a consequence of linkage.
Association panel: LD between marker and QTL can be a consequence of linkage, but other factors, such as population admixture, can also cause LD.
Genetic relatedness / population substructure
All genotypes of a segregating population have by
expectation equal relatedness/correlation
In association mapping panels the genotypes
show heterogeneous genetic relatedness:
Unstructured: coefficient of coancestry $\theta_{ij}$

Structured genetic relatedness (population
substructure)
Not accounting for relatedness (coancestry or
population substructure) will cause spurious
associations
LD mapping requires more elaborate modelling of the
genetic relatedness in the population
Association panel: a set of interconnected genotypes with known or unknown population history
[Figure: relationship matrix K for the association panel]

Genetic correlation between individuals: unstructured
K = kinship matrix (coefficients of coancestry), derived from pedigree or markers
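As a rough illustration of deriving a relationship matrix from markers, the sketch below computes an average allele-sharing (identity-by-state) similarity between genotypes. This is an assumed, simplified stand-in: the slides do not say which kinship estimator is used, and allele sharing is only a proxy for the coefficient of coancestry. The function name and marker scores are hypothetical.

```python
def ibs_similarity(genotypes):
    """Average allele-sharing similarity between all pairs of genotypes.

    genotypes: list of rows, one per genotype; each row holds the number
    of copies (0, 1 or 2) of a reference allele at each marker.
    """
    n = len(genotypes)
    m = len(genotypes[0])
    k = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i, n):
            # Per marker: 1 if both alleles are shared, 0.5 if one, 0 if none
            shared = sum(1 - abs(a - b) / 2
                         for a, b in zip(genotypes[i], genotypes[j]))
            k[i][j] = k[j][i] = shared / m
    return k


# Hypothetical scores for three genotypes at four markers
scores = [
    [2, 2, 0, 0],
    [2, 2, 0, 2],
    [0, 0, 2, 2],
]
K = ibs_similarity(scores)
```

In this toy example the first two genotypes share most alleles (similarity 0.75), while the first and third share none.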
Genetic correlation between individuals: structured
Identify sets of more or less
homogeneous genotypes
Quantifying genetic relatedness / Structuring VCOV(G)
Bayesian clustering: STRUCTURE (Pritchard et al. 2000)
It can be computationally intensive
Population assumptions are not always compatible with plant populations (it was mainly developed for human population genetics)
It is not always obvious how to define the model to use

[Figure: STRUCTURE bar plot. Vertical axis: ancestry (0% to 100%); horizontal axis: genotype; groups: EMed, Turk, SWMed, NMed2, NMed6]
Quantifying genetic relatedness / Structuring VCOV(G)
Classical multivariate approaches
Simple, fast
Results similar to those from STRUCTURE
Where to define boundaries between groups?
Other criteria (e.g. geographical origin)
Eigenanalysis
Eigenanalysis
PCA on the genotype x marker scores matrix, with a
formal test for the number of axes (dimensions)
No discrete groups; instead, a set of PCs is used as
covariates in marker-trait association analysis
When PCs are introduced in the random part of a
mixed model, they approximate the full
genetic relationship matrix
Straightforward, simple, and easy to program in
a conventional statistical package
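To show how little code the basic eigenanalysis needs, here is a minimal sketch that extracts the scores on the first principal component of a centred genotype x marker matrix using power iteration. It is an illustration under assumptions, not the GenStat implementation: it computes only one axis and omits the formal test for the number of axes. The function name and data are hypothetical.

```python
import math


def first_pc_scores(matrix, iterations=200):
    """Scores of each genotype on the leading principal component."""
    n, m = len(matrix), len(matrix[0])
    # Centre each marker column on its mean
    means = [sum(row[j] for row in matrix) / n for j in range(m)]
    centred = [[row[j] - means[j] for j in range(m)] for row in matrix]
    # Genotype-by-genotype cross-product matrix (n x n)
    gram = [[sum(a * b for a, b in zip(r1, r2)) for r2 in centred]
            for r1 in centred]
    vec = [float(i + 1) for i in range(n)]   # arbitrary starting vector
    for _ in range(iterations):
        new = [sum(g * v for g, v in zip(row, vec)) for row in gram]
        norm = math.sqrt(sum(x * x for x in new))
        vec = [x / norm for x in new]
    # Rayleigh quotient gives the leading eigenvalue for the unit vector
    eigenvalue = sum(v * sum(g * w for g, w in zip(row, vec))
                     for v, row in zip(vec, gram))
    return [v * math.sqrt(eigenvalue) for v in vec]


# Hypothetical marker scores: genotypes 1-2 and 3-4 form two groups
marker_scores = [
    [2, 2, 0, 0],
    [2, 2, 0, 2],
    [0, 0, 2, 2],
    [0, 0, 2, 0],
]
pc1 = first_pc_scores(marker_scores)
```

Genotypes from the same group receive scores of the same sign on this axis, so the scores can stand in for group membership as continuous covariates.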
Mixed models and LD mapping in GenStat
LD mapping models should accommodate
the complex genetic relationships in the
population.
Mixed models are particularly suitable
(GenStat).
Suite of GenStat procedures developed to
run different models for LD mapping.
Procedures can be run from the GUI.
A mixed model for LD mapping
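P = genotype + error
$$P_i = \mu + G_i + \varepsilon_i$$
$$G_i \sim N(0, \sigma^2_{\text{genotype}}), \qquad \varepsilon_i \sim N(0, \sigma^2)$$
P = marker + genotype* + error
$$P_i = \mu + x_i \alpha + G_i + \varepsilon_i$$
$$x_i = \begin{cases} 1 & \text{if MM} \\ 0 & \text{if Mm} \\ -1 & \text{if mm} \end{cases}$$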
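The marker term can be made concrete with a small sketch. It assumes the additive coding 1, 0, -1 for marker genotypes MM, Mm, mm and estimates the marker effect by ordinary least squares. This deliberately drops the random genotype term, so it is a fixed-effects simplification rather than the mixed model itself. The names and data are hypothetical.

```python
# Assumed additive coding of the marker genotypes
CODING = {"MM": 1, "Mm": 0, "mm": -1}


def additive_effect(marker_genotypes, phenotypes):
    """Least-squares estimates (mu, alpha) for P = mu + x * alpha + error."""
    x = [CODING[g] for g in marker_genotypes]
    n = len(x)
    x_bar = sum(x) / n
    p_bar = sum(phenotypes) / n
    sxy = sum((xi - x_bar) * (pi - p_bar) for xi, pi in zip(x, phenotypes))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    alpha = sxy / sxx              # additive marker effect
    mu = p_bar - alpha * x_bar     # intercept
    return mu, alpha


# Hypothetical data: each copy of allele M adds one unit to the phenotype
markers = ["MM", "MM", "Mm", "Mm", "mm", "mm"]
heights = [12.0, 12.0, 10.0, 10.0, 8.0, 8.0]
mu_hat, alpha_hat = additive_effect(markers, heights)
```

With this coding the estimated effect is half the difference between the two homozygote means.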
A naive mixed model for LD mapping
This model assumes UNRELATED genotypes
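Standard assumption:
$$G_i \sim N(0, \sigma^2_{\text{genotype}}), \qquad \varepsilon_i \sim N(0, \sigma^2)$$
Relationship matrix K = I
$$G \sim N(0, I\sigma^2_{\text{genotype}})$$
This model ignores genetic relatedness / population structure
$$K = \begin{pmatrix} 1 & 0 & 0 & \cdots & 0 \\ & 1 & 0 & \cdots & 0 \\ & & 1 & \cdots & 0 \\ & & & \ddots & \vdots \\ & & & & 1 \end{pmatrix}$$
P = marker + genotype* + error
$$P_i = \mu + x_i \alpha + G_i + \varepsilon_i$$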
K should be in the model to correct for relatedness
Now the relationship matrix (K) is in the model
K = kinship matrix derived from pedigree/marker
information
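Change model assumption:
$$G \sim N(0, 2K\sigma^2_{g}), \qquad \varepsilon_i \sim N(0, \sigma^2)$$
Relationship matrix K ≠ I
$$K = \begin{pmatrix} \theta_{11} & \theta_{12} & \theta_{13} & \cdots & \theta_{1I} \\ & \theta_{22} & \theta_{23} & \cdots & \theta_{2I} \\ & & \theta_{33} & \cdots & \theta_{3I} \\ & & & \ddots & \vdots \\ & & & & \theta_{II} \end{pmatrix}$$
P = marker + genotype* + error
$$P_i = \mu + x_i \alpha + G_i + \varepsilon_i$$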
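It may help to write out what this assumption implies for the phenotypes. The following is a short worked step, assuming that the genotypic effects and the errors are independent and that the marker term is fixed; the symbols follow the model above, with $\theta_{ij}$ the entries of K.

```latex
\operatorname{Var}(P) = 2K\sigma^2_g + I\sigma^2,
\qquad
\operatorname{Cov}(P_i, P_j) = 2\theta_{ij}\,\sigma^2_g \quad (i \neq j),
\qquad
\operatorname{Var}(P_i) = 2\theta_{ii}\,\sigma^2_g + \sigma^2
```

Phenotypes of closely related genotypes (large $\theta_{ij}$) are therefore expected to be more alike, and the marker effect is tested against this correlated background rather than against independent residuals.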
LD mapping using eigenanalysis
The PC scores represent relatedness / population
structure
PCs impose approximate covariance structure
Computationally less intensive than full structuring of
VCOV(G)
$$C \sim N(0, \sigma^2_{\text{scores}}), \qquad G \sim N(0, \sigma^2_{\text{genotype}}), \qquad \varepsilon \sim N(0, \sigma^2)$$
P = marker + PCs + genotype* + error
$$P_i = \mu + x_i \alpha + \sum_{m=1}^{M} C_{i,m} + G_i + \varepsilon_i$$

Population structure
This model imposes a common covariance between genotypes within
a group
Genotypes from different groups are still assumed unrelated
Groups from STRUCTURE or clustering
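$$C_k \sim N(0, \sigma^2_{\text{group}}), \qquad G \sim N(0, \sigma^2_{\text{genotype}}), \qquad \varepsilon \sim N(0, \sigma^2)$$
Relationship matrix K ≠ I (block structure: Group 1, Group 2)
P = marker + group + genotype + error
$$P_i = \mu + x_i \alpha + C_k + G_{i(k)} + \varepsilon_i$$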
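The covariance structure implied by a random group term can be sketched as follows: genotypes in the same group share the group variance as a common covariance, and genotypes in different groups have zero covariance. The function name, group labels and variance values are hypothetical, and the random terms are assumed independent.

```python
def structured_covariance(groups, var_group, var_genotype, var_error):
    """Phenotypic covariance matrix under a random group effect.

    groups: group label of each genotype (e.g. from STRUCTURE or clustering).
    """
    n = len(groups)
    v = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if groups[i] == groups[j]:
                v[i][j] += var_group                  # shared group effect
            if i == j:
                v[i][j] += var_genotype + var_error   # genotype-specific terms
    return v


# Hypothetical panel: two genotypes in group 1, two in group 2
V = structured_covariance([1, 1, 2, 2],
                          var_group=2.0, var_genotype=1.0, var_error=0.5)
```

The resulting matrix is block diagonal, with one block per group.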
LD mapping in GenStat 13
Correcting for genetic relatedness: kinship vs null
Correcting for genetic relatedness: PC scores vs null
LD decay plots
No correction: response marker = predictor marker + error
Correction for population structure: response marker = PC scores / groups + predictor marker + error
LD image plots
Panels: no correction; correction for population structure
Genetic relatedness in segregating populations
Segregating populations
No selection
No mutation
No genetic drift
Simple model does not
work
More complex residual
structure
Cross: Parent 1 x Parent 2, F1, 100-1000 offspring
Modelling genetic relatedness for QTL detection
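$$y_i = \mu + x_i \alpha + G^*_i + \varepsilon_i$$
$y_i$: quantitative trait response
$x_i$: genotypic covariable (marker information)
$\alpha$: additive marker effect
Residual random variation consists of:
Genetic residual with a correlation structure: $G^* \sim N(0, 2K\sigma^2_{G^*})$
Relationship matrix:
$$\Sigma = 2\sigma^2_{G^*} \begin{pmatrix} \theta_{11} & \theta_{12} & \theta_{13} & \cdots & \theta_{1I} \\ & \theta_{22} & \theta_{23} & \cdots & \theta_{2I} \\ & & \theta_{33} & \cdots & \theta_{3I} \\ & & & \ddots & \vdots \\ & & & & \theta_{II} \end{pmatrix}$$
$\theta_{ij}$: coefficient of coancestry between genotypes i and j
Standard independent residual (experimental error): $\varepsilon_i \sim N(0, \sigma^2_{\varepsilon})$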
Chi-square test for segregation distortions
Allele frequencies show deviations from expectation
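A chi-square test for segregation distortion compares observed genotype counts at a marker with the counts expected under Mendelian segregation. The sketch below assumes a two-class marker with an expected 1:1 ratio (for example a backcross or doubled haploid population; the slides do not state the population type). The function name and counts are hypothetical.

```python
def segregation_chi_square(observed, expected_ratio):
    """Chi-square statistic for observed counts against an expected ratio."""
    total = sum(observed)
    ratio_sum = sum(expected_ratio)
    expected = [total * r / ratio_sum for r in expected_ratio]
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


# 5% critical value of the chi-square distribution with 1 degree of freedom
CRITICAL_5PCT_DF1 = 3.841

# Hypothetical counts of the two marker classes in 100 offspring
stat_ok = segregation_chi_square([52, 48], [1, 1])
stat_distorted = segregation_chi_square([70, 30], [1, 1])

distorted_ok = stat_ok > CRITICAL_5PCT_DF1
distorted_bad = stat_distorted > CRITICAL_5PCT_DF1
```

A marker whose statistic exceeds the critical value shows allele frequencies that deviate from expectation. In a genome-wide scan many markers are tested, so a correction for multiple testing would normally be needed.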
Genome-wide scan: plant height
Model including genetic relatedness (kinship information)
Model ignoring relatedness (kinship information)
Conclusions
The study of genetic relatedness is crucial in LD mapping:
Kinship
Eigenanalysis
Clustering methods (including STRUCTURE)
Need to control for genetic relatedness when assessing:
Marker-marker association (LD decay)
Marker-trait association (LD mapping)
GenStat procedures / GUI can be used to run all these types of analyses
