Vous êtes sur la page 1sur 40

12th National Convention on Statistics (NCS)

EDSA Shangri-La Hotel, Mandaluyong City


October 1-2, 2013

PLANT BREEDING TOOLS: SOFTWARE FOR PLANT BREEDERS


by
Nellwyn Sales, Violeta Bartolome, Alexander Cañeda,
Alaine Gulles, Rose Imee Zhella Morantte, Leilani Nora,
Angel Manica Raquel, Christoffer Edd Relente,
Darwin Talay and Guoyou Ye

For additional information, please contact:

Author’s name: Nellwyn Sales


Designation: Specialist
Affiliation: International Rice Research Institute
Address: College, Los Baños, Laguna
Tel. no.: 5362701 loc 2238
E-mail: n.sales@irri.org

Co-author’s name Violeta Bartolome, Alexander Cañeda, Alaine Gulles, Rose Imee
Zhella Morantte, Leilani Nora, Angel Manica Raquel, Christoffer
Edd Relente, Darwin Talay and Guoyou
Designation: Senior Associate Scientist - Biometrics, Senior Specialist -
Software Engineering, Assistant Scientist, Specialist, Assistant
Scientist, Programmer, Programmer, Programmer and Senior
Scientist - Breeding Informatics Specialist
Affiliation: International Rice Research Institute
Address: College, Los Baños, Laguna
Tel. no.: 5362701 loc 2238
E-mail: v.bartolome@irri.org, a.caneda@irri.org, a.gulles@irri.org,
r.morantte@irri.org, l.nora@irri.org, a.raquel@irri.org,
c.relente@irri.org, d.talay@irri.org and g.ye@irri.org
PBTools: Software for Plant Breeders

Nellwyn Sales, Violeta Bartolome, Alexander Cañeda, Darwin Talay, Alaine Gulles,
Rose Imee Zhella Morantte, Leilani Nora, Angel Manica Raquel,
Christoffer Edd Relente and Guoyou Ye

Abstract

Data from plant breeding trials need to be analyzed properly, with greater speed
to support selection decision making. Genetic information should also be derived from
breeding trials to determine more efficient breeding strategies. Although general
statistical software can be used to analyze breeding trials, many practical breeders are
seeking easy-to-use analytical tools.

PBTools is a free statistical application developed primarily for plant breeders. It


was created using the Eclipse Rich Client Platform (RCP) and R language. It has a user-
friendly graphical user interface (GUI). Its current version provides modules for the
following: data management in spreadsheet view, randomization for commonly used
experimental designs, single- and multiple-environment analysis, QTL analysis, selection
index, commonly used mating designs, and generation mean analysis, some of which
are presented in the paper.

Keywords: statistical software, plant breeding

Introduction

Plant breeding research generally involves use of statistical techniques to


determine valid conclusions which aid in selection decision making. Users depend on
statistical software to generate reliable results quickly. However, most software are
developed for general statistical analysis only and are costly. Some are freeware but
have limited functionality or are difficult to use.

R is a language and an environment for statistical computing and graphics. Many


classical and modern statistical methods have been well implemented in R, mainly
through hundreds of add-on packages contributed by leading statisticians. New add-on
packages can be rapidly developed by exploring the functions of existing packages. R
has become increasingly popular. However, some users find its command line interface
challenging. Software with a GUI involving menus, dialog boxes, and spreadsheets is
generally more preferred.

Plant Breeding Tools (PBTools) is a free statistical application created using the
Eclipse Rich Client Platform (RCP), a platform for building and deploying rich client
applications, and R language. It has been developed to assist plant breeders in the
design and analysis of data. It has an easy to navigate GUI that does not require users
to have programming skills to perform data manipulation and analysis.

Its current version provides modules for data management in spreadsheet view,
randomization for commonly used experimental designs, single- and multiple-
environment analysis, QTL analysis, selection index, commonly used mating designs,
and generation mean analysis.
An introduction of the features of PBTools is provided in this paper.

PBTools Environment

Main Window

The PBTools main window (Figure 1) has a menu bar which houses five items:
Project, Data, Analysis, Randomization, and Help. The Project menu contains functions
for creating and managing projects. The Data menu contains functions for reading,
managing and manipulating datasets. The Analysis Menu contains functions to perform
statistical analysis. The Randomization Menu contains functions for generating random
assignment of factor levels for commonly used experimental designs in plant breeding.
Finally, the Help Menu is used to access PBTools’ user’s manual and some information
about the software.

Figure 1. PBTools Main Window.

Figure 2 presents the submenu items under the Data, Analysis, and
Randomization menus.

2
Figure 2. Submenu items for the Data, Analysis, and Randomization menus.

The PBTools main window is divided into two panels: the Project Explorer panel
and the Editor panel. The Project Explorer panel functions as a file manager of the active
project, where names of data files and analysis results files are displayed in tree form,
while the Editor panel serves as viewer for selected data (by means of the Data Viewer
tab) and/or results of analysis (via the Results Viewer tab).

When the name of a data file is selected by double-clicking in the Project


Explorer tree, the file is displayed in a Data Viewer tab in spreadsheet form in the Editor
panel (Figure 3). Data values can be edited in the Data Viewer. Data manipulation can
also be performed here using options available in a toolbar or the submenu items under
the Data menu.

Figure 3. Data Viewer Tab in the Editor Panel of PBTools.

When an analysis is performed or the filename of an output file is selected under


the Output folder in the Project Explorer tree, the results are displayed in a Results
Viewer tab (Figure 4). Depending on the contents of the results folder, the Results
Viewer may have an Output page and/or Graph page.

3
Figure 4. Results Viewer (Output and Graph Tabs) in the Editor Panel of PBTools.

Handling Data

PBTools uses comma-separated values (csv) format for data files. Data files are
created outside of PBTools and imported into the Data folder of the active project. Data
files (.rda or .txt) may also be imported into PBTools but they will be automatically
converted into .csv format. To represent missing observations, the user can use “NA”,
period, blank or space.

A selected file is displayed in a Data Viewer tab in the Editor panel. Several Data
Viewers can be seen simultaneously inside the Editor Panel.

Data values can be edited in the Data Viewer. Data manipulation can also be
performed using options available in a toolbar in the Data Viewer (Figure 5) or the
submenu items under the Data menu. These options are: inserting row(s)/column(s)
from the data, deleting row(s)/column(s) from the data, creating a new variable, editing
variable information, sorting, aggregating, reshaping, merging, and appending data sets
and creating data subset.

Figure 5. Toolbar for Data Manipulation in the Data Viewer Tab.

4
Randomization for Some Experimental Designs

The Randomization menu allows users to generate randomization of treatments


for single- and multi-factor designs. Randomization in PBTools is currently available for
the following designs: randomized complete block (RCB), lattice, alpha-lattice,
augmented Latin square, and augmented RCB design. Depending on the design, the
user needs to provide information on a dialog box (Figure 6).

Figure 6. Sample Dialog Box for Generating Randomization


and Layout.

A field book in csv format is created (Figure 7) and the layout (saved in a text file)
is displayed in the Data Viewer (Figure 8).

5
Figure 7. Sample Field Book after the Randomization.

Figure 8. Sample Layout after the Randomization.

6
Analysis

Prior to doing analysis in PBTools, a data set must first be selected and opened
in the Data Viewer. When the needed data set is active, the desired submenu option can
be selected from the Analysis menu.

Single-environment analysis

Analyses using mixed models for the following designs are available in PBTools:
Randomized Complete Block (RCB), Augmented RCB, Augmented Latin Square, Alpha-
Lattice and Row-Column. The user should specify in the dialog box some required
information and preferred output or graphs to be generated. If the Environment field is
specified, the analysis will be done per environment level. Otherwise, the data will be
treated as if it came from one environment. The user also has the option to regard
genotype as fixed or random factor while the remaining terms in the model are regarded
as random. As illustration, Figures 9 to 11 show the filled-up tabs for a sample analysis
in RCB.

Figure 9. Model Specifications Tab of Single-Environment


Analysis Dialog Box.

7
Figure 10. Options Tab of Single-Environment Analysis Dialog Box.

Figure 11. Graph Tab of Single-Environment Analysis Dialog Box.

8
After providing necessary information and desired options in the dialog box, the
text and graph outputs will be displayed separately in tabs in the Editor panel. Additional
csv files containing the computed residuals and summary statistics (if genotype is fixed)
or predicted means (if genotype is random) are saved inside the results folder. These
generated files can be accessed through the Project Explorer.

Figure 12 shows the sample partial text output of the single-environment


analysis. If genotype is regarded as fixed, the output of the analysis includes the
following: some descriptive statistics of the response variable per environment level;
data summary; estimates of the variance components of the model; test for the
significance of genotypic effect wherein the denominator degrees of freedom in F test is
computed according to a general Satterthwaite approximation; least-square means of
the genotypes; summary statistics of the standard errors of the difference; and, pairwise
mean comparison using Dunnett's procedure if comparing treatments with a control and
HSD if performing all pairwise mean comparisons. If genotype is regarded as random,
the output of the analysis includes the following: some descriptive statistics of the
response variable per environment level; data summary; estimates of the variance
components of the model; test for the significance of genotypic effect using -2
loglikelihood ratio test; test for the significance of the check/control effect, predicted
genotype means derived using the Best Linear Unbiased Prediction (BLUP); least-
square means of the checks (if checks/controls are specified); and, estimate of
heritability. Genotypic and phenotypic correlation can be performed if two or more
response variables are specified.

DATA FILE: E:/NSALES/pbtools workspace/testing/Data/2013DS_IRvarieties.csv

SINGLE-ENVIRONMENT ANALYSIS

DESIGN: Randomized Complete Block (RCB)

==============================
GENOTYPE AS: Fixed
==============================

------------------------------
RESPONSE VARIABLE: YIELD
------------------------------

DESCRIPTIVE STATISTICS:

Variable ENV N_NonMissObs Mean StdDev


1 YIELD 1 100 4349.44 781.2752
2 YIELD 2 100 4338.00 791.9981

------------------------------
ANALYSIS FOR: ENV = 1
------------------------------

DATA SUMMARY:

Number of observations read: 100


Number of observations used: 100

Factors Number of Levels Levels


ENTRY 50 IR36 IR64 IR72 ... IRRI168
REP 2 1 2

9
VARIANCE COMPONENTS TABLE:

Groups Variance Std.Dev.


1 REP 3029.013 55.03647
2 Residual 239417.453 489.30303

TESTING FOR THE SIGNIFICANCE OF GENOTYPIC EFFECT:

Analysis of Variance Table with Satterthwaite Denominator Df


Df Sum Sq Mean Sq F value Denom Pr(>F)
ENTRY 49 48306379 985844.5 4.1177 49 0.0000

GENOTYPE LSMEANS AND STANDARD ERRORS:

ENTRY LSMean StdErrMean


1 IR36 3247.5 348.1689
2 IR64 4461.0 348.1689
3 IR72 3603.5 348.1689
4 IRRI102 4109.5 348.1689
5 IRRI103 3826.5 348.1689
6 IRRI104 4336.5 348.1689
7 IRRI105 4193.0 348.1689
8 IRRI106 4091.5 348.1689
9 IRRI108 4470.5 348.1689
10 IRRI109 4697.5 348.1689
11 IRRI112 4443.0 348.1689
12 IRRI113 4807.5 348.1689
13 IRRI115 4262.5 348.1689
14 IRRI116 5218.0 348.1689
15 IRRI117 4571.5 348.1689
16 IRRI118 5056.5 348.1689
17 IRRI119 5530.5 348.1689
18 IRRI120 5016.5 348.1689
19 IRRI122 4779.5 348.1689
20 IRRI123 5193.5 348.1689
21 IRRI124 3657.0 348.1689
22 IRRI125 3229.0 348.1689
23 IRRI127 3313.0 348.1689
24 IRRI128 2962.5 348.1689
25 IRRI133 3386.5 348.1689
26 IRRI134 3765.5 348.1689
27 IRRI135 4742.0 348.1689
28 IRRI136 4650.5 348.1689
29 IRRI139 3468.0 348.1689
30 IRRI140 4172.5 348.1689
31 IRRI141 5002.0 348.1689
32 IRRI143 4403.0 348.1689
33 IRRI145 3907.0 348.1689
34 IRRI146 5871.5 348.1689
35 IRRI147 4368.5 348.1689
36 IRRI148 4315.0 348.1689
37 IRRI149 4400.0 348.1689
38 IRRI150 5076.5 348.1689
39 IRRI151 4041.0 348.1689
40 IRRI152 3323.0 348.1689
41 IRRI154 5120.0 348.1689
42 IRRI155 3490.0 348.1689
43 IRRI156 5182.0 348.1689
44 IRRI157 3134.0 348.1689

10
45 IRRI161 4840.0 348.1689
46 IRRI162 4863.5 348.1689
47 IRRI163 4328.5 348.1689
48 IRRI164 4943.0 348.1689
49 IRRI165 4058.5 348.1689
50 IRRI168 5542.5 348.1689

STANDARD ERROR OF THE DIFFERENCE (SED):

Estimate
Minimum 489.3030
Average 489.3030
Maximum 489.3030

SIGNIFICANT PAIRWISE COMPARISONS (IF ANY):


Compared with control(s)

Trmt[i] Trmt[j] Difference Lower Upper


1 IRRI113 IR36 1560.0 29.371879 3090.628
2 IRRI116 IR36 1970.5 439.871879 3501.128
3 IRRI118 IR36 1809.0 278.371879 3339.628
4 IRRI119 IR36 2283.0 752.371879 3813.628
5 IRRI120 IR36 1769.0 238.371879 3299.628
6 IRRI122 IR36 1532.0 1.371879 3062.628
7 IRRI123 IR36 1946.0 415.371879 3476.628
8 IRRI141 IR36 1754.5 223.871879 3285.128
9 IRRI146 IR36 2624.0 1093.371879 4154.628
10 IRRI150 IR36 1829.0 298.371879 3359.628
11 IRRI154 IR36 1872.5 341.871879 3403.128
12 IRRI156 IR36 1934.5 403.871879 3465.128
13 IRRI161 IR36 1592.5 61.871879 3123.128
14 IRRI162 IR36 1616.0 85.371879 3146.628
15 IRRI164 IR36 1695.5 164.871879 3226.128
16 IRRI168 IR36 2295.0 764.371879 3825.628
17 IRRI116 IR72 1614.5 82.445583 3146.554
18 IRRI119 IR72 1927.0 394.945583 3459.054
19 IRRI123 IR72 1590.0 57.945583 3122.054
20 IRRI146 IR72 2268.0 735.945583 3800.054
21 IRRI156 IR72 1578.5 46.445583 3110.554
22 IRRI168 IR72 1939.0 406.945583 3471.054

==============================
GENOTYPE AS: Random
==============================

------------------------------
RESPONSE VARIABLE: YIELD
------------------------------

DESCRIPTIVE STATISTICS:

Variable ENV N_NonMissObs Mean StdDev


1 YIELD 1 100 4349.44 781.2752
2 YIELD 2 100 4338.00 791.9981

------------------------------
ANALYSIS FOR: ENV = 1
------------------------------

11
DATA SUMMARY:

Number of observations read: 100


Number of observations used: 100

Factors Number of Levels Levels


ENTRY 50 IR36 IR64 IR72 ... IRRI168
REP 2 1 2

VARIANCE COMPONENTS TABLE:

Groups Variance Std.Dev.


1 Test:Check 365200.468 604.31818
2 REP 3029.351 55.03954
3 Residual 239417.912 489.30350

TESTING FOR THE SIGNIFICANCE OF GENOTYPIC EFFECT USING -2 LOGLIKELIHOOD RATIO


TEST:

Formula for Model1: YIELD ~ 1 + Check + (1|REP) + (1|Test:Check)


Formula for Model2: YIELD ~ 1 + Check + (1|REP)

AIC BIC logLik Chisq Df Pr(>Chisq)


Model2 1566.855 1582.486 -777.4273
Model1 1546.951 1565.187 -766.4756 21.9034 1 0.0000

TESTING FOR THE SIGNIFICANCE OF CHECK EFFECT:

Analysis of Variance Table with Satterthwaite Denominator Df


Df Sum Sq Mean Sq F value Denom Pr(>F)
Check 3 912075.5 304025.2 1.2699 45.9965 0.2958

PREDICTED MEANS:

ENTRY Means
1 IRRI102 4177.854
2 IRRI103 3964.718
3 IRRI104 4348.815
4 IRRI105 4240.740
5 IRRI106 4164.297
6 IRRI108 4449.734
7 IRRI109 4620.695
8 IRRI112 4429.023
9 IRRI113 4703.539
10 IRRI115 4293.083
11 IRRI116 5012.700
12 IRRI117 4525.800
13 IRRI118 4891.069
14 IRRI119 5248.053
15 IRRI120 4860.944
16 IRRI122 4682.452
17 IRRI123 4994.248
18 IRRI124 3837.062
19 IRRI125 3514.722
20 IRRI127 3577.985
21 IRRI128 3314.012
22 IRRI133 3633.340
23 IRRI134 3918.777
24 IRRI135 4654.209

12
25 IRRI136 4585.298
26 IRRI139 3694.720
27 IRRI140 4225.301
28 IRRI141 4850.023
29 IRRI143 4398.898
30 IRRI145 4025.345
31 IRRI146 5504.871
32 IRRI147 4372.915
33 IRRI148 4332.622
34 IRRI149 4396.638
35 IRRI150 4906.132
36 IRRI151 4126.264
37 IRRI152 3585.516
38 IRRI154 4938.893
39 IRRI155 3711.289
40 IRRI156 4985.587
41 IRRI157 3443.174
42 IRRI161 4728.016
43 IRRI162 4745.715
44 IRRI163 4342.789
45 IRRI164 4805.589
46 IRRI165 4139.444
47 IRRI168 5257.091

CHECK/CONTROL LSMEANS:

ENTRY LSMean StdErrMean


1 IR36 3247.5 697.4388
2 IR64 4461.0 697.4388
3 IR72 3603.5 697.4388

HERITABILITY:

0.75

==============================

GENOTYPIC CORRELATIONS:

Site: 1
YIELD PLTHGT
YIELD 0.4687
PLTHGT 0.4687

PHENOTYPIC CORRELATIONS:

Site: 1
YIELD PLTHGT
YIELD 0.4342
PLTHGT 0.4342

==============================

Figure 12. Sample Partial Text Output of the Single-environment Analysis.

Generated graphs can be viewed by clicking the Graph Tab of the displayed
results folder. Sample generated graphs are shown in Figure 13. The distribution of the

13
values of the response variable can be assessed by looking at the boxplot and
histogram while for the distribution of the residuals, diagnostic plots and heatmap are
available.

Figure 13. Sample graph outputs for Single-Environment Analysis.

14
Multi-environment analysis

For experiments done in multiple environments, analysis can be done across


environments. Two options are available for multi-environment analysis: one-stage and
two-stage. One-stage analysis requires raw data while two-stage analysis requires
summary data generated from single environment analysis as input. In this paper, only
one-stage analysis is presented.

One-stage analysis is available for RCB, Alpha-Lattice and Row-Column


designs. Sample filled-up tabs for RCB are shown in Figures 14 to 16.

Figure 14. Model Specifications Tab of Multi-Environment Analysis


(One-Stage) Dialog Box.

15
Figure 15. Options Tab of Multi-Environment Analysis (One-Stage)
Dialog Box.

Figure 16. Graph Tab of Multi-Environment Analysis (One-Stage) Dialog Box.

16
After processing the analysis, a text output is displayed in the Editor panel.
Figure 17 shows the sample partial result of the analysis. If genotype is regarded as
fixed, the output of the analysis includes the following: data summary; some descriptive
statistics of the response variable; estimates of the variance components of the model;
test for the significance of genotypic effect wherein the denominator degrees of freedom
in F test is computed according to a general Satterthwaite approximation; test for the
significance of environment and genotypic × environment effects using -2 loglikelihood
ratio test; genotypic × environment means; least-square means of the genotypes;
summary statistics of the standard errors of the difference; pairwise mean comparison
using Dunnett's procedure if comparing treatments with a control and HSD if performing
all pairwise mean comparisons; stability analysis using Finlay-Wilkinson model and
Shukla’s model; and, additive main effects and multiplicative interaction (AMMI) analysis.
If genotype is regarded as random, the output of the analysis includes the following: data
summary; some descriptive statistics of the response variable; estimates of the variance
components of the model; test for the significance of genotypic, environment and
genotypic × environment effects using -2 loglikelihood ratio test; genotypic ×
environment means; predicted genotype means derived using the Best Linear Unbiased
Prediction (BLUP); and, estimate of heritability.

DATA FILE: E:/NSALES/pbtools workspace/testing/Data/RCB_ME.csv

MULTI-ENVIRONMENT ANALYSIS (ONE-STAGE)

DESIGN: Randomized Complete Block (RCB)

==============================
GENOTYPE AS: Fixed
==============================

------------------------------
RESPONSE VARIABLE: Yield
------------------------------

DATA SUMMARY:

Number of observations read: 495


Number of observations used: 495

Factors Number of Levels Levels


Env 11 E1 E10 E11 ... E9
Genotype 15 GEN1 GEN10 GEN11 ... GEN9
Block 3 1 2 3

DESCRIPTIVE STATISTICS:

Variable N_NonMissObs Mean StdDev


1 Yield 495 1718.596 1117.791

VARIANCE COMPONENTS TABLE:

Groups Variance Std.Dev.


1 Genotype:Env 664747.696 815.32061
2 Block:Env 6497.161 80.60497
3 Env 530979.863 728.68365
4 Residual 94709.288 307.74874

17
TESTING FOR THE SIGNIFICANCE OF GENOTYPIC EFFECT:

Analysis of Variance Table with Satterthwaite Denominator Df


Df Sum Sq Mean Sq F value Denom Pr(>F)
Genotype 14 1386210 99015 1.0455 139.9947 0.4127

TESTING FOR THE SIGNIFICANCE OF ENVIRONMENT EFFECT USING -2 LOGLIKELIHOOD RATIO


TEST:

Formula for Model1: Yield ~ 1 + Genotype + (1|Env) + (1|Block:Env) + (1|Genotype:Env)


Formula for Model2: Yield ~ 1 + Genotype + (1|Block:Env) + (1|Genotype:Env)

AIC BIC logLik Chisq Df Pr(>Chisq)


Model2 7513.743 7589.425 -3738.871
Model1 7457.593 7537.480 -3709.797 58.1496 1 0.0000

TESTING FOR THE SIGNIFICANCE OF GENOTYPE X ENVIRONMENT EFFECT USING -2


LOGLIKELIHOOD RATIO TEST:

Formula for Model1: Yield ~ 1 + Genotype + (1|Env) + (1|Block:Env) + (1|Genotype:Env)


Formula for Model2: Yield ~ 1 + Genotype + (1|Env) + (1|Block:Env)

AIC BIC logLik Chisq Df Pr(>Chisq)


Model2 7942.529 8018.211 -3953.264
Model1 7457.593 7537.480 -3709.797 486.9356 1 0.0000

GENOTYPE X ENVIRONMENT MEANS: (some rows are deleted)

Genotype Env Yield_means


1 GEN1 E1 2958.1639
2 GEN1 E5 352.6978
3 GEN1 E4 1294.7984
4 GEN1 E11 2072.8784
5 GEN1 E7 514.8013
6 GEN1 E3 1373.9847
7 GEN1 E6 1800.7020
8 GEN1 E10 1026.3886
9 GEN1 E2 782.0394
10 GEN1 E9 4034.1663
.
.
.
165 GEN9 E9 1517.3235

GENOTYPE LSMEANS AND STANDARD ERRORS:

Genotype LSMean StdErrMean


1 GEN1 1500.965 334.2864
2 GEN10 1804.918 334.2864
3 GEN11 1902.555 334.2864
4 GEN12 2023.497 334.2864
5 GEN13 1953.552 334.2864
6 GEN14 2163.924 334.2864
7 GEN15 1829.575 334.2864
8 GEN2 1826.130 334.2864
9 GEN3 1615.296 334.2864
10 GEN4 1557.518 334.2864

18
11 GEN5 1246.419 334.2864
12 GEN6 1606.112 334.2864
13 GEN7 1523.806 334.2864
14 GEN8 1871.903 334.2864
15 GEN9 1352.776 334.2864

STANDARD ERROR OF THE DIFFERENCE (SED):

Estimate
Minimum 355.8134
Average 355.8134
Maximum 355.8134

SIGNIFICANT PAIRWISE COMPARISONS (IF ANY):


Compared with control(s)

(No significant pairwise comparisons.)

STABILITY ANALYSIS USING FINLAY-WILKINSON MODEL:

Slope SE t.value Prob MSReg MSDev


GEN1 1.2781068 0.2844126 4.493847 1.502182e-03 9332417.6 462123.09
GEN10 0.6527849 0.2813760 2.319973 4.548779e-02 2434445.5 452307.80
GEN11 1.6932433 0.2287290 7.402837 4.091093e-05 16379423.5 298883.70
GEN12 1.6800025 0.2291829 7.330400 4.418676e-05 16124257.8 300071.21
GEN13 1.0057795 0.3020917 3.329384 8.807711e-03 5779170.2 521359.88
GEN14 1.5848878 0.5707898 2.776658 2.151802e-02 14350165.8 1861282.35
GEN15 1.0948669 0.3462029 3.162501 1.150386e-02 6848296.6 684733.14
GEN2 1.1363548 0.2719630 4.178344 2.381936e-03 7377136.1 422551.36
GEN3 0.8969413 0.4169984 2.150947 5.994425e-02 4596083.4 993410.28
GEN4 0.3024868 0.2899820 1.043123 3.240997e-01 522724.4 480398.97
GEN5 0.7151867 0.1240071 5.767305 2.703442e-04 2922125.1 87852.27
GEN6 0.6888218 0.2543126 2.708563 2.405402e-02 2710651.2 369484.01
GEN7 0.6136909 0.2464923 2.489696 3.443918e-02 2151588.5 347109.54
GEN8 0.9025519 0.3482402 2.591751 2.912969e-02 4653763.4 692815.70
GEN9 0.7542942 0.2547682 2.960708 1.594118e-02 3250434.9 370809.14

STABILITY ANALYSIS USING SHUKLA'S MODEL:

lower est. upper


GEN1 264.805740 703.6167 1869.584
GEN10 16.424694 738.2801 33185.250
GEN11 151.126221 937.3589 5813.959
GEN12 135.445669 943.9686 6578.851
GEN13 177.323849 786.7224 3490.405
GEN14 182.708778 1427.7333 11156.674
GEN15 211.352595 953.4036 4300.767
GEN2 163.700619 778.2112 3699.514
GEN3 125.120648 944.0875 7123.534
GEN4 5.338358 687.2203 88467.608
GEN5 3.165181 218.8051 15125.728
GEN6 46.682322 511.4079 5602.508
GEN7 28.716855 497.1291 8606.003
GEN8 160.228588 787.3109 3868.589
GEN9 124.674720 621.5542 3098.701

19
AMMI ANALYSIS:
Percentage of Total Variation Accounted for by the Principal Components:

percent acum Df Sum.Sq Mean.Sq F.value Pr.F


PC1 38.2 38.2 23 101890340.4 4430014.80 46.77 0.0000
PC2 23.9 62.1 21 63620736.9 3029558.90 31.99 0.0000
PC3 19.2 81.3 19 51267271.5 2698277.45 28.49 0.0000
PC4 8.7 90.0 17 23217503.0 1365735.47 14.42 0.0000
PC5 5.4 95.4 15 14268487.9 951232.52 10.04 0.0000
PC6 2.4 97.8 13 6282476.7 483267.44 5.10 0.0000
PC7 1.2 99.0 11 3123773.8 283979.43 3.00 0.0008
PC8 0.6 99.6 9 1635022.9 181669.21 1.92 0.0487
PC9 0.4 100.0 7 1129431.9 161347.42 1.70 0.1084
PC10 0.0 100.0 5 100764.4 20152.87 0.21 0.9582
PC11 0.0 100.0 3 0.0 0.00 0.00 1.0000

==============================
GENOTYPE AS: Random
==============================

------------------------------
RESPONSE VARIABLE: Yield
------------------------------

DATA SUMMARY:

Number of observations read: 495


Number of observations used: 495

Factors Number of Levels Levels


Env 11 E1 E10 E11 ... E9
Genotype 15 GEN1 GEN10 GEN11 ... GEN9
Block 3 1 2 3

DESCRIPTIVE STATISTICS:

Variable N_NonMissObs Mean StdDev


1 Yield 495 1718.596 1117.791

VARIANCE COMPONENTS TABLE:

Groups Variance Std.Dev.


1 Genotype:Env 664747.522 815.32050
2 Block:Env 6497.219 80.60533
3 Genotype 2876.978 53.63747
4 Env 530984.967 728.68715
5 Residual 94709.311 307.74878

TESTING FOR THE SIGNIFICANCE OF GENOTYPIC EFFECT USING -2 LOGLIKELIHOOD RATIO


TEST:

Formula for Model1: Yield ~ 1 + (1|Genotype) + (1|Env) + (1|Block:Env) + (1|Genotype:Env)


Formula for Model2: Yield ~ 1 + (1|Env) + (1|Block:Env) + (1|Genotype:Env)

AIC BIC logLik Chisq Df Pr(>Chisq)


Model2 7627.446 7648.469 -3808.723
Model1 7629.433 7654.661 -3808.717 0.0127 1 0.9102

20
TESTING FOR THE SIGNIFICANCE OF ENVIRONMENT EFFECT USING -2 LOGLIKELIHOOD RATIO
TEST:

Formula for Model1: Yield ~ 1 + (1|Genotype) + (1|Env) + (1|Block:Env) + (1|Genotype:Env)


Formula for Model2: Yield ~ 1 + (1|Genotype) + (1|Block:Env) + (1|Genotype:Env)

AIC BIC logLik Chisq Df Pr(>Chisq)


Model2 7686.930 7707.953 -3838.465
Model1 7629.433 7654.661 -3808.717 59.4969 1 0.0000

TESTING FOR THE SIGNIFICANCE OF GENOTYPE X ENVIRONMENT EFFECT USING -2


LOGLIKELIHOOD RATIO TEST:

Formula for Model1: Yield ~ 1 + (1|Genotype) + (1|Env) + (1|Block:Env) + (1|Genotype:Env)


Formula for Model2: Yield ~ 1 + (1|Genotype) + (1|Env) + (1|Block:Env)

AIC BIC logLik Chisq Df Pr(>Chisq)


Model2 8114.369 8135.392 -4052.184
Model1 7629.433 7654.661 -3808.717 486.9356 1 0.0000

GENOTYPE X ENVIRONMENT MEANS: (some rows are deleted)

Genotype Env Yield_means


1 GEN1 E1 2967.6279
2 GEN1 E5 362.1033
3 GEN1 E4 1304.2393
4 GEN1 E11 2082.3420
5 GEN1 E7 524.2095
6 GEN1 E3 1383.4127
7 GEN1 E6 1810.1753
.
.
.
165 GEN9 E9 1533.2220

PREDICTED GENOTYPE MEANS:

Genotype Mean
1 GEN1 1709.135
2 GEN10 1722.349
3 GEN11 1726.594
4 GEN12 1731.851
5 GEN13 1728.811
6 GEN14 1737.956
7 GEN15 1723.421
8 GEN2 1723.271
9 GEN3 1714.106
10 GEN4 1711.594
11 GEN5 1698.069
12 GEN6 1713.706
13 GEN7 1710.128
14 GEN8 1725.261
15 GEN9 1702.693

HERITABILITY:

0.05

Figure 17. Sample Partial Text Output of Multi-environment (One-stage) Analysis.

21
Sample generated graphs are shown in Figure 18. Boxplot and histogram are
available for the evaluation of the distribution of the values of the response variable while
diagnostic plots for the distribution of the residuals. If AMMI analysis is requested, biplots
are generated to aid in the assessment of the interaction between genotype and
environment.

Figure 18. Sample graph outputs of Multi-environment (One-stage) Analysis.

22
Quantitative Trait Locus (QTL) Analysis

To link certain complex traits to specific regions of the chromosomes, quantitative


trait locus (QTL) analysis can be used. In the current version of PBTools, QTL analysis
is done per environment level only. The following files are needed as input: a csv file
containing phenotypic data which can be either the predicted means or raw data, a tab-
delimited txt file containing genotypic data and a tab-delimited txt file containing the
genetic map. Both the genotypic data and genetic map files should be in the Flapjack
format. The dialog box for QTL analysis with raw data as input are shown in Figures 19
to 21.

Figure 19. Data Input Tab of QTL Analysis Dialog Box.

Figure 20. Model Specifications Tab of QTL Analysis


Dialog Box.

23
Figure 21. Options Tab of QTL Analysis Dialog Box.

A text file is generated after the analysis. Figure 22 shows the sample partial text
output which contains the following: results of the single-environment analysis; LOD
scores of all the markers; and statistics on the selected/significant markers.

DATA FILE: E:/NSALES/pbtools workspace/SampleProject/Data/QTL_pheno.csv

SINGLE-ENVIRONMENT ANALYSIS

DESIGN: Randomized Complete Block (RCB)

------------------------------
RESPONSE VARIABLE: HEIGHT
------------------------------

DESCRIPTIVE STATISTICS:

Variable ENV N_NonMissObs Mean StdDev


1 HEIGHT 1 606 88.69455 31.62887
2 HEIGHT 2 606 86.69455 31.62887

------------------------------
ANALYSIS FOR: ENV = 1
------------------------------

DATA SUMMARY:

Number of observations read: 606


Number of observations used: 606

Factors Number of Levels Levels


GENOTYPE 202 G_001 G_002 G_003 ... P2
REP 3 1 2 3

24
VARIANCE COMPONENTS TABLE:

Groups Variance Std.Dev.


1 REP 611.3356 24.72520
2 Residual 302.7346 17.39927

TESTING FOR THE SIGNIFICANCE OF GENOTYPIC EFFECT:

Analysis of Variance Table with Satterthwaite Denominator Df


Df Sum Sq Mean Sq F value Denom Pr(>F)
GENOTYPE 201 235948.7 1173.874 3.8776 401.9975 0.0000

GENOTYPE LSMEANS AND STANDARD ERRORS: (some rows are deleted)

GENOTYPE LSMean StdErrMean


1 G_001 106.09469 17.45067
2 G_002 69.11523 17.45067
3 G_003 131.68455 17.45067
4 G_004 116.59435 17.45067
5 G_005 110.58281 17.45067
6 G_006 65.38761 17.45067
7 G_007 77.01282 17.45067
8 G_008 99.30493 17.45067
9 G_009 90.64883 17.45067
10 G_010 61.68259 17.45067
.
.
.
202 P2 77.78512 17.45067

STANDARD ERROR OF THE DIFFERENCE (SED):

Estimate
Minimum 14.2064
Average 14.2064
Maximum 14.2064

==============================

QTL ANALYSIS

METHOD: CIM

------------------------------
RESPONSE VARIABLE: HEIGHT
------------------------------
------------------------------
ANALYSIS FOR: ENV = 1
------------------------------

QTL RESULT (ALL): (some rows are deleted)


marker Chr Pos LOD
1 M_0001 1 0 0.998238082
2 M_0002 1 1 1.029862513
3 M_0006 1 5 1.097213383
4 1_loc10 1 10 0.683066798
5 M_0018 1 17 0.127036203
6 1_loc20 1 20 0.107014795
7 M_0022 1 21 0.098035804

25
8 1_loc30 1 30 0.126238502
9 M_0032 1 31 0.124952527
10 1_loc40 1 40 0.675666296
11 M_0042 1 41 7.910819471
12 1_loc50 1 50 9.507940316
13 M_0053 1 52 8.808176807
14 M_0056 1 55 7.075115150
15 M_0058 1 57 6.531686512
16 1_loc60 1 60 0.023835142
17 M_0062 1 61 0.010078575
18 M_0063 1 62 0.497989434
19 M_0066 1 65 0.511330577
20 M_0069 1 68 1.022791170
21 1_loc70 1 70 0.849382305
22 M_0076 1 75 0.286639058
23 M_0081 1 80 0.157442187
24 M_0083 1 82 1.915089343
25 M_0085 1 84 2.384399583
.
.
.
284 M_1087 7 150 0.154691409

QTL RESULT (SELECTED):


marker Chr Pos LOD m.eff Rsq
1 1_loc50 1 50 9.507940 8.388484 0.03366559
2 2_loc10 2 10 7.066201 -6.655585 0.05975959
3 M_0340 3 42 6.479474 -4.977133 0.07389893
4 M_0543 4 74 3.921155 4.776014 0.09046470
5 6_loc120 6 120 3.699853 3.778455 0.14220448
6 M_0989 7 52 10.649286 7.272148 0.17627526

Figure 22. Sample Partial Text Output of QTL Analysis

Sample generated graphs are shown in Figure 23. These include heatmap of
LOD scores and recombination fractions, plot of pairwise genotypic differences, marker
map, visualization of genotypes, plot of missing genotypes and QTL maps.

26
27
Figure 23. Sample graph outputs of QTL analysis.

Selection Index Analysis

To be able to select favorable genotypes based on several traits, a linear function


of these traits referred to as selection index can be used. Four phenotypic selection
indices and two molecular selection indices are available in PBTools. If any of the
phenotypic selection indices is selected, a csv file containing the weights is needed while
for the molecular selection indices, weights, marker and QTL files should be specified.
The dialog box for selection index analysis appears in Figure 24.

Figure 24. Dialog box for Selection Index Analysis.

28
Figure 25 shows the sample output of the analysis. The output includes genetic
and phenotypic correlation matrices, molecular covariance matrix, statistics on the
values of the selection index and breeding values, characteristics of the selected
individuals, and values of the selection index for all individuals.

DATA FILE: E:/Program Files/PBTools/Projects/SampleProject/Data/SI_ traits.csv

Lande and Thompson Selection Index

DESIGN: Lattice

Genetic and Phenotypic Correlation Matrices

GENETIC CORRELATION MATRIX

MFL1 FFL1 EHT1 PHT1 GY1 MFL.2 FFL2 EHT2 PHT2 GY2
MFL1 1.00 0.88 0.19 -0.33 -0.62 0.96 0.88 0.23 -0.25 -0.36
FFL1 0.88 1.00 0.18 -0.20 -0.77 0.83 0.68 0.15 -0.28 -0.41
EHT1 0.19 0.18 1.00 0.75 -0.03 0.18 0.04 1.09 0.96 0.13
PHT1 -0.33 -0.20 0.75 1.00 0.32 -0.23 -0.28 0.75 1.09 0.28
GY1 -0.62 -0.77 -0.03 0.32 1.00 -0.59 -0.70 0.09 0.40 0.98
MFL.2 0.96 0.83 0.18 -0.23 -0.59 1.00 0.91 0.31 -0.23 -0.52
FFL2 0.88 0.68 0.04 -0.28 -0.70 0.91 1.00 0.27 -0.23 -0.61
EHT2 0.23 0.15 1.09 0.75 0.09 0.31 0.27 1.00 0.77 -0.08
PHT2 -0.25 -0.28 0.96 1.09 0.40 -0.23 -0.23 0.77 1.00 0.20
GY2 -0.36 -0.41 0.13 0.28 0.98 -0.52 -0.61 -0.08 0.20 1.00

PHENOTYPIC CORRELATION MATRIX

MFL1 FFL1 EHT1 PHT1 GY1 MFL.2 FFL2 EHT2 PHT2 GY2
MFL1 1.00 0.71 0.01 -0.37 -0.47 0.63 0.56 0.10 -0.17 -0.29
FFL1 0.71 1.00 0.04 -0.26 -0.48 0.54 0.51 0.03 -0.20 -0.31
EHT1 0.01 0.04 1.00 0.80 0.07 0.13 0.06 0.71 0.51 0.06
PHT1 -0.37 -0.26 0.80 1.00 0.32 -0.14 -0.15 0.49 0.53 0.17
GY1 -0.47 -0.48 0.07 0.32 1.00 -0.35 -0.39 0.04 0.15 0.44
MFL.2 0.63 0.54 0.13 -0.14 -0.35 1.00 0.72 0.05 -0.31 -0.40
FFL2 0.56 0.51 0.06 -0.15 -0.39 0.72 1.00 0.05 -0.21 -0.45
EHT2 0.10 0.03 0.71 0.49 0.04 0.05 0.05 1.00 0.81 0.15
PHT2 -0.17 -0.20 0.51 0.53 0.15 -0.31 -0.21 0.81 1.00 0.34
GY2 -0.29 -0.31 0.06 0.17 0.44 -0.40 -0.45 0.15 0.34 1.00

MOLECULAR COVARIANCE MATRIX

MFL1 FFL1 EHT1 PHT1 GY1 MFL.2 FFL2 EHT2 PHT2 GY2
MFL1 1.00 0.33 0.69 0.63 0.11 0.43 -0.23 0.48 0.79 0.37
FFL1 0.33 1.00 0.46 0.37 0.02 0.24 0.22 0.16 0.23 -0.12
EHT1 0.69 0.46 1.00 0.48 0.23 0.51 -0.41 0.58 0.41 0.13
PHT1 0.63 0.37 0.48 1.00 -0.21 0.47 0.25 0.48 0.50 -0.10
GY1 0.11 0.02 0.23 -0.21 1.00 -0.04 -0.34 -0.01 0.27 -0.18
MFL.2 0.43 0.24 0.51 0.47 -0.04 1.00 0.15 0.73 0.31 0.04
FFL2 -0.23 0.22 -0.41 0.25 -0.34 0.15 1.00 -0.14 -0.04 -0.16
EHT2 0.48 0.16 0.58 0.48 -0.01 0.73 -0.14 1.00 0.32 0.01
PHT2 0.79 0.23 0.41 0.50 0.27 0.31 -0.04 0.32 1.00 -0.02
GY2 0.37 -0.12 0.13 -0.10 -0.18 0.04 -0.16 0.01 -0.02 1.00

29
COVARIANCE BETWEEN SELECTION INDEX AND BREEDING VALUE: 3.205508

VARIANCE OF THE SELECTION INDEX: 1.50582

VARIANCE OF THE BREEDING VALUE: 6.145354

CORRELATION BETWEEN SELECTION INDEX AND BREEDING VALUE: 0.9999

VALUES OF THE TRAITS, SELECTION INDEX, MEANS, GAINS FOR THE 5% SELECTED
INDIVIDUALS

MFL1 FFL1 EHT1 PHT1 GY1 MFL.2 FFL2


Entry 240 102.01 102.12 80.53 140.12 0.00 100.70 100.74
Entry 2 104.88 104.42 100.22 148.82 28.72 99.60 100.36
Entry 21 101.23 100.75 93.45 163.42 34.45 100.23 98.34
Entry 132 104.69 104.71 75.75 126.00 44.00 97.88 100.35
Entry 167 106.75 105.49 81.67 132.64 0.83 106.21 108.25
Entry 179 104.39 101.50 76.00 126.03 23.78 100.74 99.34
Entry 220 105.34 104.37 78.75 137.92 97.16 99.84 98.31
Entry 174 103.13 102.54 96.84 146.33 47.44 96.59 99.25
Entry 165 104.61 103.66 90.75 150.25 79.00 99.32 103.80
Entry 161 103.12 99.05 85.75 148.50 82.50 101.46 103.00
Entry 87 101.53 100.88 103.66 156.84 68.33 102.34 99.20
Mean of
Selected Individuals 103.79 102.68 87.58 143.35 46.02 100.44 100.99
Mean of
all Individuals 102.01 102.12 80.53 140.12 75.99 100.70 100.74
Selection Differential 1.78 0.56 7.04 3.23 -29.97 -0.26 0.26
Expected Genetic Gain
for 5% 1.57 1.64 1.52 1.19 -1.13 1.95 1.37

EHT2 PHT2 GY2 LT index


Entry 240 78.47 133.44 0.00 2.88
Entry 2 106.75 161.75 140.00 2.59
Entry 21 89.69 156.81 1.50 2.34
Entry 132 76.09 127.28 33.39 2.24
Entry 167 75.00 113.64 0.00 2.14
Entry 179 74.31 124.14 9.24 2.14
Entry 220 87.00 143.50 129.50 1.94
Entry 174 94.93 149.59 18.61 1.83
Entry 165 96.25 160.25 41.00 1.80
Entry 161 76.50 125.69 12.75 1.74
Entry 87 91.95 140.56 35.00 1.70
Mean of Selected Individuals 86.08 139.69 38.27 NA
Mean of all Individuals 78.47 133.44 51.70 NA
Selection Differential 7.61 6.26 -13.43 NA
Expected Genetic Gain for 5% 1.73 1.08 -1.02 NA

30
VALUES OF THE TRAITS AND THE SELECTION INDEX FOR ALL INDIVIDUALS

MFL1 FFL1 EHT1 PHT1 GY1 MFL.2 FFL2 EHT2 PHT2 GY2
Entry 1 102.21 100.25 71.45 123.75 42.45 99.29 98.95 68.51 117.47 16.87
Entry 2 104.88 104.42 100.22 148.82 28.72 99.60 100.36 106.75 161.75 140.00
Entry 3 98.97 100.44 80.17 154.36 77.78 97.93 96.82 66.75 133.75 116.00
.
.
.
(some rows are deleted)

Figure 25. Sample Partial Text Output of Selection Index Analysis.

Mating Designs

Several mating designs were developed to estimate genetic variance


components. Experiments which used the following mating designs can be analyzed in
PBTools: North Carolina I to III, Triple Test Cross and Diallel I to IV. For each mating
design, the experiment can be laid out in any of these four experimental designs:
Completely Randomized, RCB, Alpha-Lattice, and Row-Column design. In this paper,
only analyses for North Carolina II and Diallel II are presented.

For North Carolina II, analysis can be done per environment level or across
environments. To illustrate analysis per environment level, a sample completed dialog
box is shown in Figure 26.

Figure 26. Dialog Box for North Carolina Experiment II.

31
The results of the analysis as shown in Figure 27 includes data summary,
ANOVA table (assuming fixed model), estimates of the variance components and
estimates of the genetic variance components.

DATA FILE: E:/NSALES/pbtools workspace/SampleProject/Data/NCII_ME.csv

DESIGN: NORTH CAROLINA EXPERIMENT II IN RCB (INBRED)

-----------------------------
RESPONSE VARIABLE: Y
-----------------------------
-----------------------------
ANALYSIS FOR: Env = A
-----------------------------

DATA SUMMARY:

Factors No of Levels Levels


Env 1 A
Female 8 1 2 3 ... 8
Male 8 10 11 12 ... 9
Block 3 1 2 3

Number of observations read: 192


Number of observations used: 192

ANOVA TABLE FOR THE EXPERIMENT:

Df Sum Sq Mean Sq F value Pr(>F)


Block 2 22.3531 11.1766 1.14 0.3220
Male 7 1112.6490 158.9498 16.26 0.0000
Female 7 735.2669 105.0381 10.75 0.0000
Male:Female 49 823.1469 16.7989 1.72 0.0086
Residuals 126 1231.4730 9.7736
-------
REMARK: Raw dataset is balanced.

LINEAR MIXED MODEL FIT BY RESTRICTED MAXIMUM LIKELIHOOD:

Formula: Y ~ 1 + (1|Block) + (1|Male) + (1|Female) + (1|Male:Female)

AIC BIC logLik deviance REMLdev


1057.665 1077.21 -522.8323 1047.758 1045.665

Fixed Effects:
Estimate Std. Error t value
(Intercept) 55.6076 1.1379 48.8702

Random Effects:
Groups Variance Std. Deviation
Male:Female 2.3418 1.5303
Female 3.6767 1.9175
Male 5.9229 2.4337
Block 0.0219 0.1481
Residual 9.7736 3.1263

32
ESTIMATES OF GENETIC VARIANCE COMPONENTS:

Estimate
VA 9.599670
VD 2.341800
Narrow sense heritability(plot-mean based) 0.442075
Broad sense heritability(plot-mean based) 0.549917
Dominance Ratio 0.698492

Figure 27. Sample Text Output of North Carolina II Per Environment Analysis

For multiple-environment analysis, the sample output as shown in Figure 28


includes data summary, ANOVA table (assuming fixed model), estimates of the variance
components and estimates of the genetic variance components.

DATA FILE: E:/NSALES/pbtools workspace/SampleProject/Data/NCII_ME.csv

MULTIPLE ENVIRONMENT ANALYSIS

DESIGN: NORTH CAROLINA EXPERIMENT II IN RCB (INBRED)

-----------------------------
RESPONSE VARIABLE: Y
-----------------------------

DATA SUMMARY:

Factors No of Levels Levels


Env 2 A B
Female 8 1 2 3 ... 8
Male 8 10 11 12 ... 9
Block 3 1 2 3

Number of observations read: 384


Number of observations used: 384

ANOVA TABLE:

Df Sum Sq Mean Sq F value Pr(>F)


Env 1 8.7665 8.7665 0.38 0.5724
Env:Block 4 93.0194 23.2549 2.21 0.0682
Male 7 840.6569 120.0938 11.43 0.0000
Female 7 821.7501 117.3929 11.17 0.0000
Male:Female 49 1233.6320 25.1762 2.40 0.0000
Env:Male 7 493.0000 70.4286 6.70 0.0000
Env:Female 7 215.9051 30.8436 2.93 0.0057
Env:Male:Female 49 974.2153 19.8819 1.89 0.0009
Residuals 252 2648.7840 10.5110
-------
REMARK: Raw dataset is balanced.

33
LINEAR MIXED MODEL FIT BY RESTRICTED MAXIMUM LIKELIHOOD:

Formula: Y ~ 1 + (1|Env) + (1|Env:Block) + (1|Male) + (1|Female) +


(1|Male:Female) + (1|Env:Male) + (1|Env:Female) + (1|Env:Male:Female)

AIC BIC logLik deviance REMLdev


2147.079 2186.586 -1063.54 2128.376 2127.079

Fixed Effects:
Estimate Std. Error t value
(Intercept) 55.7587 0.7634 73.042

Random Effects:
Groups Variance Std. Deviation
Env:Male:Female 3.1506 1.7750
Male:Female 0.8681 0.9317
Env:Female 0.3985 0.6313
Env:Male 1.8441 1.3580
Female 1.7219 1.3122
Male 1.0554 1.0273
Env:Block 0.1792 0.4233
Env 0.0000 0.0000
Residual 10.5157 3.2428

ESTIMATES OF GENETIC VARIANCE COMPONENTS:

Estimate
VA 2.777330
VAxE 2.242660
VD 0.868100
VDxE 3.150620
h2-narrow sense 0.142031
H2-broad sense 0.186425
Dominance Ratio 0.790653

Figure 28. Sample Text Output of North Carolina II Mulit-environment Analysis

In cases where information on effects due to reciprocal, maternal, general


combining ability (GCA) and the specific combining ability (SCA) of parents in crosses
are of interest, diallel design can be used. In analyzing such designs, PBTools makes
use of the Griffing Method. The sample dialog box for Diallel II analysis is shown in
Figure 29.

34
Figure 29. Dialog Box for Diallel Analysis (Griffing Method 2).

Partial results are shown in Figure 30. This consists of data summary, test for the
significance of the crosses, test for the significance of GCA and SCA effects, GCA, SCA
and residual variance estimates, estimates of genetic variance components and
estimates of the GCA and SCA effects.

35
DATA FILE: E:/NSALES/pbtools workspace/SampleProject/Data/Diallel_M2.csv

DIALLEL ANALYSIS: GRIFFING METHOD II IN RCB (CROSS)

-----------------------------
RESPONSE VARIABLE: Plant_height
-----------------------------
-----------------------------
ANALYSIS FOR: Env = Normal
-----------------------------

DATA SUMMARY:

Factors No of Levels Levels


P1 7 1 2 3 4 5 6 7
P2 7 1 2 3 4 5 6 7
Block 3 1 2 3
Env 1 Normal

Number of observations read: 84


Number of observations used: 84

TESTING FOR THE SIGNIFICANCE OF CROSS EFFECT: (Crosses = P1:P2)

Formula for Model 1: Plant_height ~ Crosses + (1|Block)


Formula for Model 2: Plant_height ~ (1|Block)

AIC BIC logLik Chisq Df Pr(>Chisq)


Model2 758.9207 766.2131 -376.4603
Model1 466.3195 539.2440 -203.1597 346.6012 1 0.0000

MATRIX OF MEANS:

1 2 3 4 5 6 7
1 142.9000 148.3333 163.9000 152.9000 142.3667 160.6667 191.3333
2 129.6667 142.9000 143.9000 131.5667 143.5333 186.6667
3 131.5667 163.5667 136.9000 166.4333 189.7667
4 159.3333 149.8000 164.8000 200.2333
5 122.4333 138.9000 175.7667
6 146.1333 195.5667
7 157.5333

ANALYSIS OF VARIANCE:

SV Df Sum Sq Mean Sq F value Pr(>F)


GCA 6 7784.37 1297.39 6.05 0.0008
SCA 21 4500.58 214.31 14.56 0.0000
Error 54 794.66 14.72

ESTIMATES OF VARIANCE COMPONENTS:

Estimate Std. Error


GCA 120.3424 217.0737
SCA 199.5973 66.1426
Error 14.7160 2.8321

36
ESTIMATES OF GENETIC VARIANCE COMPONENTS:

Estimate
VA 481.369471
VD 798.389175
h2-narrow sense 0.371865
H2-broad sense 0.988632
Dominance Ratio 1.821307

GENERAL COMBINING ABILITY EFFECTS (diagonal), SPECIFIC COMBINING


ABILITY EFFECTS (above diagonal):

1 2 3 4 5 6 7
1 -0.6608 3.1454 10.8935 -7.5806 1.1861 3.7083 13.0157
2 -10.5571 -0.2102 -6.6843 0.2824 -3.5287 18.2454
3 -2.7386 5.1639 -2.2028 11.5528 13.5269
4 4.7354 3.2231 2.4454 16.5194
5 -14.5646 -4.1546 11.3528
6 1.2132 15.3750
7 22.5725

TABLE OF STANDARD ERRORS AND LSDs:

Std. Error LSD


Gi 1.1839
Sii 2.9299
Sij 3.4430
Gi-Gj 1.8084 3.6256
Sii-Sjj 4.0437 8.1070
Sij-Sik 5.1149 10.2547
Sij-Skl 4.7845 9.5924

Figure 30. Sample Text Output of Diallel II Per Environment Analysis.

For multiple-environment analysis, the sample output as shown in Figure 31


includes data summary, test for the significance of the crosses × environment effect, test
for the significance of GCA and SCA effects, estimates of the GCA and SCA effects,
GCA, SCA and residual variance estimates and estimates of genetic variance
components.

37
DATA FILE: E:/NSALES/pbtools workspace/SampleProject/Data/Diallel_M2.csv

MULTIPLE ENVIRONMENT ANALYSIS

DIALLEL ANALYSIS: GRIFFING METHOD II IN RCB (CROSS)

-----------------------------
RESPONSE VARIABLE: Plant_height
-----------------------------

DATA SUMMARY:

Factors No of Levels Levels


P1 7 1 2 3 4 5 6 7
P2 7 1 2 3 4 5 6 7
Env 2 Normal Saline
Block 3 1 2 3

Number of observations read: 168


Number of missing observations: 0

ANOVA TABLE:

Df Sum Sq Mean Sq F value Pr(>F)


Env 1 17111.43 17111.43 28.18 0.0061
Block(Env) 4 2429.29 607.32 14.44 0.0000
Crosses 27 80041.58 2964.50 31.42 0.0000
Crosses x Env 27 2547.46 94.35 2.24 0.0019
Error 108 4543.85 42.07
-------
REMARK: Raw dataset is balanced.

ANOVA TABLE:

Df Sum Sq Mean Sq F value Pr(>F)


GCA 6 54367.67 9061.28 51.14 0.0001
SCA 21 25673.91 1222.57 17.30 0.0000
GCAxE 6 1063.14 177.19 4.21 0.0008
SCAxE 21 1484.32 70.68 1.68 0.0450
Error 108 4543.85 42.07
-------
REMARK: Raw dataset is balanced.

MATRIX OF MEANS:

1 2 3 4 5 6 7
1 131.0000 133.8333 153.5500 140.7333 128.9000 154.4333 182.2833
2 116.8333 130.5500 133.8500 122.7167 129.0500 175.9000
3 122.0000 154.5000 131.2167 153.2167 177.7833
4 137.3333 138.5667 155.7833 195.2333
5 111.8833 132.5667 166.3333
6 143.0167 188.6167
7 155.1000
-------
REMARK: Raw dataset is balanced.

38
GENERAL COMBINING ABILITY EFFECTS, SPECIFIC COMBINING ABILITY EFFECTS (above
diagonal)

1 2 3 4 5 6 7
1 1.4884 11.7329 -6.4745 -1.7227 6.5181 12.5181
2 -0.5819 -2.6727 2.7792 -8.1801 16.8199
3 8.5051 1.8069 6.5144 9.2310
4 3.7662 3.6903 21.2903
5 -2.9412 8.9755
6 13.9662
7
GCA -1.6418 -12.3270 -2.8548 2.5360 -14.0492 3.2434 25.0934

TABLE OF STANDARD ERRORS AND LSDs:

Std. Error LSD


Gi 0.8172
Sii 2.3767
Sij 2.0225
Gi-Gj 1.2483 2.4743
Sii-Sjj 2.7913 5.5328
Sij-Sik 3.5307 6.9985
Sij-Skl 3.3027 6.5465

ESTIMATES OF VARIANCE COMPONENTS:

Estimate
GCA 143.1889
SCA 191.9809
GCAxE 3.9448
SCAxE 9.5364
Error 42.0727

ESTIMATES OF GENETIC VARIANCE COMPONENTS:

Estimate
VA 572.755727
VAxE 15.779056
VD 767.923608
VDxE 38.145415
h2-narrow sense 0.398667
H2-broad sense 0.933181
Dominance Ratio 1.637530

Figure 31. Sample Text Output of Diallel II Multi-environment Analysis.

Future Direction

In succeeding versions, analysis for other experimental designs will be made


available for single-environment analysis. Also, other input data options will be
introduced. This includes allowing subsampling data for mating designs and accepting
input files of different formats for QTL analysis. Finally, a module for multi-environment
QTL analysis will also be added.

39

Vous aimerez peut-être aussi