Change
Patrick S. Schnable
PSS has IP and equity interests in Data2Bio, Dryland Genetics, DecisionPx & EnGeniousAg
Some Terminology
• Model Species: A species with attributes that make
it particularly suitable for scientific investigations.
• Maize: the most important model species that is
also a crop
• Corn: the most important global crop which also
happens to be a model species
Why is the rate of yield increase for maize higher than for wheat and
soybean?
• Biological differences (cross-pollinated vs. self-pollinated?
C4? Genetic diversity?); if so, how to explain rice?
• Greater R&D investment? (driven by the ability to capture
returns on investment via sales of hybrid seed)
Heterosis (Hybrid Vigor)
A Single Illumina HiSeq 2000 Paired-End Lane
Illumina HiSeq 2000 Paired-End Lane:
• 180 x 10^6 reads
• 2 x 100 bp reads
• 36 x 10^9 bp
• 8 days
• 36 GB data
• Sequencing cost: <$3,000
Letter Size Paper (8.5 inches x 11 inches):
• 1” margins
• 12 pt font size, Times New Roman
• ~2,500 characters (UPPERCASE) per page
HP LaserJet P4015 Black & White Printer:
• ~50 pages/min (single sided)
• Cartridge: ~24,000 pages
How much paper is needed to print data from 1 paired-end lane of HiSeq?
36 x 10^9 bp / 2.5 x 10^3 = 14.4 x 10^6 pages
Or: 14.4 x 10^6 / 500 = 28,800 reams
Or: 28,800 / 10 = 2,880 cases (12” x 10” x 18”)
And: 1 printer + 600 cartridges + 200 days
A pile of cases 12 ft x 12 ft x 12 ft (x2)
Supply Cost:
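The paper arithmetic above can be checked with a short script. This is only a sketch: the inputs are the figures quoted on the slide, while the function name `paper_needed`, the one-character-per-base convention, and the 500 sheets per ream and 10 reams per case (read off the slide's divisions) are assumptions made here.

```python
def paper_needed(reads=180_000_000, read_len=100, ends=2,
                 chars_per_page=2_500, pages_per_ream=500, reams_per_case=10,
                 pages_per_min=50, pages_per_cartridge=24_000,
                 case_dims_in=(12, 10, 18)):
    """Estimate printing supplies for one paired-end lane (single-sided pages)."""
    bases = reads * ends * read_len                # one printed character per base (assumed)
    pages = bases // chars_per_page
    reams = pages // pages_per_ream
    cases = reams // reams_per_case
    cartridges = pages / pages_per_cartridge
    days = pages / pages_per_min / (60 * 24)       # printer running continuously
    w, d, h = case_dims_in
    case_ft3 = (w * d * h) / 12 ** 3               # cubic inches -> cubic feet
    return {
        "bases": bases,
        "pages": pages,
        "reams": reams,
        "cases": cases,
        "cartridges": cartridges,
        "days": days,
        "pile_ft3": cases * case_ft3,
    }


if __name__ == "__main__":
    for key, value in paper_needed().items():
        print(f"{key:>10}: {value:,.0f}")
```

With these inputs the pile comes to 3,600 cubic feet, close to the two 12 ft cubes (3,456 cubic feet) on the slide.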
[Figure: breeding cycle (partial labels): Genotype & Evaluate with GS Model; Evaluate Progeny]
#2: Genome Editing: Targeted Editing of
the Genetic Code
#3: Increasing Efficiency & Speed of Transformation
(while also reducing genotype dependency)
Consolidation in the Seed/Breeding
Industry
More Recently:
• ChemChina purchased Syngenta
• DuPont & Dow merged -> Corteva
• Bayer purchased Monsanto
Overcoming Challenges in Crop Production
$32M
B73 Reference Genome
[Figure: NGS data in NCBI SRA (Oct 2016). Y-axis: Tera (10^12) Bases, 0 to 40; x-axis: year, 2009 to 2016. Labeled datasets: Maize HapMap, Maize HapMap2, Ames Diversity Panel, IBM RILs RNA-Seq, CAAS Resequencing, ISU Zeanome (RNA-Seq), CAU Resequencing, Maize Pan-genome and Pan-transcriptome, SAM Parents Resequencing]
Schnable, Ware et al, Science, 2009
The Promise
of Predictive Statistical Models
Given sufficient genotyping, phenotyping and
environmental data we can use the approaches of Big
Data to develop statistical models that enable the
prediction of traits/phenotypes based on genotypes and
environmental data
Phenotype (P) = Genotype (G) + Environment (E)* + GxE
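To make the P = G + E + GxE decomposition concrete, here is a minimal sketch that splits a table of phenotypes (rows are genotypes, columns are environments) into additive effects. It is not the modeling approach used in the talk: the function name `decompose_gxe`, the grand-mean term, and the toy values are all assumptions made for illustration.

```python
def decompose_gxe(pheno):
    """Split a genotype-by-environment table into mean, G, E and GxE terms.

    pheno: list of rows, one row per genotype, one column per environment.
    Effects are expressed as deviations from the grand mean (an assumption;
    the slide's formula does not show a mean term).
    """
    n_g, n_e = len(pheno), len(pheno[0])
    mu = sum(sum(row) for row in pheno) / (n_g * n_e)
    # Genotype main effects: row means minus grand mean
    g = [sum(row) / n_e - mu for row in pheno]
    # Environment main effects: column means minus grand mean
    e = [sum(pheno[i][j] for i in range(n_g)) / n_g - mu for j in range(n_e)]
    # Interaction: what is left after removing the mean and both main effects
    gxe = [[pheno[i][j] - mu - g[i] - e[j] for j in range(n_e)]
           for i in range(n_g)]
    return mu, g, e, gxe


if __name__ == "__main__":
    # Hypothetical yields: 2 genotypes x 2 environments
    toy = [[10.0, 12.0],
           [14.0, 20.0]]
    mu, g, e, gxe = decompose_gxe(toy)
    print(mu, g, e, gxe)
```

With a single observation per genotype-environment cell, the GxE term cannot be separated from residual error; replicated trials are needed for that.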
What’s Required?
[Figure: U.S. Drought Monitor map. Drought Impact Types: S = Short-Term, typically less than 6 months (e.g. agriculture, grasslands); L = Long-Term, typically greater than 6 months (e.g. hydrology, ecology). Intensity: D0 Abnormally Dry; D1 Moderate Drought; D2 Severe Drought; D3 Extreme Drought; D4 Exceptional Drought. The Drought Monitor focuses on broad-scale conditions. Local conditions may vary. See accompanying text summary for forecast statements. Author: David Miskus, NOAA/NWS/NCEP/CPC. http://droughtmonitor.unl.edu/]
Automated Phenotyping Systems:
(some) Design Considerations
HTP Time-Lapse
Photography
Colton McNinch
Stefan Hey
Yield Trials
Irrigated
700,000 per month
5.6 Tb per month
Identifying Features in
Image Data Using AI
and HTP Computing
Baskar Ganapathysubramanian
Amazon Turking Plant Height
Dan Nettleton
Baskar Ganapathysubramanian
Baseline: line connecting where plants grow out of the soil (yellow)
Lisa Coffey
Zihao Zheng (郑子豪)
Root Angle: Shallow vs. Steep
Root Volume
Huyu Liu (刘祜宇), CAU
Liang Dong
N Application is a Goldilocks Problem
• Under application -> yield losses
• Over application -> wasted input costs & environmental impact
Optimum N Input
• Nitrogen is the 2nd most expensive input for rain-fed
corn (after seed)
• Selecting appropriate N application rates is
complicated by substantial year-to-year
variation in N production from soil organic
matter and field losses of N
• Predicting the optimal level of Nitrogen is
currently difficult to impossible
• 35% of fields exhibit NO response to N
• $1.67B of wasted N fertilizer per year
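The Goldilocks trade-off above can be sketched numerically. The slides do not give a response model, so this example assumes a quadratic yield response, yield(N) = a + b*N - c*N^2, which is a common textbook simplification; the function names and every parameter value below are hypothetical and chosen only for illustration.

```python
def net_return(n, a, b, c, n_price, grain_price):
    """Net return per acre at N rate n, under the ASSUMED quadratic response."""
    grain = a + b * n - c * n ** 2
    return grain_price * grain - n_price * n


def optimal_n_rate(b, c, n_price, grain_price):
    """N rate that maximizes net_return for the assumed quadratic response.

    Setting the derivative to zero gives N* = (b - n_price / grain_price) / (2c).
    A field with no yield response to N (b == 0) gets an optimum of 0.
    """
    if c <= 0:
        raise ValueError("c must be positive for the response to level off")
    n_star = (b - n_price / grain_price) / (2 * c)
    return max(n_star, 0.0)


if __name__ == "__main__":
    # Hypothetical parameters, not taken from the slides
    params = dict(a=100.0, b=0.8, c=0.002, n_price=0.5, grain_price=4.0)
    n_star = optimal_n_rate(params["b"], params["c"],
                            params["n_price"], params["grain_price"])
    for n in (n_star - 50, n_star, n_star + 50):
        print(f"N = {n:6.1f}  net return = {net_return(n, **params):8.2f}")
```

Rates below the optimum give up yield and rates above it pay for N that returns less than it costs, which is the under- and over-application pattern listed in the bullets. In practice the response parameters vary by field and by year, which is why the optimum is hard to predict.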
In Planta Nitrate Sensors
[Figure: plant sensor; Sensor Measurement (ppm nitrate-N) vs. Conventional Measurement (ppm nitrate-N)]
Faster, smarter breeding
of naturally water
efficient crops
James Schnable
co-founder
Problem 1: Long-term increase in
global demand for grain
Problem 2: Exponential Investments in Breeding Major Crops
Produce only Linear Yield Increases
Problem 3: Water is a limiting factor
in agricultural production
[Figures: Competition for Water (Irrigated Corn); Competition for Water (Rain-fed Corn)]
Dryland Genetics
is developing improved varieties of Proso
millet
DLG’s Breeding Strategy: Incorporate
Global Genetic Diversity into US Germplasm
Evaluate
Progeny
Select
Intercross Best
Lines
Examples of DLG’s New Varieties from 2018
Yield Trials
Impact