Title: Applications of Mixed Linear Models in Forest Genetics Tim White, Dudley Huber
1. Applications of Mixed Linear Models in Forest Genetics
Tim White, Dudley Huber, Salvador Gezan
- Introduction
- Genetic Tests in Forestry
- BLUP in Genetics
- Case Study 1: Multi-Generation BLUP
- Case Study 2: Incomplete Blocks
- Conclusions
Justus Seely Conference, OSU July 2003
2. Introduction
- Purposes of Genetic Tests in Breeding Programs
  - Estimate genetic parameters (functions of second moments)
    - $H^2 = \sigma_G^2 / (\sigma_G^2 + \sigma_E^2)$: heritability (degree of genetic control)
  - Rank candidates for subsequent selection
    - Parents from offspring
    - All tested individuals from several generations of breeding
  - Predict genetic gain from selection
- Nature of Genetic Tests
  - Designed tests: crops, trees
  - Field data: animals
  - All
    - Unbalanced, messy
    - Multiple generations, several traits, inbreeding, culling
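As a small illustration of the heritability ratio on this slide (genetic variance divided by the sum of genetic and environmental variance), here is a minimal Python sketch. The function name `broad_sense_heritability` and the example variance components are assumptions made for illustration; they are not values from the talk.

```python
def broad_sense_heritability(var_g: float, var_e: float) -> float:
    """Return H2 = var_g / (var_g + var_e)."""
    if var_g < 0 or var_e < 0:
        raise ValueError("variance components must be non-negative")
    total = var_g + var_e
    if total == 0:
        raise ValueError("total variance must be positive")
    return var_g / total


# Hypothetical variance components: one quarter of the variation is genetic
h2 = broad_sense_heritability(var_g=0.25, var_e=0.75)
print(h2)
```

In practice the two variance components are not known and would be estimated from test data (for example by REML) before the ratio is formed.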
3. Genetic Tests in Forestry
- Typical Mating Design (Many Treatments)
  - 200 parents (A, B, C, ...)
  - Intermated to form 300 families (AxB, BxC, CxD, ...)
  - To create 24,000 offspring (AB1, AB2, AB3, ..., AB80)
- Typical Field Design (Similar to Ag Field Trials)
  - 8 field locations
  - 20 resolvable replicates per location (20 x 300 = 6,000 trees)
  - Alpha-lattice incomplete blocks
- Desired Rankings for 24,200 Genotypes (All at Once)
  - 200 parents
  - 24,000 offspring
4. Genetic Tests in Forestry: Breeding
5. Genetic Tests in Forestry: Nursery
6. Genetic Tests in Forestry: Site Preparation and Planting
7. Genetic Tests in Forestry: Maintenance and Measurement
8. Applications of Mixed Linear Models in Forest Genetics
- Introduction
- Genetic Tests in Forestry
- BLUP in Genetics
- Case Study 1: Multi-Generation BLUP
- Case Study 2: Incomplete Blocks
9. BLUP in Genetics
- Conceptual Equation
  - $\hat{g} = C' V^{-1} (y - \mu)$
  - g is a q x 1 vector of random genetic values being predicted
  - y is an n x 1 vector of data observations
  - µ = E(y)
  - V = Var(y), n x n variance matrix assumed known
  - C = Cov(y, g), n x q covariances between data and genetic values
- Properties (assuming C and V known)
  - GLS solution for BLUE of µ (adjusts for nuisance fixed effects)
  - BLUP of g
  - If y and g are MVN, then
    - BLUP equation is E(g | y)
    - Gain from selection is maximized
- Application
  - REML used to estimate variance components
  - Many algorithms
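The quantities defined on this slide (y, µ, V and C) combine in the standard BLUP predictor C'V^-1(y - µ), with µ replaced by its GLS estimate. The sketch below works this through with NumPy for a simple model y = Xb + Zg + e. The function name `blup`, the toy data and the variance components are assumptions for illustration only, and the variances are treated as known rather than estimated by REML.

```python
import numpy as np


def blup(y, X, Z, G, R):
    """GLS estimate of fixed effects and BLUP of random effects.

    Assumes y = X b + Z g + e with Var(g) = G and Var(e) = R known.
    """
    V = Z @ G @ Z.T + R                      # Var(y), n x n
    C = Z @ G                                # Cov(y, g), n x q
    V_inv_X = np.linalg.solve(V, X)
    V_inv_y = np.linalg.solve(V, y)
    b_hat = np.linalg.solve(X.T @ V_inv_X, X.T @ V_inv_y)   # GLS (BLUE)
    g_hat = C.T @ np.linalg.solve(V, y - X @ b_hat)         # BLUP
    return b_hat, g_hat


# Hypothetical unbalanced data: 3 genotypes with 4, 4 and 1 observations
genotype = np.array([0, 0, 0, 0, 1, 1, 1, 1, 2])
y = np.array([1.3, 1.1, 1.2, 1.4, 0.7, 0.9, 0.8, 0.6, 1.5])
n, q = y.size, 3
X = np.ones((n, 1))                          # overall mean only
Z = np.eye(q)[genotype]                      # n x q incidence matrix
var_g, var_e = 0.25, 0.75                    # assumed known variances
mu_hat, g_hat = blup(y, X, Z, var_g * np.eye(q), var_e * np.eye(n))
print("GLS mean:", mu_hat)
print("BLUPs:   ", g_hat)
```

Genotype 2 has the highest single observation but only one record, so its prediction is pulled more strongly toward zero than those of the better-tested genotypes. Production software usually solves Henderson's mixed model equations instead of inverting V directly; the direct form is used here only because it mirrors the conceptual equation.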
10. BLUP in Genetics
- History
  - 1950s: Henderson proposed mixed model equations (MME)
  - 1970s: Proofs, properties and algorithms of MME
  - 1980s: Widespread application of BLUP in animal breeding
  - 1990s: Widespread application of BLUP in forestry
- Reasons for BLUP
  - Unbalanced and messy data
  - Shrinkage (regression) based on precision predicts future gain
    - Tendency to select better-tested genotypes
  - Multivariate capabilities
    - Tests of varying precision (high $H^2$ versus low $H^2$) and ages
    - Data or predictions on correlated traits
    - Incorporating genotype x environment interaction
- Genetic Peculiarities
  - Tests spanning several generations of selection: multiple genetic relationships
  - Non-random mating
  - Culling of data before final measurement
11. Applications of Mixed Linear Models in Forest Genetics
- Introduction
- Genetic Tests in Forestry
- BLUP in Genetics
- Case Study 1: Multi-Generation BLUP
- Case Study 2: Incomplete Blocks
12. Case Study 1: Multi-Generation Selection
- Goals
  - Demonstrate genetic values as random (BLUP) versus fixed (GLS)
  - Illustrate impact of incorporating all generations in the analysis
- Used Simulation (1 Run) to Create
  - A three-generation selection process: G0, G1, and G2
  - G0 is infinite in size with specified genetic structure: $h^2 = 0.19$
  - 40 randomly chosen G0 parents randomly mated to create G1 offspring
    - 4,800 offspring of the matings planted in genetic tests
    - Data analyzed to rank 4,800 candidates
  - 40 selected G1 parents randomly mated to create G2 offspring
    - 4,800 offspring of the matings planted in genetic tests
    - Data analyzed to rank 4,800 candidates
13. Case Study 1: Selection and Testing Program
14. Case Study 1: Stage 1 Data Analysis
- Data and Linear Model
  - 4,800 observations from 3 tests
  - Location, block, genotype, genotype x location (g x l)
- Goal of Analysis
  - Rank 40 G0 parents based on performance of their progeny
- Analysis 1
  - Treat genotypes as fixed effects
  - Estimate parental means from GLS
- Analysis 2
  - Treat genotypes as random effects
  - Incorporate pedigree file including 40 G0 parents
  - Predict parental values from BLUP
15. Case Study 1: Stage 1 Data Analysis
- BLUP values are shrunken
  - STD(FE) = 0.20
  - STD(BLUP) = 0.17
- Rank changes occur
  - BLUP incorporates mating design through pedigree file
  - Parents mated with good mates are adjusted
  - Parents in fewer crosses are shrunken more
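To make the shrinkage idea concrete, the sketch below uses the simplest case: a one-way random model in which the prediction for a genotype is its observed deviation from the overall mean multiplied by n*var_g / (n*var_g + var_e). This is a simplification of the pedigree-based BLUP used in the case study, and the function name `shrinkage_factor` and the variance values are illustrative assumptions.

```python
def shrinkage_factor(n_obs: int, var_g: float, var_e: float) -> float:
    """Weight on (observed mean - overall mean) in a one-way random model."""
    return n_obs * var_g / (n_obs * var_g + var_e)


# Hypothetical variance components; fewer observations give a smaller weight
for n_obs in (2, 10, 50):
    weight = shrinkage_factor(n_obs, var_g=0.19, var_e=0.81)
    print(f"n = {n_obs:2d}  weight = {weight:.3f}")
```

The weight approaches 1 as the number of observations grows, which is why well-tested genotypes keep most of their observed deviation while poorly tested ones are regressed toward the mean.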
16. Case Study 1: Stage 2 Data Analysis
- Goal
  - Estimate genetic variance
  - Predict gain (i.e. genetic superiority) of the 40 G1 parents
- Two Analytical Options
  - Both BLUP
  - Both use pedigree files with data from all 3 generations
- Option 1
  - Data: 4,800 G2 progeny
  - Pedigree: 40 G1 + 15 G0 ancestors
- Option 2
  - Data: 4,800 G2 + 4,800 G1
  - Pedigree: 40 G2 ancestors
17. Case Study 1: Stage 2 Data Analysis
- Estimate of genetic variance
  - True value = 0.126
  - Option 1 = 0.111
  - Option 2 = 0.124
- Prediction of Mean Genetic Value
  - True mean of 40 G1 parents = 0.555
  - Option 1 = 0.007
  - Option 2 = 0.541
- Multi-Generation Analysis
  - Accounted for selection bias
  - Genetic variance of Option 1 is reduced from population truncation
  - Genetic value (gain) of Option 1 is centered on the G1 mean, while that for Option 2 is expressed relative to the G0 mean
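The effect of population truncation can be seen in a small simulation. The sketch below is not the authors' simulation: it simply draws 4,800 candidates, keeps the best 40 on phenotype, and compares the selected group's true genetic mean and variance with the base population. The function name `select_top`, the variance components (chosen so that heritability is 0.19) and the 200 replicates are assumptions for illustration.

```python
import numpy as np


def select_top(rng, n_candidates=4800, n_selected=40, var_g=0.19, var_e=0.81):
    """Simulate one round of truncation selection on phenotype.

    Returns the mean and variance of the true genetic values of the
    selected group; the base population has mean 0 and variance var_g.
    """
    g = rng.normal(0.0, np.sqrt(var_g), n_candidates)        # true genetic values
    p = g + rng.normal(0.0, np.sqrt(var_e), n_candidates)    # phenotypes
    top = np.argsort(p)[-n_selected:]                        # best phenotypes
    return g[top].mean(), g[top].var(ddof=1)


rng = np.random.default_rng(2003)
results = np.array([select_top(rng) for _ in range(200)])
mean_selected, var_selected = results.mean(axis=0)
print("base population:  mean 0.000, variance 0.190")
print(f"selected parents: mean {mean_selected:.3f}, variance {var_selected:.3f}")
```

An analysis that sees only the selected parents and their progeny starts from a group whose mean is already above the base population and whose variance is smaller, which is the pattern the slide attributes to Option 1. Including the earlier generation's data lets the model express values relative to the unselected base.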
18. Applications of Mixed Linear Models in Forest Genetics
- Introduction
- Genetic Tests in Forestry
- BLUP in Genetics
- Case Study 1: Multi-Generation BLUP
- Case Study 2: Incomplete Blocks
19. Case Study 2: Incomplete Block Designs
- Goal
  - Identify optimal experimental designs for clonal tests (hundreds of clones)
  - Maximize $H^2$, minimize residual variance, maximize gain from selection
- Used Simulation to Compare
  - CRD, RCB and several Alpha-lattice designs
  - Different block sizes
  - Alpha designs versus post-hoc blocking
  - Effect of mortality
- Approach
  - Fixed genetic design: 8 ramets of 256 clones = 2,048 trees in a test
  - Fixed heritability for CRD: $H^2 = 0.25$
  - Simulated 3 types of surfaces: gradients, patchiness and both
  - Overlay field design and clone randomization
  - Do 1000 simulations of each combination
20. Case Study 2: Creating the Surfaces
- Each surface was a grid of 32 rows x 64 columns = 2,048 trees
- Gradient: Polynomial model
  - Gradient = $\alpha (x + y) + \beta (x^2 y + x y^2)$, varying $\alpha$ and $\beta$
- Patchiness: Separable Autoregressive (AR1 x AR1)
  - $\mathrm{Var}(e_{ij}) = \sigma_s^2 + \sigma_e^2 = 1$, varying $\sigma_e^2$ from 0.2 to 0.8
  - $\mathrm{Cov}(e_{ij}, e_{i'j'}) = \sigma_s^2 \rho_x^{d_x} \rho_y^{d_y}$, varying $\rho_x$ and $\rho_y$ from 0.01 to 0.99
- Both
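A patchy surface with separable AR1 x AR1 structure can be generated without building the full 2,048 x 2,048 covariance matrix. The sketch below assumes the covariance between two positions is var_s * rho_x^dx * rho_y^dy, where dx and dy are the column and row distances, plus independent noise with variance var_e so that the total is 1. The function names, the default parameter values and the assignment of rho_x to columns and rho_y to rows are assumptions for illustration, not details taken from the talk.

```python
import numpy as np


def ar1_corr(n, rho):
    """AR1 correlation matrix with entries rho ** |i - j|."""
    idx = np.arange(n)
    return rho ** np.abs(idx[:, None] - idx[None, :])


def patchy_surface(rng, n_rows=32, n_cols=64, rho_x=0.9, rho_y=0.9,
                   var_s=0.8, var_e=0.2):
    """Draw one surface: structured AR1 x AR1 error plus independent noise."""
    chol_rows = np.linalg.cholesky(ar1_corr(n_rows, rho_y))
    chol_cols = np.linalg.cholesky(ar1_corr(n_cols, rho_x))
    z = rng.standard_normal((n_rows, n_cols))
    # Pre- and post-multiplying by the Cholesky factors gives the
    # Kronecker (separable) covariance with unit variance at each position.
    structured = np.sqrt(var_s) * (chol_rows @ z @ chol_cols.T)
    noise = np.sqrt(var_e) * rng.standard_normal((n_rows, n_cols))
    return structured + noise


rng = np.random.default_rng(1)
surface = patchy_surface(rng)
print(surface.shape)
```

High correlations produce large smooth patches and low correlations produce small ones, so varying rho_x and rho_y changes patch size while varying var_e changes how much of the error is unstructured.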
21. Case Study 2: Field Layout
- Experimental Designs
- Linear Model for Data Generation
  - $y = \mu + C + E_s + E_e$
  - $\sigma_T^2 = \sigma_c^2 + \sigma_s^2 + \sigma_e^2$
- 1000 Simulations for Each Surface/Design
- Two Levels of Mortality: 0 and 25%
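The data-generation model on this slide adds a clone effect, a surface error and a residual error to an overall mean. The sketch below generates one completely randomized test of 8 ramets of 256 clones under that model and then recovers heritability with a one-way ANOVA estimator. Two simplifications are swapped in and should be read as assumptions: the surface error is drawn independently (no spatial structure), and ANOVA replaces the REML estimation mentioned in the talk. Function names and variance values are also illustrative.

```python
import numpy as np


def simulate_crd(rng, n_clones=256, n_ramets=8, mu=0.0,
                 var_c=0.25, var_s=0.25, var_e=0.5):
    """Generate y = mu + C + Es + Ee for a completely randomized layout."""
    clone = np.repeat(np.arange(n_clones), n_ramets)
    rng.shuffle(clone)                        # random allocation to positions
    c = rng.normal(0.0, np.sqrt(var_c), n_clones)
    n = clone.size
    y = (mu + c[clone]
         + rng.normal(0.0, np.sqrt(var_s), n)     # surface error (unstructured here)
         + rng.normal(0.0, np.sqrt(var_e), n))    # residual error
    return clone, y


def anova_heritability(clone, y):
    """One-way ANOVA estimate of var_c / (var_c + var_within), balanced data."""
    counts = np.bincount(clone)
    n_ramets = counts[0]
    means = np.bincount(clone, weights=y) / counts
    ms_between = n_ramets * means.var(ddof=1)
    ms_within = ((y - means[clone]) ** 2).sum() / (y.size - counts.size)
    var_c_hat = (ms_between - ms_within) / n_ramets
    return var_c_hat / (var_c_hat + ms_within)


rng = np.random.default_rng(7)
clone, y = simulate_crd(rng)
print("estimated H2:", anova_heritability(clone, y))
```

With these variance components the expected heritability is 0.25, matching the fixed CRD heritability of the case study; a single simulated test will scatter around that value.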
22. Case Study 2: Heritability
- Mean Heritability Estimates
  - IBD > RCB > CRD, all surfaces
  - 32BK > 16BK > 8BK > 4BK, all surfaces
  - Replicate effect greater for gradients
  - Small patches lead to lower $H^2$
23. Case Study 2: Gains from Selection
- Selection Process to Calculate Gain Efficiency
  - Select best 5% of clones based on BLUP value
  - Calculate gain for selected clones using their true value, Gs
  - Select best 5% of clones based on true value
  - Calculate gain for selected clones using their true value, Gt
  - Gain efficiency = Gs/Gt
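The selection process above can be sketched in a few lines. The function below measures gain as the mean true value of the selected clones minus the mean of all clones, and reads the selected fraction as 5%; both are assumptions, as are the function name `gain_efficiency` and the simulated values in the usage example.

```python
import numpy as np


def gain_efficiency(true_values, predicted_values, fraction=0.05):
    """Return Gs / Gt for selection of the top `fraction` of clones."""
    true_values = np.asarray(true_values, dtype=float)
    predicted_values = np.asarray(predicted_values, dtype=float)
    n_select = max(1, int(round(fraction * true_values.size)))
    by_prediction = np.argsort(predicted_values)[-n_select:]
    by_truth = np.argsort(true_values)[-n_select:]
    overall = true_values.mean()
    gain_s = true_values[by_prediction].mean() - overall   # realized gain, Gs
    gain_t = true_values[by_truth].mean() - overall        # best possible gain, Gt
    return gain_s / gain_t


# Hypothetical example: 256 clones with noisy predictions of their true values
rng = np.random.default_rng(0)
true_values = rng.normal(0.0, 1.0, 256)
predicted_values = true_values + rng.normal(0.0, 1.0, 256)
print(gain_efficiency(true_values, predicted_values))
```

An efficiency of 1 means the predictions picked exactly the clones that selection on true values would have picked; noisier predictions give lower values.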
24. Conclusions
- BLUP
  - Treats genetic values as random variables
  - Has revolutionized analysis of genetic data
  - Can incorporate multiple traits, many generations, unbalanced data, culling of data, non-random mating
  - Predicts genetic values and gain directly
  - Shrinks observed means more when less information is available
- Spatial Variation and Analysis
  - Incomplete blocks, row-column and Latinization all have potential to reduce residual error in forest genetic experiments
  - Spatial analysis also has potential to increase precision
  - Mixed linear models facilitate use of these designs and analyses