Advanced Biometrics
Techale Birhan (PhD)
Course
• Biometry – Plant Breeding
• Biostatistics – Plant Biotechnology
Course content…
• Introduction to basic principles of AE
• Experimental design and field management
• Multivariate analysis
• Incomplete block design
• Practical data analysis with R, SAS & Tassel
• Data transformation
What is research?
• Research means an organized and systematic
way of finding solution to a question
• Is a planned inquiry to obtain new facts or to
confirm or deny the results of the previous
experiments
Research Cycle
Experimental research or non-experimental
research
• One may simply observe a scenario and
decide based on own subjective judgment or
• Require other tools and methods to assist in
the process of decision making
Development of Quantitative Genetics
• Johannsen, 1903, Seed weight of inbred lines
– Variation among lines, heritable
– Heavy parent lines gave heavy offspring
– Variation within lines, not heritable, environmental
• Nilsson-Ehle in 1909 studied kernel color in wheat
– crossed red lines to white
– F1 red intermediate between the two parents, and
– F2 ranged from red to white.
– Some lines segregated 3:1 (red:white) in the F2,
– whereas some segregated 15:1 and some 63:1.
– Kernel color is controlled by three genes.
– These three genes act additively and independently, which gives a
continuous distribution
Field experimentation is used to obtain
new information or to improve the results of
previous findings
• It helps to answer questions such as:
– Which fertilizer level gives optimum yield?
– Which insecticide is the most effective?
– Is the improved variety higher yielding than the
local varieties?
Development of Quantitative Genetics cont.
• Fisher (1918) introduced statistics
in Mendelian genetics, where
variance (σ²) was used to measure
differences in a population.
• This analysis involves population,
not an individual.
• Population - Group of individuals
belonging to a certain class.
Development of Quantitative Genetics cont.
• Wright (1932) studied coat color in
guinea pigs and recognized the
importance of gene interaction
• He also examined the effects of inbreeding,
non-random mating and selection on the
genetic composition of a population.
• The two most important contributions of
Wright are the concepts of the inbreeding
coefficient and effective population size.
Steps in experimental methods:
• Define and state the problem
• State objectives
• Develop hypothesis
• Implement the experiment
• Data collection and analysis
• Interpretation of results
• Preparation of complete & precise report
To produce an acceptable result
• The trials must be designed properly
• Data must be collected properly
• Correct analytical method must be used
NB: It is very difficult to compensate at the analysis stage
if the design was wrong from the start
Basic Statistical Terms and Concepts
Categories of genetic studies
1. Qualitative characters
• Characters that can be grouped into
kinds, types or classes.
Statistics…
Statistics
Inferential
(Test hypothesis, make
conclusion )
(Making decision about
population based on
sample)
Descriptive
(Describe characteristics,
organize and summarize)
(mean, mode, median)
Categories of genetic studies
3. Quantitative Genetics
• Quantitative genetics is a field of science
involving transmission, inheritance or
heredity of variation among quantitative
traits of individuals, i.e. variation in traits
that can only be differentiated using
measurements.
Characteristics of QT cont.
• Although there are 27 genotypes,
many of them have the same
phenotype and hence there are
only seven phenotypes (0, 1, 2, 3,
4, 5 and 6).
• Therefore there is no strict one-to-one
relationship between
genotype and phenotype.
Experimental Error
In the process of experimentation there are
several sources of errors that may be encountered
at all stages of the work.
• Inaccurate equipment
• Personal bias
• Inadequate replication
• Lack of uniformity in soil fertility
• Topography or drainage
• Damage by rodents, birds, insects & diseases
Experimental Error
Precision
• Precision is the closeness of repeated
measurements
Accuracy
• Accuracy is the closeness of a measured or
computed value to its true value
Hypothesis
• A proposed explanation made on the basis of
limited evidence
• A starting point for further investigation
Hypothesis
Null hypothesis
• The hypothesis to be tested, denoted by H0
• There is no difference between treatments
Alternate hypothesis
• Denoted by H1 or HA
• At least one treatment differs from the others
Example of a null hypothesis: if one plant is watered with distilled water and
the other with mineral water, there is no
difference in the growth of these two plants
Characteristics of QT cont.
• Many different genotypes can have the same
phenotype. Consider k genes, all
having an equal effect on a trait. If there are two
alleles at each locus and they exhibit
co-dominance (neither allele is dominant), then
there will be a total of 3^k genotypes. For
example, with k = 3 the following genotypes and
phenotypes can be shown, assuming each A, B
and C allele adds one unit to the phenotype:
• Type-I error
• Rejection of the null hypothesis when it is true
• If you get significance and you’re wrong, it’s a
false positive
• The probability of finding a difference between our
sample and the population when there
really isn’t one
• Type-II error
• Acceptance of the null hypothesis when it is false
• If you get non-significance and you’re wrong, it’s
a false negative
• The probability of not finding a difference that
actually exists between our sample and
the population.
Characteristics of QT cont.
No. Genotype Phenotype
1. AABBCC 6 units
2. AABBCc 5 units
3. AABBcc 4 units
4. AABbCC 5 units
5. AABbCc 4 units
6. AAbbCc 3 units
7. AAbbCC 4 units
. .
. .
. .
27. aabbcc 0 units
Characteristics of QT cont.
1 gene → 3 genotypes, 3 phenotypes
2 genes → 9 genotypes, 5 phenotypes
3 genes → 27 genotypes, 7 phenotypes
n genes → 3^n genotypes, 2n + 1 phenotypes
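The counting rule can be checked by brute force. The sketch below is only an illustration, assuming two co-dominant alleles per locus with equal, additive effects; the function name `count_classes` is made up for this example.

```python
from itertools import product

def count_classes(k):
    """Count genotypes and phenotypes for k additive loci with two alleles each.

    Each locus is coded by the number of 'capital' alleles it carries (0, 1 or 2),
    and the phenotype is the total number of such alleles.
    """
    genotypes = list(product((0, 1, 2), repeat=k))
    phenotypes = {sum(g) for g in genotypes}
    return len(genotypes), len(phenotypes)

for k in (1, 2, 3):
    n_genotypes, n_phenotypes = count_classes(k)
    print(k, n_genotypes, n_phenotypes)
```

For each k the counts agree with 3^k genotypes and 2k + 1 phenotypes.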
Genotypic & Metric values
• The A allele will give 4 units while the a
allele will provide 2 units. At the other
locus, the B allele will contribute 2 units
while the b allele will provide 1 unit. With
two genes controlling a trait, nine different
genotypes are possible. Below are the
genotypes and their associated metric
values:
Genotype   Ratio in F2   Metric value
AABB       1             12
AABb       2             11
AAbb       1             10
AaBB       2             10
AaBb       4             9
Aabb       2             8
aaBB       1             8
aaBb       2             7
aabb       1             6
• A factor is a procedure or condition whose effect is to
be measured.
• Treatment
• Is the level or rate of a certain experimental factor
• a treatment may be a standard ration, inoculation,
and a spraying rate/spraying schedule
Characteristics of QT cont.
2. Dominance (allelic-interaction) can
obscure the true genotype effects.
3. Environmental variation and the
interaction of genotype with environment
obscure genetical effects.
4. Epistasis (non-allelic interaction) would
impose limitation to make prediction, for
example, predicted response to
selection.
Organization and Description of
Data
Categories of genetic studies
2. Molecular
Molecular genetics on the other hand, deals
with biochemical and molecular mechanisms
by which hereditary information is stored in
DNA (deoxyribonucleic acid) and
subsequently transmitted to proteins.
DNA is the molecule that stores genetic
information within the cell.
• Continuous Vs Discrete variables
• Continuous
– Infinite values in between
– eg. height of students, GPA etc
• Discrete
– separate categories
– eg. letter grade
2. Gene and Genotype Frequencies
Assuming that, in a population of diploid
organisms, the composition of a population, in
terms of gene A and a is as follows:
AA Aa aa Total
Number 2 12 26 40
Proportion 2/40 12/40 26/40
0.05 0.30 0.65 1.0
No. (A) 2(2) = 4 1(12) = 12 0(26) = 0
No. (a) 0(2) = 0 1(12) = 12 2(26) = 52
Total Alleles 4 24 52 80
$$q = \text{Freq. } a = \frac{2 \times \text{No. } aa + 1 \times \text{No. } Aa}{\text{Total No. Alleles}} = \frac{(2 \times 26) + (1 \times 12)}{80} = \frac{52 + 12}{80} = \frac{64}{80} = 0.8$$
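The same gene-counting calculation can be written as a short function. This is a minimal sketch using the genotype counts from the table above (2 AA, 12 Aa, 26 aa); the function name is illustrative only.

```python
def allele_frequencies(n_AA, n_Aa, n_aa):
    """Return (p, q), the frequencies of A and a, by counting alleles in a diploid sample."""
    total_alleles = 2 * (n_AA + n_Aa + n_aa)
    p = (2 * n_AA + n_Aa) / total_alleles  # each AA carries two A, each Aa carries one
    q = (2 * n_aa + n_Aa) / total_alleles  # each aa carries two a, each Aa carries one
    return p, q

p, q = allele_frequencies(2, 12, 26)
print(p, q)
```

The two frequencies always sum to 1, which is a quick check on the counts.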
Random Mating
• Random mating occurs when every individual in
the population has the same probability (chance)
to mate with every other individual in the
population.
• Random mating is also called panmixia, while
the population involved is called a panmictic
population.
• Panmixia usually occurs only in large
populations, with hundreds or
thousands of individuals.
Measure of central tendency
• The three most common measures of central tendency
o Mean
o Median
o Mode
Mean
• Mean is the arithmetic average of the values.
• To calculate the mean, all measurements are added and
the sum is then divided by the number of observations.
Median
• Is the value that exactly separates the upper half of the
distribution from the lower half.
• Median is the point located in such a way that 50% of the
scores are lower than the median and the other 50% are
greater than the median.
Mode
• Mode is the most frequent value.
• It is categorized as a measure of central tendency,
because a glance at a graph of the frequency distribution
shows the grouping about a central point
• Mode is the highest point in the hump or it is the most
frequent score.
Measure of dispersion
• Range
• Standard deviation
• Variance
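As a small illustration of these measures, the sketch below uses Python's standard `statistics` module on a made-up set of seven observations (the data are hypothetical, not from the course).

```python
import statistics

# Hypothetical observations, e.g. pods per plant
data = [4, 8, 6, 5, 3, 8, 9]

mean = statistics.mean(data)          # arithmetic average
median = statistics.median(data)      # middle value of the sorted data
mode = statistics.mode(data)          # most frequent value
value_range = max(data) - min(data)   # range
variance = statistics.variance(data)  # sample variance (divisor n - 1)
std_dev = statistics.stdev(data)      # sample standard deviation

print(mean, median, mode, value_range, variance, std_dev)
```

`statistics.variance` and `statistics.stdev` use the sample divisor n - 1; `pvariance` and `pstdev` would be the choice if the data were the whole population.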
Methods of Data Collection
• Observation
• Interview
• Questionnaire
2. These traits are controlled by many
genes, and greatly influenced by
environmental factors. Therefore, it is
important to know how much (percentage)
of the variation is heritable and how much
is not. This information is important when selecting
traits in breeding and selection programs.
3. Important in evolution studies.
4. Important in population studies.
Importance of Quantitative Genetics
1. Most economically important traits are
categorized here. Products of:
• Crops
• Livestock
• Micro-organisms
• Sampling techniques
– Probability (random) sampling
– Non-probability (non-random) sampling
• Probability (random) sampling
– Simple random sampling
– Systematic sampling
– Stratified sampling
– Cluster sampling
– Multistage random sampling
– Stratified multistage random sampling
• Non-probability (non-random) sampling
– Quota sampling
– Purposive sampling
– Convenience sampling
Sampling methods
•Probability Sample
• Every unit in the population has a chance (greater
than zero) of being selected in the sample
• Probability samples are the best to ensure
representativeness and precision
Simple random sampling
• Applicable when population is small, homogeneous
& readily available
• This is done by assigning a number to each unit in
the sampling frame.
• A table of random number or lottery system is used
to determine which units are to be selected.
• Systematic sampling
• Relies on arranging the target population according to
some ordering scheme and then selecting elements at
regular intervals through that ordered list.
• Involves a random start and then proceeds with the
selection of every kth element from then onwards.
• A simple example would be to select every 10th name
from the telephone directory
• Stratified sampling
• Where population embraces a number of distinct
categories, the frame can be organized into separate
"strata".
• Each stratum is then sampled as an independent
sub- population, out of which individual elements
can be randomly selected.
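The three probability schemes described so far can be sketched with Python's standard `random` module. This is an illustrative sketch only; the function names and the sampling frame of 100 numbered units are assumptions made for the example.

```python
import random

def simple_random_sample(frame, n, rng):
    """Draw n units without replacement; every unit has the same chance."""
    return rng.sample(frame, n)

def systematic_sample(frame, k, rng):
    """Random start within the first k units, then every k-th unit after it."""
    start = rng.randrange(k)
    return frame[start::k]

def stratified_sample(strata, n_per_stratum, rng):
    """Sample each stratum independently as its own sub-population."""
    return {name: rng.sample(units, n_per_stratum) for name, units in strata.items()}

rng = random.Random(2024)      # fixed seed so the draw can be repeated
frame = list(range(1, 101))    # hypothetical sampling frame: units numbered 1..100
print(simple_random_sample(frame, 5, rng))
print(systematic_sample(frame, 10, rng))
print(stratified_sample({"north": frame[:50], "south": frame[50:]}, 3, rng))
```

A seeded generator stands in for the table of random numbers or lottery mentioned above.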
• Cluster sampling
• An example of 'two-stage sampling'
• First stage a sample of areas is chosen;
• Second stage a sample of respondents within those
areas is selected.
• Population divided into clusters of homogeneous units,
usually based on geographical contiguity
• The most common variables used in the clustering
population are the geographical area, buildings, school,
etc
• Non- probability samples
– Probability of being chosen is unknown; cheaper, but
results cannot be generalized and there is potential for bias
• Convenience samples (ease of access)
– Sample is selected from elements of a population that
are easily accessible
• Purposive sampling (judgemental)
• You choose who you think should be in the study
• This is used primarily when there is a limited
number of people that have expertise in the area
being researched
• Quota sample
• The selection is non-random
• For example, interviewers might be tempted to
interview those people in the street who look most
helpful, or may choose to use accidental sampling
to question those closest to them, to save time.
With random mating, the frequencies of the mating types are:

           AA (P)   Aa (H)   aa (Q)
AA (P)     P²       PH       PQ
Aa (H)     PH       H²       HQ
aa (Q)     PQ       HQ       Q²
As a result of panmixia, progenies with the
following proportions are obtained:

Mating     Frequency   Progeny genotype frequency
                       AA            Aa                      aa
AA x AA    P²          P²            -                       -
AA x Aa    2PH         PH            PH                      -
AA x aa    2PQ         -             2PQ                     -
Aa x Aa    H²          1/4 H²        1/2 H²                  1/4 H²
Aa x aa    2HQ         -             HQ                      HQ
aa x aa    Q²          -             -                       Q²
Total      1           (P + 1/2H)²   2(P + 1/2H)(Q + 1/2H)   (Q + 1/2H)²
                       = p²          = 2pq                   = q²
In random mating, the union of gametes is also random. Therefore, in a
population with genotypes AA, Aa, aa and gamete frequencies A = p
and a = q, fusion between the two gametes will produce:

                  Male
                  A (p)     a (q)
Female   A (p)    AA p²     Aa pq
         a (q)    Aa pq     aa q²

i.e.
Genotype    AA    Aa    aa
Frequency   p²    2pq   q²
The formation of this new progeny population showed that the
composition of the succeeding generation depends on the gene
frequencies of the initial population.
The gene frequencies in this population are:
$$p = P + \tfrac{1}{2}H = p^2 + \tfrac{1}{2}(2pq) = p^2 + pq = p(p + q) = p$$
$$q = Q + \tfrac{1}{2}H = q^2 + \tfrac{1}{2}(2pq) = q^2 + pq = q(p + q) = q$$
• This shows that, in a panmictic population, gene and
genotype frequencies remain constant.
Hardy-Weinberg Law of Equilibrium
• In a large and panmictic population,
considering one locus (unlinked gene), in
the absence of migration, mutation and
selection, gene and genotype frequencies
in the population remain constant from one
generation to another.
Hardy-Weinberg equilibrium
The relationship between gene and genotype
frequencies in the population in Hardy-Weinberg
equilibrium is:
Gene frequency      Genotype frequency
A (p)    a (q)      AA (p²)   Aa (2pq)   aa (q²)
1        0          1         0          0
0.8      0.2        0.64      0.32       0.04
0.5      0.5        0.25      0.5        0.25
0.2      0.8        0.04      0.32       0.64
0        1          0         0          1
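The rows of this table, and the constancy of gene frequency under random mating, can be reproduced with a few lines of Python. This is a minimal sketch; the function names are illustrative only.

```python
def hardy_weinberg(p):
    """Genotype frequencies (AA, Aa, aa) expected after random mating."""
    q = 1 - p
    return p * p, 2 * p * q, q * q

def gene_frequency_of_A(f_AA, f_Aa, f_aa):
    """Frequency of A recovered from genotype frequencies: p = P + 1/2 H."""
    return f_AA + 0.5 * f_Aa

for p in (1.0, 0.8, 0.5, 0.2, 0.0):
    f_AA, f_Aa, f_aa = hardy_weinberg(p)
    # The gene frequency in the progeny equals the parental gene frequency.
    print(p, f_AA, f_Aa, f_aa, gene_frequency_of_A(f_AA, f_Aa, f_aa))
```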
Hardy-Weinberg Law of Equilibrium
involves four stages, with conditions that must hold for it to be true:
1. Gene frequency of parents: normal gene segregation, normal parents and gametes,
random mating of gametes (large population)
2. Zygote genotype frequency
3. Progeny genotype frequency: equal viability
4. Progeny gene frequency
Multiple Alleles
• In some situations, there are more than two alleles at a locus. In
this case, the population will reach equilibrium after one generation
of random mating. This can be shown either by
- random mating of gametes, or
- random mating of genotypes
• Assuming the case of three alleles at one locus: A, a' and a

Gene        A    a'   a
Frequency   p    q    r

Genotype    AA   Aa'   Aa    a'a'   a'a   aa
Frequency   p²   2pq   2pr   q²     2qr   r²
The proof, after random mating of
gametes:

           A (p)      a' (q)      a (r)
A (p)      AA p²      Aa' pq      Aa pr
a' (q)     Aa' pq     a'a' q²     a'a qr
a (r)      Aa pr      a'a qr      aa r²

Inference:
Genotype    AA   Aa'   Aa    a'a'   a'a   aa
Frequency   p²   2pq   2pr   q²     2qr   r²
Symbol      P    Q     R     S      T     U
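After random mating of gametes:
$$\begin{aligned}
p_A &= \frac{2P + Q + R}{2(P + Q + R + S + T + U)} = \frac{2P + Q + R}{2} \\
&= P + \tfrac{1}{2}Q + \tfrac{1}{2}R \\
&= p^2 + \tfrac{1}{2}(2pq) + \tfrac{1}{2}(2pr) \\
&= p^2 + pq + pr = p(p + q + r) = p
\end{aligned}$$
$$\begin{aligned}
q_{a'} &= S + \tfrac{1}{2}Q + \tfrac{1}{2}T \\
&= q^2 + \tfrac{1}{2}(2pq) + \tfrac{1}{2}(2qr) \\
&= q^2 + pq + qr = q(q + p + r) = q
\end{aligned}$$
$$\begin{aligned}
r_a &= U + \tfrac{1}{2}R + \tfrac{1}{2}T \\
&= r^2 + \tfrac{1}{2}(2pr) + \tfrac{1}{2}(2qr) \\
&= r^2 + pr + qr = r(r + p + q) = r
\end{aligned}$$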
Multiple Alleles
• However, sometimes the genotypes
cannot all be distinguished by phenotype, for example,
Genotype      Aa'    AA, Aa      a'a', a'a    aa
Blood group   AB     A           B            O
Frequency     2pq    p² + 2pr    q² + 2qr     r²
The easiest way to calculate the gene frequencies is by the
reverse method, as follows:
$$r_a = \sqrt{r^2} = \sqrt{O}$$
pA ?
The reverse method, cont.
$$B + O = q^2 + 2qr + r^2 = (q + r)^2$$
but q + r = 1 - p, therefore
$$(1 - p)^2 = B + O, \qquad 1 - p = \sqrt{B + O}, \qquad p = 1 - \sqrt{B + O}$$
qa' ?
$$A + O = p^2 + 2pr + r^2 = (p + r)^2 = (1 - q)^2$$
$$\sqrt{A + O} = 1 - q, \qquad q = 1 - \sqrt{A + O}$$
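The reverse method amounts to taking square roots: r is the square root of the O frequency, and since (q + r)² = B + O and (p + r)² = A + O, p and q follow from the square roots of those sums. The sketch below assumes Hardy-Weinberg proportions; the phenotype frequencies are hypothetical values built from p = 0.3, q = 0.1, r = 0.6 so that the answer is known in advance.

```python
import math

def abo_gene_frequencies(freq_A, freq_B, freq_O):
    """Estimate (p, q, r) for alleles A, a' and a from phenotype frequencies."""
    r = math.sqrt(freq_O)                # O = r^2
    p = 1 - math.sqrt(freq_B + freq_O)   # B + O = (q + r)^2 = (1 - p)^2
    q = 1 - math.sqrt(freq_A + freq_O)   # A + O = (p + r)^2 = (1 - q)^2
    return p, q, r

# Hypothetical phenotype frequencies: A = p^2 + 2pr, B = q^2 + 2qr, O = r^2
p, q, r = abo_gene_frequencies(freq_A=0.45, freq_B=0.13, freq_O=0.36)
print(p, q, r)
```

With real data the three estimates will not sum exactly to 1, because the AB class is not used by this method.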
Factors affecting Equilibrium
1. Sex Linkage
• Some genes are located on the sex chromosomes,
so their inheritance depends on sex.
There are two combinations of sex
chromosomes: homogametic (XX - female) and
heterogametic (XY or XO - male). Therefore,
more genotypes are possible.
Sex Linkage
For one locus, A/a, the possible genotypes
are:

Male (XY)        Female (XX)
A      a         AA      Aa      aa
XAY    XaY       XAXA    XAXa    XaXa

• Assuming that the gene frequencies in the
female and male populations are equal,
A = p, a = q, the panmictic population will
reach equilibrium. The mating frequencies are:

                 Female
                 AA (p²)   Aa (2pq)   aa (q²)
Male   A (p)     p³        2p²q       pq²
       a (q)     p²q       2pq²       q³
Sex Linkage
Progenies
Mating             Freq.    Female progeny         Male progeny
(female x male)             AA     Aa     aa       A      a
AA x A             p³       p³     -      -        p³     -
Aa x A             2p²q     p²q    p²q    -        p²q    p²q
aa x A             pq²      -      pq²    -        -      pq²
AA x a             p²q      -      p²q    -        p²q    -
Aa x a             2pq²     -      pq²    pq²      pq²    pq²
aa x a             q³       -      -      q³       -      q³

Totals:
Female AA: p³ + p²q = p²(p + q) = p²
Female Aa: 2pq² + 2p²q = 2pq(q + p) = 2pq
Female aa: pq² + q³ = q²(p + q) = q²
Male A: p³ + 2p²q + pq² = p(p² + 2pq + q²) = p
Male a: q³ + 2pq² + p²q = q(q² + 2pq + p²) = q
Sex Linkage
Equilibrium will only be reached if the gene
frequencies in the males and females are the
same, i.e.,
pf = pm
Example:
Let pf = pm = 0.4; qf = qm = 0.6,

Male            Female
A      a        AA      Aa      aa
0.4    0.6      0.16    0.48    0.36
Sex Linkage
• If the gene frequencies in the males and females
are not equal, equilibrium will not be reached
after one generation of panmixia. This is shown
below:

            Female             Male
Genotype    AA    Aa    aa     A    a
Frequency   P     H     Q      R    S

pf = P + 1/2H        pm = R
Mean gene frequency over both sexes: p = 1/3 pm + 2/3 pf
Sex Linkage
• After panmixia, male progenies receive their genes only from the
female parents, while female progenies receive half of their genes
from the female parents and the other half from the male parents. With a prime (')
marking the previous generation, the
gene frequencies after one generation of panmixia are:
pm = pf'
pf = 1/2 (pf' + pm')
pf - pm = 1/2 (pf' + pm') - pf'
= -1/2pf' + 1/2pm'
= -1/2(pf' - pm')
i.e.;
1. the difference in gene frequencies between the males and females
is halved after every generation of panmixia,
2. the direction of the difference reverses every generation.
Example:
Initial population:
Male Female
A a AA Aa aa
0.2 0.8 0.2 0.6 0.2
pm = 0.2 pf = 0.2 + 1/2 (0.6)
= 0.5
pm = pf'
pf = 1/2(pf' + pm')
p = 1/3(0.2) + 2/3(0.5)
= 0.4
Generation pm pf pf - pm
________________________________________________________________
0 0.2 0.5 +0.3
1 0.5 0.35 -0.15
2 0.35 0.425 +0.075
3 0.425 0.3875 -0.0375
4 0.3875 0.40625 +0.01875
5 0.40625 0.396875 -0.009375
6 0.396875 0.4015625 +0.0046875
.
.
.
n 0.40000 0.40000 0.00000
____________________________________________________________
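The table can be regenerated by applying the two recurrence relations, pm = pf' and pf = 1/2(pf' + pm'), generation after generation. This is a minimal sketch; the function name is illustrative only.

```python
def sex_linked_trajectory(pm, pf, generations):
    """Return rows (generation, pm, pf, pf - pm) under random mating."""
    rows = [(0, pm, pf, pf - pm)]
    for g in range(1, generations + 1):
        # Males take their X from the mothers; females average both parents.
        pm, pf = pf, 0.5 * (pf + pm)
        rows.append((g, pm, pf, pf - pm))
    return rows

for g, pm, pf, diff in sex_linked_trajectory(pm=0.2, pf=0.5, generations=6):
    print(g, pm, pf, diff)
```

The weighted mean 1/3 pm + 2/3 pf stays at 0.4 in every generation, which is the value both sexes approach.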
[Figure: pm and pf plotted against generation (0 to n); vertical axis from 0 to 0.6]
2. Two (or more) Linked Loci
• Equilibrium in the population is reached after one
generation of random mating if all loci are considered
separately.
• Equilibrium is not reached if the loci are considered
together. The rate in achieving equilibrium will be slower
if the loci are more tightly linked.
Assuming 2 loci A/a and B/b, with the gene frequency of:
A a B b
p q r s
At equilibrium, the genotype frequencies are:
AABB   AABb    AAbb   AaBB    AaBb    Aabb    aaBB   aaBb    aabb
p²r²   2p²rs   p²s²   2pqr²   4pqrs   2pqs²   q²r²   2q²rs   q²s²
Whether equilibrium is reached depends on the gamete frequencies:
Gamete:      AB   Ab   aB   ab
Frequency:   pr   ps   qr   qs
• Equilibrium will be reached after one generation of random mating, if
all the gene frequencies are the same, i.e. p=q=r=s=0.5; or
pr=ps=qr=qs=0.25.
• At equilibrium, the product of the frequencies of the two coupling
phase gametes is expected to equal the product of the frequencies of the
two repulsion phase gametes.
[Figure: a pair of chromosomes, one carrying A and B and the other a and b, with a crossover (X) between the two loci]
AB, ab = coupling phase gametes
Ab, aB = repulsion phase gametes
2. Two (or more) Linked Loci
At equilibrium,
(AB)(ab) = (Ab)(aB)
for example: A = B = 0.6, a = b = 0.4;
AB Ab aB ab
0.36 0.24 0.24 0.16
(0.36 x 0.16) = (0.24 x 0.24)
0.0576 = 0.0576
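The equilibrium condition can be written as a single number, D = (AB)(ab) - (Ab)(aB), which is zero at equilibrium. The sketch below checks the example above and one hypothetical set of gamete frequencies that is not at equilibrium; the function name is illustrative only.

```python
def disequilibrium(f_AB, f_Ab, f_aB, f_ab):
    """Coupling product minus repulsion product; zero at linkage equilibrium."""
    return f_AB * f_ab - f_Ab * f_aB

# Example from the text: A = B = 0.6, a = b = 0.4
print(disequilibrium(0.36, 0.24, 0.24, 0.16))

# Hypothetical population with an excess of coupling gametes
print(disequilibrium(0.50, 0.10, 0.10, 0.30))
```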
3. Changes in Gene Frequencies in
Populations
• According to Hardy-Weinberg Law of Equilibrium,
considering only one locus (gene), a population will
be at equilibrium after one generation of random
mating, in the absence of migration, mutation and
selection.
Migration
Let, in a large population:
m = proportion of new immigrants
1-m = proportion of natives.
Let the gene frequency of a certain gene among the
immigrants = qm and among the natives = q0. Then, the
gene frequency in the combined population:
$$q_1 = m q_m + (1 - m) q_0 = m(q_m - q_0) + q_0$$
Change in gene frequency as a
result of immigration:
$$\Delta q = q_1 - q_0 = m(q_m - q_0)$$
• It can therefore be concluded that the
change in gene frequency in the new
population depends on:
– migration rate, and
– the difference in gene frequencies between
the immigrants and the natives.
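A short numerical sketch of the migration formula follows. The values m = 0.1, qm = 0.5 and q0 = 0.2 are hypothetical, chosen only to show how the change scales with the migration rate and with the difference between immigrants and natives.

```python
def frequency_after_migration(m, q_m, q_0):
    """Gene frequency in the mixed population: q1 = m*qm + (1 - m)*q0."""
    return m * q_m + (1 - m) * q_0

def delta_q_migration(m, q_m, q_0):
    """Change in gene frequency: m * (qm - q0)."""
    return m * (q_m - q_0)

m, q_m, q_0 = 0.1, 0.5, 0.2   # hypothetical values
q_1 = frequency_after_migration(m, q_m, q_0)
print(q_1, delta_q_migration(m, q_m, q_0))
```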
Mutation
• Mutation is the sudden change of a gene
(allele) in a population to a different form.
The effect on the population depends on
the kinds of mutation.
2 kinds of mutation:
a. Non-recurrent mutation
AA → Aa
• This kind involves only a small change in a
large population. It is not important and not
effective, because its product has a small
chance of surviving in a large population.
It is normally lost and does not show changes in the
succeeding generation, as it usually occurs in the
form of a heterozygote.
b. Recurrent mutation
This kind affects the gene frequency. It
recurs with a certain frequency
in the population.
i. Unidirectional mutation
A → a
Let the mutation rate per generation = μ (μ = rate of
gene A changing to a per generation)
If frequency of A in a population = p0,
Freq. of new a genes in the next generation = μp0.
At equilibrium, with ν taken as the rate of the reverse mutation (a changing to A) per generation,
$$\mu p_0 = \nu q_0, \quad \text{or } \Delta q = 0$$
$$q_0 = \frac{\mu p_0}{\nu} = \frac{\mu(1 - q_0)}{\nu} = \frac{\mu - \mu q_0}{\nu}$$
$$\nu q_0 = \mu - \mu q_0, \qquad q_0(\mu + \nu) = \mu, \qquad q_0 = \frac{\mu}{\mu + \nu}$$
$$\therefore \hat{q} = \frac{\mu}{\mu + \nu}$$
∴ The equilibrium frequency is not influenced by the initial gene frequency, but by the rates of
mutation.
The effect of mutation on gene frequency:
1. Normally low; 10⁻⁵ to 10⁻⁶ per generation (1 in 100,000 or
1,000,000 gametes carries the new allele mutated at any locus)
2. Mutations are more frequent from the wild type to mutant type,
rather than the reverse.
Example:
μ = 0.00003, ν = 0.00002. Gene frequency at equilibrium:
$$\hat{q} = \frac{0.00003}{0.00003 + 0.00002} = 0.6$$
[Figure: gene frequency scale from 0 to 0.9 showing the opposing mutation pressures μ = 0.00003 and ν = 0.00002]
Number of generations needed for a certain frequency to be reached:
$$n = \frac{\ln\left(\dfrac{q_0 - \hat{q}}{q_n - \hat{q}}\right)}{\mu + \nu}$$
Example:
μ = 0.00003, ν = 0.00002, q̂ = 0.6, q0 = 0.10, qn = 0.20
$$n = \frac{\ln\left(\dfrac{0.1 - 0.6}{0.2 - 0.6}\right)}{0.00003 + 0.00002} = \frac{\ln 1.25}{0.00005} \approx 4463 \text{ generations}$$
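Both mutation results, the equilibrium frequency and the number of generations needed to move between two frequencies, can be checked numerically. This is a minimal sketch in which `mu` stands for the rate of A changing to a and `nu` for the reverse rate; the function names are illustrative only.

```python
import math

def equilibrium_q(mu, nu):
    """Equilibrium frequency of a under two-way mutation: mu / (mu + nu)."""
    return mu / (mu + nu)

def generations_needed(q0, qn, mu, nu):
    """Generations for q to move from q0 to qn under mutation pressure alone."""
    q_hat = equilibrium_q(mu, nu)
    return math.log((q0 - q_hat) / (qn - q_hat)) / (mu + nu)

mu, nu = 0.00003, 0.00002
print(equilibrium_q(mu, nu))                 # close to 0.6
print(generations_needed(0.10, 0.20, mu, nu))  # close to 4463
```

The large number of generations shows how weak mutation is as a force for changing gene frequency.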
Two factors determine fitness:
a. Long life span.
b. Number of offspring produced within a period.
These two factors lead to a higher contribution to the
succeeding generation. If the difference in fitness is
associated with the presence or absence of a gene in the
genotype of an individual, selection is said to act
on the gene.
• The gene frequency in the offspring will not be the
same as it was in the parents, because individuals in the
parental generation contribute genes to the next
generation at different rates.
• Selection results in changes in gene frequencies,
and hence genotype frequencies.
Kinds of Selection
• The kinds of selection depend on the degree of
dominance of the gene involved.
1. Selection Against Recessives
s = coefficient of selection;
1 = fitness: contribution of the favoured genotype;
1 - s: contribution of the genotypes selected against.
Degree of dominance vs. fitness:
a. No dominance (selection against a)
Genotype   aa       Aa          AA
Fitness    1 - s    1 - 1/2s    1
b. Partial dominance (selection against a; h = degree of dominance)
Genotype   aa       Aa        AA
Fitness    1 - s    1 - hs    1
c. Complete dominance (selection against a)
Genotype   aa       Aa; AA
Fitness    1 - s    1
d. Overdominance (selection against the homozygotes AA and aa)
Genotype   aa        AA        Aa
Fitness    1 - s2    1 - s1    1
Selection Against Recessives - Complete
Dominance
(Partial Elimination of Recessives)
                       AA    Aa     aa           Total
Initial freq.          p²    2pq    q²           1
Sel. coef.             0     0      s
Fitness                1     1      1 - s
Gametic contribution   p²    2pq    q²(1 - s)    1 - sq²
q1 = freq. of gene 'a' in the following generation
$$q_1 = \frac{q^2(1 - s) + pq}{1 - sq^2}$$
$$\begin{aligned}
\Delta q &= q_1 - q \\
&= \frac{pq + (1 - s)q^2}{1 - sq^2} - q \\
&= \frac{pq + (1 - s)q^2 - q(1 - sq^2)}{1 - sq^2} \\
&= \frac{pq + q^2 - sq^2 - q + sq^3}{1 - sq^2} \\
&= \frac{q - q^2 + q^2 - sq^2 - q + sq^3}{1 - sq^2} \\
&= \frac{-sq^2(1 - q)}{1 - sq^2}
\end{aligned}$$
Partial Elimination of Recessives
• Determining factors:
1. initial gene freq.
2. selection coefficient.
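How the change depends on the two determining factors can be seen by evaluating Δq = -sq²(1 - q)/(1 - sq²) directly. The sketch below also recomputes q1 from the gametic contributions as a cross-check; the values of q and s are hypothetical and the function names are illustrative only.

```python
def next_q_recessive(q, s):
    """q after one generation of selection against aa (fitness 1 - s)."""
    p = 1 - q
    return (q * q * (1 - s) + p * q) / (1 - s * q * q)

def delta_q_recessive(q, s):
    """Closed form of the change: -s q^2 (1 - q) / (1 - s q^2)."""
    return -s * q * q * (1 - q) / (1 - s * q * q)

for q in (0.1, 0.5, 0.9):        # hypothetical initial frequencies
    for s in (0.1, 0.5):         # hypothetical selection coefficients
        print(q, s, delta_q_recessive(q, s), next_q_recessive(q, s) - q)
```

The two columns agree, and the change is small whenever q is low because few recessive homozygotes are exposed to selection.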
2. Complete Elimination of Recessives
                       AA    Aa     aa
Initial freq.          p²    2pq    q²
Fitness                1     1      0
Gametic contribution   p²    2pq    0
$$q_1 = \frac{q}{1 + q}$$
$$\Delta q = q_1 - q_0 = \frac{-q^2}{1 + q}$$
Δq depends on the initial gene frequency.
The frequency decreases at a higher rate if the initial frequency is high,
and at a lower rate as the gene frequency is
gradually reduced.
$$q_2 = \frac{q_1}{1 + q_1} = \frac{q/(1 + q)}{1 + q/(1 + q)} = \frac{q/(1 + q)}{(1 + q + q)/(1 + q)} = \frac{q}{1 + q} \times \frac{1 + q}{1 + 2q} = \frac{q}{1 + 2q}$$
$$q_3 = \frac{q}{1 + 3q}, \qquad q_n = \frac{q}{1 + nq}$$
$$q_n(1 + nq) = q, \qquad q_n + n q_n q = q, \qquad n = \frac{q - q_n}{q\,q_n} = \frac{1}{q_n} - \frac{1}{q}$$
Example:
q0 = 0.2
qn = 0.1
How many generations are required to reduce the frequency of
the recessive gene from q0 to qn through selection by
elimination of all recessives?
$$n = \frac{1}{0.1} - \frac{1}{0.2} = 10 - 5 = 5 \text{ generations}$$
If q0 = 0.02 and qn = 0.01,
$$n = \frac{1}{0.01} - \frac{1}{0.02} = 100 - 50 = 50 \text{ generations}$$
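The formula n = 1/qn - 1/q0 can be cross-checked by applying q' = q/(1 + q) repeatedly. This is a minimal sketch; the function names are illustrative only.

```python
def generations_to_reduce(q0, qn):
    """Generations of complete elimination of recessives needed to go from q0 to qn."""
    return 1 / qn - 1 / q0

def iterate_complete_elimination(q0, generations):
    """Apply q' = q / (1 + q) once per generation."""
    q = q0
    for _ in range(generations):
        q = q / (1 + q)
    return q

print(generations_to_reduce(0.2, 0.1), iterate_complete_elimination(0.2, 5))
print(generations_to_reduce(0.02, 0.01), iterate_complete_elimination(0.02, 50))
```

Halving a rare recessive takes ten times as long as halving a common one in this example, which is why complete elimination of a rare recessive is slow in practice.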
Selection Against Heterozygotes
                       AA    Aa            aa    Total
Initial freq.          p²    2pq           q²    1
Fitness                1     1 - s         1     -
Gametic contribution   p²    2(1 - s)pq    q²    1 - 2pqs
$$q_1 = \frac{pq(1 - s) + q^2}{1 - 2pqs} = \frac{q - spq}{1 - 2pqs}$$
$$\begin{aligned}
\Delta q &= q_1 - q \\
&= \frac{q - spq}{1 - 2pqs} - q \\
&= \frac{q - spq - q + 2pq^2s}{1 - 2pqs} \\
&= \frac{spq(2q - 1)}{1 - 2pqs}
\end{aligned}$$
If s is small, 2pqs approaches 0, so
$$\Delta q \approx spq(2q - 1) = 2spq\left(q - \tfrac{1}{2}\right)$$
If q = 1/2, Δq = 0;
If q > 1/2, Δq is positive, and q increases
every generation.
If q < 1/2, Δq is negative, and q decreases
every generation.
[Figure: Δq plotted against q from 0 to 1.0; Δq is negative below q = 1/2, zero at q = 1/2 and positive above it]
Selection For Heterozygotes
• The normal case in natural situations
• Both alleles will be maintained in the population
and will not be lost.
• With random mating, equilibrium is reached.
                       AA            Aa     aa            Total
Initial freq.          p²            2pq    q²            1
Fitness                1 - s1        1      1 - s2        -
Gametic contribution   p²(1 - s1)    2pq    q²(1 - s2)    1 - s1p² - s2q²
$$q_1 = \frac{pq + q^2(1 - s_2)}{1 - s_1p^2 - s_2q^2} = \frac{q - s_2q^2}{1 - s_1p^2 - s_2q^2} \qquad (p + q = 1)$$
$$\Delta q = \frac{q - s_2q^2}{1 - s_1p^2 - s_2q^2} - q$$
When s is small, the denominator is close to 1, so
$$\begin{aligned}
\Delta q &\approx q - s_2q^2 - q + s_1p^2q + s_2q^3 \\
&= s_1p^2q - s_2q^2(1 - q) \\
&= s_1p^2q - s_2pq^2 \\
&= pq(s_1p - s_2q)
\end{aligned}$$
Selection For Heterozygotes
• If Δq = 0,
s1p = s2q
s1(1 - q) = s2q
s1 - s1q = s2q
$$\hat{q} \text{ (at equilibrium)} = \frac{s_1}{s_1 + s_2}$$
This is a balanced polymorphism.
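The contrast between selection for and against heterozygotes can be seen by iterating the general one-locus recurrence. The sketch below is an illustration only: the fitness values (s1 = 0.2, s2 = 0.3 for overdominance, s = 0.2 against heterozygotes) and the starting frequencies are hypothetical.

```python
def next_q(q, w_AA, w_Aa, w_aa):
    """Frequency of a after one generation of selection with the given fitnesses."""
    p = 1 - q
    mean_fitness = p * p * w_AA + 2 * p * q * w_Aa + q * q * w_aa
    return (p * q * w_Aa + q * q * w_aa) / mean_fitness

def iterate(q, fitness, generations):
    """Apply next_q repeatedly; fitness is the tuple (w_AA, w_Aa, w_aa)."""
    for _ in range(generations):
        q = next_q(q, *fitness)
    return q

s1, s2 = 0.2, 0.3                     # hypothetical selection coefficients
q_hat = s1 / (s1 + s2)                # predicted equilibrium, 0.4

# Selection for heterozygotes: q returns to the equilibrium from either side.
q_from_high = iterate(0.9, (1 - s1, 1, 1 - s2), 500)
q_from_low = iterate(0.05, (1 - s1, 1, 1 - s2), 500)

# Selection against heterozygotes (s = 0.2): q moves away from 1/2.
q_against = iterate(0.6, (1, 1 - 0.2, 1), 500)

print(q_hat, q_from_high, q_from_low, q_against)
```

Under overdominance both starting points converge on s1/(s1 + s2), while under selection against heterozygotes a start above 1/2 drifts to fixation of a.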
Conclusion
1. In natural selection against heterozygotes (H),
q will increase or decrease
depending on q and s; q remains
constant only at q = ½
2. In selection for H, no gene will be lost or
eliminated, and the rate of change in gene frequency
depends on the initial gene frequency and
selection coefficients
3. For selection against recessives, the frequency of the recessive
falls quickly if the initial frequency is
high, and slowly if it is low.
4. SMALL POPULATION SIZE
• Introduction
– In previous lectures, we discussed agents
of change in gene and genotype frequencies
where the population size is large, i.e. in the
absence of migration, mutation or selection,
gene and genotype frequencies remain
constant from one generation to another
under random mating in a large population
(systematic process).
SMALL POPULATION SIZE
• These features are not true in small populations.
The gene frequencies are exposed to random
increase and decrease which occur from gamete
sampling, because small populations can be
considered as samples of large populations. If
the sample size is not large enough, it will not
represent the large population, and thus
changes of gene frequencies occur. The
process of change in gene frequencies at
random in a small population is called a
dispersive process.
Prevailing Situations in a Dispersive Process:
1. Random Drift (Wright's Effect)
- Changes in gene frequencies at random.
- Frequency changes irregularly from one generation
to another, and normally does not return to its initial value.
2. Differentiation among sub-populations
Drifts occur independently within the small populations which are
contained in the large population. Matings are only confined within
the sub-populations. No random mixing of the large population.
Prevailing Situations in a Dispersive Process
[Figure: a large population containing small sub-populations]
Prevailing Situations in a Dispersive Process
3. Uniformity in small populations
– Genetic variations within small populations become
small.
– Because of inbreeding, etc., many unfavourable effects
are seen.
4. Homozygosity increases among individuals
within small population.
- many unfavourable effects to population.
- fertility
- viability, etc.
Incomplete Block Designs
• Large number of treatments to be tested
• It is difficult to get uniform blocks large enough to
accommodate a complete replication of all the
treatments
• Precision increases as the block size decreases
• Smaller blocks are preferred to larger ones
Idealised Population
• A large population where mating is at random,
and population then sub-divided into many sub-
populations. This is due to geographical or
ecological factors (natural), or controlled mating
(laboratory or controlled environment).
• Initial population, which undergoes random
mating is called base population, and the sub-
populations called lines.
Idealised Population
Lines
Base populations
• Characteristics of lines can be combined to form
the characteristics of the base population.
Balanced Incomplete Block Design (BIBD)
• Every pair of treatments occurs once in the same
incomplete block
• All pairs of treatments are compared with the same degree
of precision
• Each treatment occurs together with every other treatment
in the same block equal number of times
Balanced Incomplete Block Design (BIBD)
• Each block contains the same number of units
• Each treatment occurs the same number of times in total
• Each pair of treatments occurs together the same number
of times in total
• A design that satisfies these conditions is called a Balanced
Incomplete Block Design
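The three conditions can be checked mechanically for any proposed layout. The sketch below is an illustration only: it uses a standard textbook layout of 7 treatments in 7 blocks of 3 (not a design taken from these slides), and the function name `is_balanced` is made up for the example.

```python
from collections import Counter
from itertools import combinations

def is_balanced(blocks):
    """Check equal block size, equal replication and equal pair concurrence."""
    block_sizes = {len(block) for block in blocks}
    replications = Counter(t for block in blocks for t in block)
    pair_counts = Counter(
        frozenset(pair) for block in blocks for pair in combinations(sorted(block), 2)
    )
    treatments = sorted(replications)
    every_pair = [frozenset(pair) for pair in combinations(treatments, 2)]
    same_block_size = len(block_sizes) == 1
    same_replication = len(set(replications.values())) == 1
    # A Counter returns 0 for pairs that never share a block.
    same_concurrence = len({pair_counts[pair] for pair in every_pair}) == 1
    return same_block_size and same_replication and same_concurrence

# 7 treatments, 7 blocks of 3 units; every pair of treatments meets exactly once.
design = [(1, 2, 3), (1, 4, 5), (1, 6, 7), (2, 4, 6), (2, 5, 7), (3, 4, 7), (3, 5, 6)]
print(is_balanced(design))        # True
print(is_balanced(design[:-1]))   # False: dropping a block breaks the balance
```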
Characteristics of Idealised Population
1. Mating only occurs among individuals within a line.
= No migration between lines.
2. Generations do not overlap among each other.
3. Number of individuals in each line is the same, = N
4. Random mating among individuals within lines.
5. No selection or mutation at any level.
RCBD with
seven treatments
Balanced Incomplete Block Design (BIBD)
Idealised Population
Base population (n = ∞)
Individuals   N     N     N     N
Gametes       2N    2N    2N    2N
Individuals   N     N     N     N
Sampling in Idealised Population
• For an idealised population, the expected gene frequency is unchanged: q = q0
Because of sampling error, the change in gene frequency has variance
$$\sigma^2_{\Delta q} = \frac{p_0 q_0}{2N}$$
This difference arises when sampling is done within each of the lines.
It causes the final gene frequency in a line to differ from the initial gene
frequency,
i.e. q ≠ q0
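The sampling variance p0q0/2N can be illustrated by simulation: each line draws 2N gametes at random from a base population with gene frequency q0, and the variance of the resulting frequencies across lines is compared with the formula. This is a sketch under the idealised-population assumptions; q0 = 0.5, N = 20, 5000 lines and the seed are arbitrary choices for the illustration.

```python
import random
import statistics

def drift_one_generation(q0, n_individuals, n_lines, seed=1):
    """Gene frequency in each line after sampling 2N gametes from the base population."""
    rng = random.Random(seed)
    gametes = 2 * n_individuals
    frequencies = []
    for _ in range(n_lines):
        # Each gamete carries allele a with probability q0.
        count_a = sum(rng.random() < q0 for _ in range(gametes))
        frequencies.append(count_a / gametes)
    return frequencies

q0, N = 0.5, 20
frequencies = drift_one_generation(q0, N, n_lines=5000)
expected_variance = q0 * (1 - q0) / (2 * N)
observed_variance = statistics.pvariance(frequencies)
print(expected_variance, observed_variance, statistics.mean(frequencies))
```

The mean across lines stays close to q0 while individual lines scatter around it, which is the dispersive process described above.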
Sampling in Idealised Population
• sub-populations have different
characteristics
– random drift
– some genes will be lost, while others fixed
in the population
Balanced Incomplete Block Design (BIBD)
The sum of squares for total, replication, treatment and error are computed as in any other
designs. The sum of squares due to block is a new statistic to be computed in lattice
designs.
1. Correction factor: $C.F. = \dfrac{(GT)^2}{rq^2}$
2. Total SS $= \sum\sum X_{ij(l)}^2 - C.F.$
3. $SSR = \dfrac{\sum R_j^2}{q^2} - C.F.$
4. $SSB = \dfrac{\sum C_{ij}^2}{qr(r-1)} - \dfrac{\sum C_i^2}{q^2 r(r-1)}$
5. $SSt = \dfrac{\sum T_i^2}{r} - C.F.$
6. SSE = Total SS – SSR – SSB – SSt
Balanced Incomplete Block Design (BIBD)
Practical Example……..
1. A breeder would like to evaluate 16 highly advanced hybrids in a balanced lattice
design because the experimental area is variable in soil acidity and the direction of
the gradient is unknown. He conducted the experiment and obtained the
following measurements. The statistical objective of this example is to become
familiar with the analysis of variance for a balanced lattice design.
Balanced Incomplete Block Design (BIBD)
The stepwise analysis is as follows:
1. Compute Bj, the sum of the totals of all blocks in which treatment j appears (Bik = total of block k)
a. B1 = Bi1 + Bi5 + Bi9 + Bi13 + Bi17
= 62.4 + 63.2 + 65.1 + 81.1 + 58.1
= 329.9
b. B2 = Bi1 + Bi6 + Bi10 + Bi14 + Bi18
= 62.4 + 58.9 + 65.7 + 74.2 + 75.0
= 336.2
c. B3 = Bi1 + Bi7 + Bi11 + Bi15 + Bi19
= 62.4 + 72.5 + 63.7 + 69.7 + 77.9
= 346.2
d. B4 = Bi1 + Bi8 + Bi12 + Bi16 + Bi20
= 62.4 + 76.8 + 69.0 + 69.5 + 83.5
= 361.2
e. B5 = Bi2 + Bi5 + Bi10 + Bi15 + Bi20
= 61.6 + 63.2 + 65.7 + 69.7 + 83.5
= 343.7
f. B6 = Bi2 + Bi6 + Bi9 + Bi16 + Bi19
= 61.6 + 58.9 + 65.1 + 69.5 + 77.9
= 333.0
g. B7 = Bi2 + Bi7 + Bi12 + Bi13 + Bi18
= 61.6 + 72.5 + 69.0 + 81.1 + 75.0
= 359.2
h. B8 = Bi2 + Bi8 + Bi11 + Bi14 + Bi17
= 61.6 + 76.8 + 63.7 + 74.2 + 58.1
= 334.4
i. B9 = Bi3 + Bi5 + Bi11 + Bi16 + Bi18
= 60.9 + 63.2 + 63.7 + 69.5 + 75.0
= 332.3
Balanced Incomplete Block Design (BIBD)
j. B10 = Bi3 + Bi6 + Bi12 + Bi15 + Bi17
= 60.9 + 58.9 + 69.0 + 69.7 + 58.1
= 316.6
k. B11 = Bi3 + Bi7 + Bi9 + Bi14 + Bi20
= 60.9 + 72.5 + 65.1 + 74.2 + 83.5
= 356.2
l. B12 = Bi3 + Bi8 + Bi10 + Bi13 + Bi19
= 60.9 + 76.8 + 65.7 + 81.1 + 77.9
= 362.4
m. B13 = Bi4 + Bi5 + Bi12 + Bi14 + Bi19
= 73.7 + 63.2 + 69.0 + 74.2 + 77.9
= 358.0
n. B14 = Bi4 + Bi6 + Bi11 + Bi13 + Bi20
= 73.7 + 58.9 + 63.7 + 81.1 + 83.5
= 360.9
o. B15 = Bi4 + Bi7 + Bi10 + Bi16 + Bi17
= 73.7 + 72.5 + 65.7 + 69.5 + 58.1
= 339.5
p. B16 = Bi4 + Bi8 + Bi9 + Bi15 + Bi18
= 73.7 + 76.8 + 65.1 + 69.7 + 75.0
= 360.3
Balanced Incomplete Block Design (BIBD)
1. Compute Wj
a. W1 = qT1 – (q+1)B1 + GT
= (4x80.7) – (4+1) x 329.9 + 1382.5
= 55.8
b. W2 = qT2 – (q+1)B2 + GT
= (4x80.4) – (4+1) x 336.2 + 1382.5
= 23.1
c. W3 = qT3 – (q+1)B3 + GT
= (4x91.3) – (4+1) x 346.2 + 1382.5
= 16.7
d. W4 = qT4 – (q+1)B4 + GT
= (4x91.5) – (4+1) x 361.2 + 1382.5
= -57.5
e. W5 = qT5 – (q+1)B5 + GT
= (4x81.3) – (4+1) x 343.7 + 1382.5
= -10.8
f. W6 = qT6 – (q+1)B6 + GT
= (4x85.3) – (4+1) x 333 + 1382.5
= 58.7
g. W7 = qT7 – (q+1)B7 + GT
= (4x87.7) – (4+1) x 359.2 + 1382.5
= -62.7
h. W8 = qT8 – (q+1)B8 + GT
= (4x87.7) – (4+1) x 334.4 + 1382.5
= 61.3
i. W9 = qT9 – (q+1)B9 + GT
= (4x80.2) – (4+1) x 332.3 + 1382.5
= 41.8
Balanced Incomplete Block Design (BIBD)
a. W10 = qT10 – (q+1)B10 + GT
= (4x59.4) – (4+1) x 316.6 + 1382.5
= 37.1
b. W11 = qT11 – (q+1)B11 + GT
= (4x95.8) – (4+1) x 356.2 + 1382.5
= -15.3
c. W12 = qT12 – (q+1)B12 + GT
= (4x95.1) – (4+1) x 362.4 + 1382.5
= -49.1
d. W13 = qT13 – (q+1)B13 + GT
= (4x87.5) – (4+1) x 358 + 1382.5
= -57.5
e. W14 = qT14 – (q+1)B14 + GT
= (4x99.9) – (4+1) x 360.9 + 1382.5
= -22.4
f. W15 = qT15 – (q+1)B15 + GT
= (4x90.8) – (4+1) x 339.5 + 1382.5
= 48.2
g. W16 = qT16 – (q+1)B16 + GT
= (4 x 87.9) – (4+1) x 360.3 + 1382.5
= -67.4
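The sixteen Wj values can be reproduced in a few lines, which also gives a quick check that they sum to zero. This is a minimal sketch using the treatment totals Tj and block sums Bj quoted in the slides; the variable names are illustrative, and the adjusted block sum of squares uses the formula ΣWj²/[q³(q+1)] from the next step.

```python
q = 4  # block size; q x q = 16 treatments

# Treatment totals Tj and block sums Bj as listed in the worked example
T = [80.7, 80.4, 91.3, 91.5, 81.3, 85.3, 87.7, 87.7,
     80.2, 59.4, 95.8, 95.1, 87.5, 99.9, 90.8, 87.9]
B = [329.9, 336.2, 346.2, 361.2, 343.7, 333.0, 359.2, 334.4,
     332.3, 316.6, 356.2, 362.4, 358.0, 360.9, 339.5, 360.3]

GT = sum(T)  # grand total

# Wj = q*Tj - (q + 1)*Bj + GT
W = [q * t - (q + 1) * b + GT for t, b in zip(T, B)]

# Adjusted block sum of squares = sum(Wj^2) / (q^3 * (q + 1))
ss_block_adj = sum(w ** 2 for w in W) / (q ** 3 * (q + 1))

for j, w in enumerate(W, start=1):
    print(f"W{j} = {w:.1f}")
print(f"SS block (adj) = {ss_block_adj:.2f}")
```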
Balanced Incomplete Block Design (BIBD)
1. Compute the sums of squares for the different components. The best way is to start with the
adjusted block sum of squares, because the mean square of the block is an important
component for deciding whether we continue the analysis as a lattice or as an RCBD
after comparing it with the MS of error.
a. SS block (adjusted) $= \dfrac{\sum W_j^2}{q^3(q+1)} = \dfrac{(55.8)^2 + (23.1)^2 + \dots + (-67.4)^2}{4^3(4+1)} = 109.14$
b. MS block $= \dfrac{SS\ block\ (adj)}{q^2-1} = \dfrac{109.14}{4^2-1} = 7.28 = E_b$
c. Correction factor (C.F.) $= \dfrac{(GT)^2}{rq^2} = \dfrac{(1382.5)^2}{5 \times 4^2} = 23891.33$
d. Total SS $= \sum X_{ijk}^2 - C.F. = [(14.9)^2 + (15.2)^2 + \dots + (22.5)^2] - 23891.33 = 566.4$
Balanced Incomplete Block Design (BIBD)
a. SS treatment (unadj.)
SSt (unadj.) $= \dfrac{\sum T_i^2}{r} - C.F. = \dfrac{(80.7)^2 + (80.4)^2 + \dots + (87.9)^2}{5} - 23891.33 = 257.13$
b. SS replication (SSR)
SSR $= \dfrac{\sum R_j^2}{q^2} - C.F. = \dfrac{(258.6)^2 + (271.4)^2 + \dots + (294.5)^2}{4^2} - 23891.33 = 72.71$
c. SS due to error (SSE)
SSE = Total SS – SSt(unadj) – SS block(adj) – SSR
= 566.4 – 257.13 – 109.14 – 72.71
= 127.42
d. Degrees of freedom for error $= (q-1)(q^2-1) = (4-1)(4^2-1) = 45$
e. MSE = SSE/d.f. for error = 127.42/45 = 2.83 = Ee. Once the two statistics (Eb and Ee) are obtained, it
is possible to check whether μ is positive or not. If it is positive, we continue the
analysis as a lattice; if not, as an RCBD.
Balanced Incomplete Block Design (BIBD)
a. $\mu = \dfrac{E_b - E_e}{q^2 E_b} = \dfrac{7.28 - 2.83}{4^2 \times 7.28} = 0.04$. Since μ is
positive, we proceed to adjust the
treatment means as in Table 4.72.
Let T’j = Tj + μWj, where Tj is the unadjusted treatment total
Table 4.72. Computing adjusted treatment means
Treatment   Tj      Bj       Wj       T’j = Tj + μWj   Adjusted mean (T’j/r)
T1          80.7    329.9     55.8    82.93            16.59
T2          80.4    336.2     23.1    81.32            16.26
T3          91.3    346.2     16.7    91.97            18.39
T4          91.5    361.2    -57.5    89.20            17.84
T5          81.3    343.7    -10.8    80.87            16.17
T6          85.3    333.0     58.7    87.65            17.53
T7          87.7    359.2    -62.7    85.19            17.04
T8          87.7    334.4     61.3    90.15            18.03
T9          80.2    332.3     41.8    81.87            16.37
T10         59.4    316.6     37.1    60.88            12.18
T11         95.8    356.2    -15.3    95.19            19.04
T12         95.1    362.4    -49.1    93.14            18.63
T13         87.5    358.0    -57.5    85.20            17.04
T14         99.9    360.9    -22.4    99.00            19.80
T15         90.8    339.5     48.2    92.73            18.55
T16         87.9    360.3    -67.4    85.20            17.04
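The adjustment of treatment totals can be sketched as follows. The code recomputes Wj from the Tj and Bj values of the example, rounds μ to two decimals as the slides do (0.04), and then forms T’j = Tj + μWj and the adjusted means T’j/r. Variable names are illustrative only.

```python
q, r = 4, 5            # block size and number of replications
Eb, Ee = 7.28, 2.83    # block (adjusted) and intra-block error mean squares

T = [80.7, 80.4, 91.3, 91.5, 81.3, 85.3, 87.7, 87.7,
     80.2, 59.4, 95.8, 95.1, 87.5, 99.9, 90.8, 87.9]
B = [329.9, 336.2, 346.2, 361.2, 343.7, 333.0, 359.2, 334.4,
     332.3, 316.6, 356.2, 362.4, 358.0, 360.9, 339.5, 360.3]

GT = sum(T)
W = [q * t - (q + 1) * b + GT for t, b in zip(T, B)]

# Adjustment factor, rounded to two decimals as in the worked example
mu = round((Eb - Ee) / (q ** 2 * Eb), 2)

adj_total = [t + mu * w for t, w in zip(T, W)]
adj_mean = [t / r for t in adj_total]

# Treatment number (1-based) with the highest adjusted mean
best = max(range(len(adj_mean)), key=adj_mean.__getitem__) + 1

for j, (tot, mean) in enumerate(zip(adj_total, adj_mean), start=1):
    print(f"T{j}: adjusted total = {tot:.2f}, adjusted mean = {mean:.2f}")
```

If μ had been zero or negative, no adjustment would be made and the analysis would continue as an RCBD, as stated above.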
Balanced Incomplete Block Design (BIBD)
1. SSt (adjusted) $= \dfrac{\sum T_i'^2}{q+1} - C.F. = \dfrac{(82.93)^2 + (81.32)^2 + \dots + (85.20)^2}{4+1} - 23891.33 = 223.77$
2. MSt = SSt(adj)/(q² – 1) = 223.77/(4² – 1) = 14.92
3. Effective error MS = Ee (1 + qµ) = 2.83(1+4x0.04) = 3.2828
4. F-calculated = MSt/Effective error MS = 14.92/3.2828 = 4.54
5. Finally ANOVA table can be constructed as in Table 4.73
Table 4.73. ANOVA for balanced lattice design
Source of variation   d.f.   SS       MS       F-cal   F-tabulated (0.05)   F-tabulated (0.01)
Replication           4      72.71    18.18
Block (adj)           15     109.14   7.28     2.57*   1.90
Treatment (adj)       15     223.77   14.92    4.54*   1.90
Intra-block error     45     127.42   2.83
Effective error       45              3.2828
*, significant at the 0.05 level of probability
Balanced Incomplete Block Design (BIBD)
CV $= \dfrac{\sqrt{\text{Effective error MS}}}{GM} \times 100 = \dfrac{\sqrt{3.28}}{17.28} \times 100 = 10.48\%$
1. Compute standard errors
SE(m) $= \sqrt{\dfrac{E_e(1+q\mu)}{r}} = \sqrt{\dfrac{2.83(1 + 4 \times 0.04)}{5}} = 0.81$
SE(d) $= \sqrt{\dfrac{2E_e(1+q\mu)}{r}} = \sqrt{\dfrac{2 \times 2.83(1 + 4 \times 0.04)}{5}} = 1.15$
CD/LSD = SE(d) x t0.05 at error degrees of freedom
= 1.15 x 2.019
= 2.32
2. Compute relative efficiency of lattice over RCBD
MSE(RCBD) $= \dfrac{SS\ block\ (adj) + SSE}{d.f._B + d.f._E} = \dfrac{109.14 + 127.42}{15 + 45} = 3.94$
Effective error MS = 3.2828
RE = MSE(RCBD)/Effective error MS = 3.94/3.2828 = 1.20
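The closing statistics of the example can be recomputed from the quantities already obtained. The sketch below takes the rounded inputs quoted in the slides (Ee, μ, the sums of squares and the adjusted treatment mean square) as given; the t value 2.019 is the one used in the slides. Because the slides round SE(d) to 1.15 before multiplying, the LSD printed here differs slightly in the second decimal.

```python
import math

q, r = 4, 5
Ee, mu = 2.83, 0.04                  # intra-block error MS and adjustment factor
ss_block_adj, sse = 109.14, 127.42   # sums of squares from the example
ms_treat_adj = 14.92                 # adjusted treatment mean square
t_value = 2.019                      # tabulated t used in the slides

eff_error_ms = Ee * (1 + q * mu)             # effective error mean square
f_treat = ms_treat_adj / eff_error_ms        # F for adjusted treatments
se_mean = math.sqrt(eff_error_ms / r)        # standard error of a mean
se_diff = math.sqrt(2 * eff_error_ms / r)    # standard error of a difference
lsd = se_diff * t_value                      # least significant difference

df_block = q ** 2 - 1
df_error = (q - 1) * (q ** 2 - 1)
mse_rcbd = (ss_block_adj + sse) / (df_block + df_error)
rel_eff = mse_rcbd / eff_error_ms            # relative efficiency over RCBD

print(f"Effective error MS = {eff_error_ms:.4f}")
print(f"F = {f_treat:.2f}, SE(m) = {se_mean:.2f}, SE(d) = {se_diff:.2f}")
print(f"LSD = {lsd:.2f}, MSE(RCBD) = {mse_rcbd:.2f}, RE = {rel_eff:.2f}")
```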
Interpretation
 Treatment effect was found to be significant
 T14 had the highest grain yield (19.80 t/ha) followed by
T11 and T12 (but statistically these three treatments
did not show differences among themselves)
 There was also a significant block effect implying that
blocking helped in reducing experimental error
 The relative efficiency of 1.20 indicates that the use of
lattice design instead of RCBD improved precision by
20%
Partially Balanced Design
 Characteristics of partially balanced design:
 All treatments do not occur together in the same block
 The number of replications is not restricted
 Number of treatment must be a perfect square
Differences in genotype frequencies
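• Therefore, the genotype frequencies over all lines differ from the initial ones by the terms in $\sigma_q^2$:
AA : $\overline{p^2} = \bar{p}^2 + \sigma_q^2$
Aa : $H = 2\bar{p}\bar{q} - 2\sigma_q^2$
aa : $\overline{q^2} = \bar{q}^2 + \sigma_q^2$
= Wahlund’s Formula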
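A small numerical sketch makes the effect of subdivision visible: both homozygotes gain by the variance of gene frequency among lines and the heterozygotes lose twice that amount, so the three frequencies still sum to one. The mean frequency 0.5 and variance 0.05 are hypothetical values.

```python
def wahlund(q_bar, var_q):
    """Genotype frequencies over all lines, given the mean and variance of q among lines."""
    p_bar = 1 - q_bar
    return {
        "AA": p_bar ** 2 + var_q,
        "Aa": 2 * p_bar * q_bar - 2 * var_q,
        "aa": q_bar ** 2 + var_q,
    }

freqs = wahlund(q_bar=0.5, var_q=0.05)   # hypothetical values
print(freqs, sum(freqs.values()))
```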
Inbreeding
• Inbreeding-breeding together of individuals
more closely related than mates chosen at
random from a population (mating of
relatives)
• Inbreeding coefficient – probability of any
individual (diploid individual) being an
identical homozygote.
• probability that the 2 genes of a random
member of a pop. are identical by descent
Inbreeding
A B C D
E F
G
Inbreeding
a a - allele
-identical by descent
• When we have more genes/genotypes that are identical
by descent, then the higher is the chance of inbreeding
incidence in the population. The incidence of inbreeding
is measured by coefficient of inbreeding (F)
aa
Inbreeding
$F = \dfrac{1}{2} \sum_{i=1}^{m} \left(\dfrac{1}{2}\right)^{n} (1 + F_A)$
F = inbreeding coefficient of an individual or a population of
individuals
n = no. of generations separating the male and female
parent through the common ancestor.
m = no. of pathways to get the individual through a common
ancestor
FA = coefficient of inbreeding of the common ancestor of the seed
and pollen parent.
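The formula can be applied path by path. In the sketch below each pathway is described by a pair (n, FA); the two example pedigrees (half-sib and full-sib mating with non-inbred common ancestors, so n = 2 and FA = 0 on every path) are assumptions chosen because their inbreeding coefficients are well known.

```python
def inbreeding_coefficient(paths):
    """F = 1/2 * sum over pathways of (1/2)^n * (1 + F_A).

    paths: list of (n, f_a) tuples, one per pathway through a common ancestor,
    where n is the number of generations separating the two parents through
    that ancestor and f_a is the ancestor's own inbreeding coefficient.
    """
    return 0.5 * sum(0.5 ** n * (1 + f_a) for n, f_a in paths)

# Offspring of half sibs: one common ancestor, one generation on each side (n = 2)
half_sib_mating = inbreeding_coefficient([(2, 0.0)])

# Offspring of full sibs: two common ancestors, n = 2 on each pathway
full_sib_mating = inbreeding_coefficient([(2, 0.0), (2, 0.0)])

print(half_sib_mating, full_sib_mating)
```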
In any breeding program,
population mean refers to both phenotypic and
genotypic values, because we consider the
environmental deviation = 0 in the population.
To assume this, we have to have a good control
of the environment, i.e. by growing or raising all
individuals in the population in the area with no
difference in the environmental influence.
Alpha Lattice Design
• Alpha-lattice designs are replicated designs that divide each
replicate into incomplete blocks that contain a fraction of the
total number of entries
• They bridge the gap between RCBD and lattice designs
• The number of treatments need not be a
perfect square
• Genotypes are distributed among the blocks so that all pairs
occur in the same incomplete block in nearly equal frequency
Alpha lattice design
• Suppose a researcher is interested in evaluating 20 genotypes in
alpha (0,1)-lattice design. Then, there will be t = 20, q = 4 and
number of treatments in each block = 5
Alpha Lattice Design
Used when we have a large number of genotypes and a small area
There are no check varieties for estimating error
It reduces the effect of within-complete-block variation
It can also provide repeatability, particularly in trials
It maximizes the use of comparisons between genotypes in the
same incomplete block
Advantages of alpha lattice design
 It allows the adjustment of treatment means for block effects
 The small incomplete blocks create homogeneous comparisons
 It provides effective control within replicate variability
Augmented Design
 Augmented designs also use grids or incomplete blocks to
remove some field variation from the plot residuals
 In an augmented design, a large set of experimental lines is
divided into small incomplete blocks
 In each incomplete block, a set of checks is included; every
check occurs in each incomplete block
Augmented Design
 Because the design is unreplicated, the repeated checks are
used to estimate the error mean square and the block effect
 The block effect is estimated from the repeated check means
and then removed from the means of the test varieties
 This reduces error and increases precision somewhat
Augmented Design
Augmented Designs
Developed by Federer (1956)
Used to test a large number of lines in a limited area
Used when other designs are not appropriate due to large
number of entries
In augmented designs the goal is to compare existing (control)
treatments with new treatments under the experimental
constraint of limited replication and resources
Augmented Design
Experimental lines replicated once
Checks occur in each block
Checks used to estimate block effects
Checks provide error term
Difficult to maintain homogeneous blocks when comparing
Flexible – blocks can be of unequal size
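The use of repeated checks to remove block effects from unreplicated lines can be sketched with a toy data set. All yields below are hypothetical, and the adjustment shown (block check mean minus overall check mean) is the simplest version of the idea, not a full augmented-design analysis.

```python
# Hypothetical yields: two checks (C1, C2) repeated in every block,
# new lines (L1..L4) grown once only.
blocks = {
    1: {"checks": {"C1": 5.0, "C2": 6.0}, "lines": {"L1": 5.5, "L2": 6.5}},
    2: {"checks": {"C1": 6.0, "C2": 7.0}, "lines": {"L3": 6.5, "L4": 7.5}},
}

# Overall mean of all check plots
check_values = [v for b in blocks.values() for v in b["checks"].values()]
overall_check_mean = sum(check_values) / len(check_values)

adjusted = {}
for b in blocks.values():
    # Block effect estimated from the checks grown in that block
    block_effect = sum(b["checks"].values()) / len(b["checks"]) - overall_check_mean
    for name, observed in b["lines"].items():
        adjusted[name] = observed - block_effect

print(adjusted)
```

In this toy example block 2 is the more productive block, so its lines are adjusted downwards and those of block 1 upwards.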
Disadvantage of Augmented Design
Considerable resources are spent on production and
processing of control plots
Relatively few degrees of freedom for experimental
error, which reduces the power to detect differences
among treatments
Un-replicated experiments are inherently imprecise, no
matter how sophisticated the design
Analysis of Quantitative Traits
Consider one locus with two alleles, A1 and A2
Genotypes: A2A2   (mid-point)   A1A2   A1A1
Value:      -a        0           d      +a
Allele A1 is the allele that increases the value.
d : depends on the degree of dominance
d = 0, no dominance
d positive, A1 dominant over A2 (A1 > A2)
Complete dominance, d = +a or d = -a
Over-dominance, d > +a or d < -a
Degree of dominance = d/a
Population Mean
In a population, population mean is product of the above genotypic
values, after the effects of all loci controlling the trait is combined. i.e.
when every genotypic value is multiplied with its frequency, and then
total for all three genotypes taken.
Genotype Frequency Value Frequency x Value
A1A1 p2 +a p2a
A1A2 2pq d 2pqd
A2A2 q2 -a -q2a
Population Mean = a(p - q) + 2dpq
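The closed form can be checked against the frequency x value table directly. The values p = 0.6, a = 10 and d = 4 are hypothetical and serve only to show that both routes give the same mean.

```python
def population_mean(p, a, d):
    """M = a(p - q) + 2dpq for one locus with two alleles (q = 1 - p)."""
    q = 1 - p
    return a * (p - q) + 2 * d * p * q

def mean_from_table(p, a, d):
    """Sum of frequency x value over the genotypes A1A1, A1A2 and A2A2."""
    q = 1 - p
    return p ** 2 * a + 2 * p * q * d + q ** 2 * (-a)

p, a, d = 0.6, 10.0, 4.0   # hypothetical values
print(population_mean(p, a, d), mean_from_table(p, a, d))
```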
Population Mean
Population Mean:
M = a(p – q) + 2dpq
where a(p – q) is produced by the homozygotes and 2dpq by the heterozygotes
Population Mean
• The value results from the combined contribution of genes at
all (many) loci; the combined effect gives the population mean.
• Assuming here that the loci all combine
additively,
M = Σa(p - q) + 2Σdpq
Average Effect
• Average effect of a gene is the average deviation from the
population mean, of individuals receiving one gene from one parent,
while the other is received at random from the population.
• Average effect of a gene substitution - effect on the population
mean, when a gene is substituted with another with a different form.
ie. A1 → A2 A1A1 → A1A2
A2 → A1 A1A2 → A1A1
1 locus, 2 alleles,
Frequency of A1 = p
Frequency of A2 = q
Average Effect
• Average effect of gene A1 = $\alpha_1$
• If a gamete carrying A1 combines with
gametes at random in the population, the
genotype frequencies resulting would be,
A1A1 = p
A1A2 = q
Genotypic value for A1A1 = a
Genotypic value for A1A2 = d
Mean for both = pa + qd
Average Effect
• The difference between this mean value and the population mean is the
average effect of gene A1.
$\alpha_1 = pa + qd - [a(p - q) + 2dpq]$
$\alpha_1 = q[a + d(q - p)]$
For gene A2, if A2 combines with gametes taken at random, the genotype frequencies are
A1A2 = p
A2A2 = q
$\alpha_2 = -p[a + d(q - p)]$
Changing A1A2 → A1A1 changes the value from d to +a
⇒ effect = (a - d)
Average Effect
• Changing A2A2 → A1A2 changes the value from -a to d
⇒ effect = (d + a)
Average effect of gene substitution:
$\alpha = p(a - d) + q(d + a)$
Computation: pa - pd + qd + qa
= a(p + q) + d(q - p)
= a + d(q - p)
∴ $\alpha = a + d(q - p)$
Relation with $\alpha_1$ and $\alpha_2$:
$\alpha = \alpha_1 - \alpha_2$
Breeding Value
• The value of an individual as judged by the
mean value of the progenies is called the
breeding value.
• Breeding value can be measured by the
value of deviation from the population
mean.
Breeding Value
population
X
A
• If a certain individual is mated with a group of individuals at random
from a population,
• Breeding value = 2 x the average deviation of the progenies from
the population mean.
• In the context of average effects, the breeding value of an individual
= total average effects of the genes it carries, summed up for all
pairs of genes (alleles) at every locus, for all loci involved.
Breeding Value
• Considering one locus, the breeding value
for the genotypes:
B.V. A1A1 = $2\alpha_1 = 2q\alpha$
B.V. A1A2 = $\alpha_1 + \alpha_2 = (q - p)\alpha$
B.V. A2A2 = $2\alpha_2 = -2p\alpha$
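Average effects and breeding values for one locus can be computed together. The sketch assumes the standard single-locus results α = a + d(q − p), α1 = qα and α2 = −pα, and uses hypothetical values p = 0.6, a = 10, d = 4. It also confirms that the breeding values, weighted by genotype frequency, average to zero.

```python
def average_effects(p, a, d):
    """Return (alpha, alpha1, alpha2) for one locus with allele frequencies p and q = 1 - p."""
    q = 1 - p
    alpha = a + d * (q - p)   # average effect of gene substitution
    alpha1 = q * alpha        # average effect of A1
    alpha2 = -p * alpha       # average effect of A2
    return alpha, alpha1, alpha2

def breeding_values(p, a, d):
    """Breeding value of each genotype as the sum of the average effects of its genes."""
    _, alpha1, alpha2 = average_effects(p, a, d)
    return {"A1A1": 2 * alpha1, "A1A2": alpha1 + alpha2, "A2A2": 2 * alpha2}

p, a, d = 0.6, 10.0, 4.0   # hypothetical values
q = 1 - p
alpha, alpha1, alpha2 = average_effects(p, a, d)
bv = breeding_values(p, a, d)

# Frequency-weighted mean breeding value (expected to be zero)
mean_bv = p ** 2 * bv["A1A1"] + 2 * p * q * bv["A1A2"] + q ** 2 * bv["A2A2"]

print(alpha, alpha1, alpha2)
print(bv, mean_bv)
```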
Values of progenies
Breeding Value
Genotype   No. of A1 genes   Frequency   Arbitrary value   Breeding value
A2A2       0                 q2          -a                -2pα
A1A2       1                 2pq         d                 (q-p)α
A1A1       2                 p2          +a                2qα
• So far we have discussed only one component of the
genotypic value, i.e. the additive effects (breeding value):
G = A + D + I
G = genotypic value
A = breeding value
D = dominance deviation
I = interaction deviation
• For one locus only:
G = A + D
Breeding Value
Dominance Deviation
 Dominance Deviation is a function of d
d = 0, DD = 0
 all genes are additive in nature
Interaction Deviation (I)
G = A + D + I
• When more than one locus is involved and I ≠ 0, there is interaction between loci
contributing to the genotypic value. It is called epistasis. I
is also called the epistatic deviation.
• If I = 0, the genes are said to act additively among loci.
• If 1 locus involved, additive action means the absence of
dominance.
• If more than 1 locus involved, additive action means absence of
epistasis.
6. MATING DESIGNS AND ESTIMATION
OF GENETIC PARAMETERS
• Heterosis = hybrid vigor
– the superiority of F1 over its parents
– Positive traits
– Negative traits
Heterosis cont.
• Superiority of the cross over its
parents
– A/a locus, AA x aa = Aa, position
– Dominance √
– Epistasis √
– Additive X
AA MP aa
Aa
Aa Aa
Heterosis cont.
• Average parent,
MPH = 100(F1-MP)/MP
• Better parent,
BPH = 100(F1-BP)/BP
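Both heterosis measures are simple percentages, as the sketch below shows. The parental and F1 values are hypothetical, and the `higher_is_better` switch is an assumption added here to handle traits where a lower value is desirable (for example days to maturity), in which case the better parent is the one with the smaller value.

```python
def heterosis(f1, p1, p2, higher_is_better=True):
    """Return (MPH, BPH) in percent for an F1 and its two parents."""
    mid_parent = (p1 + p2) / 2
    better_parent = max(p1, p2) if higher_is_better else min(p1, p2)
    mph = 100 * (f1 - mid_parent) / mid_parent
    bph = 100 * (f1 - better_parent) / better_parent
    return mph, bph

# Hypothetical yields (t/ha): parents 4.0 and 6.0, hybrid 7.0
mph, bph = heterosis(f1=7.0, p1=4.0, p2=6.0)
print(f"MPH = {mph:.1f}%, BPH = {bph:.1f}%")
```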
Heterosis cont.
• What is better parent?
• For which traits
• Positive traits:
• Negative traits:
• Measurement and score:
Heterosis cont.
• What is the basis for heterosis?
– Theory of dominance
– Theory of over-dominance
– Theory of physiological enhancement
Heterosis cont.
• Heterotic patterns
• Theory of testers
– Broad-base
– Narrow-base
Combining Ability
• General combining ability
• Average performance,
• Additive genetic variance, $\sigma^2_A$
• Specific combining ability
• Specific combinations,
• Non-additive genetic variance, $\sigma^2_I$
DIALLEL MATING DESIGN
Introduction
Diallel cross - mating design where all possible
crosses are made on an individual or
population (inbred, variety) to obtain all
possible combinations.
- Complete diallel
- Partial diallel (half-diallel)
DIALLEL MATING DESIGN
Example: n inbred lines, therefore
n x n = n2 combinations. Components (for n = 7):
n parents = 7
½ n (n-1) crosses = 21
½ n (n-1) reciprocals = 21
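The size of a complete diallel grows with the square of the number of parents, which a short helper makes easy to tabulate. The function name is illustrative.

```python
def diallel_components(n):
    """Number of parents, direct crosses and reciprocal crosses in a complete n x n diallel."""
    parents = n
    crosses = n * (n - 1) // 2
    reciprocals = n * (n - 1) // 2
    return parents, crosses, reciprocals

for n in (4, 7, 10):
    parents, crosses, reciprocals = diallel_components(n)
    print(n, parents, crosses, reciprocals, parents + crosses + reciprocals)
```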
DIALLEL ANALYSIS
• A diallel is difficult to construct but useful
for obtaining genetic information from
populations.
• Started by Sprague and Tatum (1942)
• Uses of diallel (Hayward, 1979):
Application of the diallel cross to
outbreeding crop species.
USES OF DIALLEL ANALYSIS
1. Strategic survey on populations
as initial breeding materials in
breeding programmes.
– observe variance components
– observe genetic variability
– estimate heritability
- Use top–cross
USES OF DIALLEL ANALYSIS
2. Tactical assessment on genetic
relationships among selected elite
genotypes.
– Selection can be done to select parents
that have good combining ability.
– Inbred lines have to be first developed and
then tested.
USES OF DIALLEL ANALYSIS
Examples: - Evaluate SCA, GCA on hybrid
combinations among inbred lines.
Methods of estimating GCA and SCA:
1. Diallel Analysis
2. Mating Designs I, II, III
3. Test cross performance
- top cross, inbreds, hybrids, full/half-sibs.
4. Self-progeny performance.
Methods of Diallel Analysis
Many methods have been proposed
including,
a) Hayman's (1954)
b) Griffing's (1956) - 4 Methods
c) Gardner and Eberhart's (1966)
- Analysis II, Analysis III.
- Focus on Griffing's Methods
(Ref: Issues of Diallel Analysis by Baker, 1978)
Assumptions used in diallel analysis
a. Diploid segregation of the individuals involved.
b. Homozygous parents.
c. No reciprocal differences.
d. No epistasis.
e. No multiple alleles.
f. Uncorrelated gene distribution in the two parents
a) Example of strategic survey on populations: To determine the features of a trait in
terms of its genetic components, regression of Wr against Vr is used.
Let's say we test five parents, A, B, C, D and E, which were entered into a diallel
to form progenies.
[Figure: Wr plotted against Vr for parents A to E. The points lie within the limiting parabola Wr2 = Vr x Vp; the position of the regression line indicates partial dominance, complete dominance or over-dominance.]
Wr = covariance of progenies on parents
Vr = variance of progenies on parents.
Diallel Analysis
• From the Wr - Vr graph above:
- The line that passes through the origin (0)
shows complete
dominance as the main feature of the
control of the trait concerned.
Diallel Analysis cont.
a. Above origin - partial dominance.
b. Below origin - over dominance.
c. The larger the Vr value, the
higher is the interloci interaction.
Diallel Analysis cont.
d. E.g. A and E are more different
from each other genetically
because their points are far apart
on the graph, as compared to e.g.
D and E.
e. All points within the parabola -
parabola limits the values of the
coordinates.
Diallel Analysis cont.
f. If points close to the origin
- more dominant genes.
g. If far from origin
- more recessive genes.
Example, 7 x 7 half diallel:
Parent        1        2        3        4        5        6        7
1        37.250   38.500   38.375   39.500   37.375   38.125   38.375
2                 30.500   32.125   32.750   34.875   38.750   32.625
3                          31.000   32.625   34.875   39.000   35.125
4                                   32.250   36.375   37.500   35.375
5                                            35.250   38.875   35.625
6                                                     38.500   38.625
7                                                              34.250
Diallel Analysis
Correction factor (C.F.) $= \dfrac{(\sum \text{one-parent cross value})^2}{n} = \dfrac{(37.250 + 30.500 + \dots + 34.250)^2}{7} = 8160.143$
Variance:
Vp (phenotypic variance of population):
$V_p = \dfrac{1}{n-1}\left[\sum (\text{one-parent cross value})^2 - C.F.\right] = \dfrac{1}{6}\left[37.250^2 + 30.500^2 + \dots + 34.250^2 - 8160.143\right] = 9.4345$
Correction factor = (Grand total)²/Total number of observations
$= \dfrac{(23231.82)^2}{4 \times 64} = 2108271.3301$
Total S.S. $= (104.86)^2 + (88.66)^2 + \dots + (81.48)^2 - C.F. = 127712.5000$
Treatments S.S. $= \dfrac{(342.58)^2 + (348.05)^2 + \dots + (328.00)^2}{4} - C.F. = 104924.1604$
Replication S.S. $= \dfrac{(5811.48)^2 + \dots + (5951.34)^2}{64} - C.F. = 1037.0241$
Error S.S. = Total S.S. - Treatment S.S. - Replication S.S.
= 21751.3155
Diallel Analysis
$V_i = \dfrac{1}{n-1}\left\{\sum (\text{value of each cross to } i)^2 - \dfrac{(\sum \text{value of all crosses to } i)^2}{n}\right\}$
$V_1 = \dfrac{1}{6}\left\{(37.250^2 + 38.500^2 + \dots + 38.375^2) - \dfrac{(37.250 + 38.500 + \dots + 38.375)^2}{7}\right\} = 0.57143$
$V_2 = \dfrac{1}{6}\left\{(38.500^2 + 30.500^2 + \dots + 32.625^2) - \dfrac{(38.500 + 30.500 + \dots + 32.625)^2}{7}\right\} = 10.35863$
V3 = 9.47098
V4 = 7.75446
V5 = 2.21801
V6 = 0.26786
V7 = 4.61830.
Diallel Analysis
Covariance:
$W_i = \dfrac{1}{n-1}\left\{\sum (\text{cross of each parent with } i \times \text{one-parent cross value of that parent}) - \dfrac{(\text{total of all crosses to } i)(\text{total of all one-parent crosses})}{n}\right\}$
$W_1 = \dfrac{1}{6}\left\{[(37.250 \times 37.250) + (38.500 \times 30.500) + \dots + (38.375 \times 34.250)] - \dfrac{(37.250 + 38.500 + \dots + 38.375)(37.250 + 30.500 + \dots + 34.250)}{7}\right\} = -1.37946$
W2 = 9.41815
W3 = 9.22173
W4 = 7.88373
W5 = 3.30878
W6 = 0.22098
W7 = 5.74033
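The array variances Vr and covariances Wr can be reproduced from the 7 x 7 half-diallel table. The sketch fills the symmetric table from its upper triangle (reciprocal crosses are assumed equal, as in a half diallel), takes the diagonal as the parental values, and applies the variance and covariance formulas given above. Helper names are illustrative.

```python
# Upper triangle of the 7 x 7 half-diallel table (row i starts at column i)
upper = [
    [37.250, 38.500, 38.375, 39.500, 37.375, 38.125, 38.375],
    [30.500, 32.125, 32.750, 34.875, 38.750, 32.625],
    [31.000, 32.625, 34.875, 39.000, 35.125],
    [32.250, 36.375, 37.500, 35.375],
    [35.250, 38.875, 35.625],
    [38.500, 38.625],
    [34.250],
]
n = len(upper)

# Build the full symmetric table
x = [[0.0] * n for _ in range(n)]
for i, row in enumerate(upper):
    for k, value in enumerate(row):
        j = i + k
        x[i][j] = x[j][i] = value

parents = [x[i][i] for i in range(n)]   # one-parent (diagonal) values

def variance(values):
    m = len(values)
    return (sum(v * v for v in values) - sum(values) ** 2 / m) / (m - 1)

def covariance(a, b):
    m = len(a)
    return (sum(u * v for u, v in zip(a, b)) - sum(a) * sum(b) / m) / (m - 1)

Vp = variance(parents)                               # variance of the parents
Vr = [variance(x[i]) for i in range(n)]              # variance of each array
Wr = [covariance(x[i], parents) for i in range(n)]   # array-parent covariance

for i in range(n):
    print(f"Parent {i + 1}: Vr = {Vr[i]:.5f}, Wr = {Wr[i]:.5f}")
```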
Finally, the graph of Wr vs. Vr can be constructed:
[Figure: Wr (vertical axis) plotted against Vr (horizontal axis), both scaled from 0 to 10, with one point for each of parents 1 to 7.]
Diallel Analysis
• Deductions from the graph:
– Points close to each other, parents are similar.
– Points far from each other, parents are different.
– Generally, this trait is controlled by genes with complete
dominance.
– Example, 1, 6 carry more dominant genes, 2, 3, 4 carry
more recessive genes.
– This analysis is called graphical analysis of a diallel cross.
– Convenient with the use of computers.
Diallel Analysis
b. Example of the Use of Diallel Analysis for Tactical Assessment
• To test the GCA and SCA for certain hybrid combinations.
– GCA – to determine the average performance of lines/inbreds in
hybrid combinations.
– SCA – to compare performance of one cross with the other crosses.
i.e. is it better or worse than the average performance
of all crosses.
• Example: A x B; A x C; A x D; A x E; A x F.
Average for A crosses = ?
Compare with A x F, for example to determine SCA(A x F)
• Griffing’s Method (1956) : For n2 diallel table
‘v’ genotype ‘b’ block ‘c’ individuals/plot
Diallel Analysis
• Observation on performance:
$X_{ijkl} = \mu + \nu_{ij} + b_k + (b\nu)_{ijk} + e_{ijkl}$
μ = overall population mean
νij = genotype effect
bk = k-th block effect
(bν)ijk = block and genotype interaction effect
eijkl = experimental error
• then use analysis of variance to look at significance of differences.
• genotypes were normally chosen for specific goals, i.e. hybrids, etc.
F = MSv / MSe
• - if the effect of genotypes is significant, look at the components of M.S., to
determine GCA, SCA and other effects.
Diallel Analysis
• Values were given for each effect. The break-down of the genotype
effects are as follows:
$X_{ij} = \mu + g_i + g_j + s_{ij} + r_{ij} + \dfrac{1}{bc}\sum\sum e_{ijkl}$
g = GCA (i and j = parents)
s = SCA
r = reciprocal effects
b = no. of blocks
c = no. of individuals
e = effects of environmental factors
μ = overall mean
• Analysis is limited to the following conditions:
– sij = sji, Σ gi = 0
– rij = -rji, Σ sij = 0
Diallel Analysis
ANOVA table
Source       df         SS   MS    EMS (Fixed Model)                                  EMS (Random Model)
GCA          n-1        Sg   Mg    $\sigma^2 + \dfrac{2n}{n-1}\sum g_i^2$             $\sigma^2 + \dfrac{2(n-1)}{n}\sigma_s^2 + 2n\sigma_g^2$
SCA          n(n-1)/2   Ss   Ms    $\sigma^2 + \dfrac{2}{n(n-1)}\sum\sum s_{ij}^2$    $\sigma^2 + \dfrac{2(n^2-n+1)}{n^2}\sigma_s^2$
Reciprocal   n(n-1)/2   Sr   Mr    $\sigma^2 + \dfrac{4}{n(n-1)}\sum\sum r_{ij}^2$    $\sigma^2 + 2\sigma_r^2$
Error                   Se   Me’   $\sigma^2$                                         $\sigma^2$
Me’ = Me (MSe)/r, where r = number of replications (observations)
Diallel Analysis
• To get more detailed breakdown of the combinations:
$g_i = \dfrac{1}{2n}(X_{i.} + X_{.i}) - \dfrac{X_{..}}{n^2}$
$s_{ij} = \dfrac{1}{2}(X_{ij} + X_{ji}) - \dfrac{1}{2n}(X_{i.} + X_{.i} + X_{j.} + X_{.j}) + \dfrac{X_{..}}{n^2}$
$r_{ij} = \dfrac{1}{2}(X_{ij} - X_{ji})$
• References:
– Biometrical Genetics (Mather and Jinks)- Diallel.
Example of a complete 7 x 7 diallel cross, in a tactical assessment involving
4 replications in RCBD:
1 2 3 4 5 6 7
1 37.25 38.50 38.25 40.00 35.75 38.75 38.25
2 39.00 30.50 32.75 32.00 35.50 39.25 32.73
3 38.50 31.50 31.00 32.75 34.75 38.50 34.75
4 39.00 33.50 32.50 32.25 36.25 35.75 35.25
5 39.00 34.25 35.00 36.50 35.25 39.25 36.25
6 37.50 38.25 39.50 39.25 38.50 38.50 39.50
7 38.50 32.50 35.50 35.50 35.20 37.75 34.25
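GCA, SCA and reciprocal effects can be estimated from a complete diallel table such as the one above. The sketch assumes the usual estimators for a full n x n table: gi from the row and column totals of parent i, sij from the mean of a cross and its reciprocal, and rij as half the difference between a cross and its reciprocal (which satisfies rij = -rji). Because the table is used exactly as transcribed here, the printed estimates may differ slightly in the second decimal from those listed in the slides.

```python
# Complete 7 x 7 diallel table (rows = one parent, columns = the other)
table = [
    [37.25, 38.50, 38.25, 40.00, 35.75, 38.75, 38.25],
    [39.00, 30.50, 32.75, 32.00, 35.50, 39.25, 32.73],
    [38.50, 31.50, 31.00, 32.75, 34.75, 38.50, 34.75],
    [39.00, 33.50, 32.50, 32.25, 36.25, 35.75, 35.25],
    [39.00, 34.25, 35.00, 36.50, 35.25, 39.25, 36.25],
    [37.50, 38.25, 39.50, 39.25, 38.50, 38.50, 39.50],
    [38.50, 32.50, 35.50, 35.50, 35.20, 37.75, 34.25],
]
n = len(table)

row = [sum(table[i]) for i in range(n)]                          # X_i.
col = [sum(table[i][j] for i in range(n)) for j in range(n)]     # X_.i
grand = sum(row)                                                 # X_..

# General combining ability effects
g = [(row[i] + col[i]) / (2 * n) - grand / n ** 2 for i in range(n)]

# Specific combining ability effects
s = [[0.5 * (table[i][j] + table[j][i])
      - (row[i] + col[i] + row[j] + col[j]) / (2 * n)
      + grand / n ** 2
      for j in range(n)] for i in range(n)]

# Reciprocal effects
r = [[0.5 * (table[i][j] - table[j][i]) for j in range(n)] for i in range(n)]

for i in range(n):
    print(f"g{i + 1} = {g[i]:.4f}")
```

Parents with large positive gi (here parents 1 and 6) raise the value of their crosses on average, and parents with large negative gi lower it.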
Example of a complete 7 x 7 diallel cross
ANOVA table:
_______________________________________________________
Source df SS MS F
_______________________________________________________
Reps (Blks) 3 19.1875 6.3958 2.6846
Genotypes 48 1380.1250 28.7526 12.0689**
Error 144 343.0625 2.3824
_______________________________________________________
Total 195 1742.3750
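Worksheet 1: the 7 x 7 table of genotype values Yij (rows i and columns j = 1 to 7), with row totals Yi., column totals Y.i and the grand total GT.
Worksheet 2: for each parent i = 1 to 7, tabulate Yi., Y.i and their sum Yi = (Yi. + Y.i); the totals give GT.
$SS_{GCA} = \dfrac{1}{2n}\sum (Y_{i.} + Y_{.i})^2 - \dfrac{2}{n^2}(GT)^2$
Worksheet 3: for each cross, tabulate Yij, (Yij + Yji) and the product Yij(Yij + Yji).
$SS_{SCA} = \dfrac{1}{2}\sum\sum Y_{ij}(Y_{ij} + Y_{ji}) - \dfrac{1}{2n}\sum (Y_{i.} + Y_{.i})^2 + \dfrac{1}{n^2}(GT)^2$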
Testing for significance
MSe for testing GCA & SCA = MSE/r
Possible outcomes:
Case 1: MSGCA ** and MSSCA **
Case 2: MSGCA ns and MSSCA **
Case 3: MSGCA ** and MSSCA ns
Example of a complete 7 x 7 diallel cross
Breakdown of Genotype effects:
Source            d.f.   MS        F
Genotypes         (48)
  GCA             6      37.8310   63.518*
  SCA             21     4.6701    7.841*
  Residual        21     0.9509    1.597
Error             144    MSe’ = MSe/4 = 2.3824/4 = 0.5956
Example of a complete 7 x 7 diallel cross
• The significant variation among genotypes is caused by GCA and SCA
effects. GCA makes the larger contribution to the genotype differences.
Genotypic variance is mainly due to additive gene action, with a small
amount of non-additive gene action.
When further subdivided:
g1 = 2.0969
g2 = -1.8138
g3 = -1.3852
g4 = -0.9209
g5 = 0.0612
g6 = 2.3648
g7 = -0.4031
Σ gi = 0
1 2 3 4 5 6 7
1 -3.062
2 2.0995 -1.9898
3 1.5459 -0.7934 -2.3469
4 2.2066 -0.6327 -1.8620 2.0255
5 -0.9005 0.5102 0.0816 1.1173 -0.9898
6 -2.4541 2.0816 1.9031 -0.0612 0.3316 -2.3469
7 0.5638 -1.2755 0.7959 0.5816 -0.1505 0.5459 -1.0612
Conclusion in Diallel Analysis
Therefore, when selecting for traits where
small values are desirable, e.g. earliness,
choose parents with high negative values,
whereas for traits where high values are
favoured, e.g. yield, choose parents with
high positive values.
North Carolina Mating Designs
• Mating designs are normally termed as the
North Carolina Design, because they were
first introduced by the North Carolina State
University, USA (by Comstock,
Cockerham and Robinson).
North Carolina Mating Designs
• Mating designs are designs used in
cyclic selection schemes, where
progenies and families are created, and
then used for the purpose of:
– estimation of genetic components in the
control of a trait, calculation of gain from
selection, and development of new
populations.
North Carolina Mating Designs
• There are many kinds and variations as
well as modifications of the designs, as
proposed. However, in principle, they
are categorised as follows:
Design I
Uses:
• to estimate genetic components of variance
• to estimate degree of dominance
• to calculate gain from selection.
Design-I
• Design I is a nested design, where every
male is mated to a number of females in a
set. This is done in Season I, i.e. at the
mating nursery stage.
Season I
From the base population:
Male Female  4 half-sib families (HS)
1 x x x x (HS) formed
2 x x x x
 4 HS families
3 x x x x x 4=16 HS families
4 x x x x
____________________________________________
. . . . .
. . . . .
n . . . .
Season 2:
After the half-sib and full-sib families have been formed from
the crosses in Season I, the progenies are
evaluated for performance in Season 2, following the
sib identities.
Example:
Set-I: 16 HS families + 2 check varieties
= 18 x 2 rep
9 entries/block:
Example: Block No. 1 Block No. 2 ............., n
II 36 -------------- 28
19 -------------- 27
I 18 -------------- 10
1 -------------- 9
Stages in the cyclic selection schemes:
Yield Trial on Progenies
From Crosses
(Season 2)
(Season 5) Data Collection
Estimation of Predicted Gain from
Selection, from Yield Trial Data Analysis
h2 estimation
Estimation of
Variance components Selection
Estimation of
Mating Nursery degree of dominance
- Formation of families
(Season I) Recombination
(Season 4) (Season 3)
ANOVA – DESIGN I
For 1 Block:
Source        d.f.                 MS    EMS
Rep (2)       r-1 = 1
Male (4)      m-1 = 3              M1    $\sigma^2_e + r\sigma^2_{f/m} + rf\sigma^2_m$
Female/Male   m(f-1) = 12          M2    $\sigma^2_e + r\sigma^2_{f/m}$
M x F         (m-1)(r-1) = 3
B/M x F       m(f-1)(r-1) = 12     M3    $\sigma^2_e$
Total         rmf-1 = 31
DESIGN I
Calculation of Heritability:
$\sigma^2_e = M_3$
$\sigma^2_{f/m} = (M_2 - M_3)/r$
$\sigma^2_m = (M_1 - M_2)/rf$
$\sigma^2_T = \sigma^2_m + \sigma^2_{f/m} + \sigma^2_e$
$\sigma^2_m$ = covariance of half-sibs
= 1/4 VA (Falconer)
DESIGN I
2
m = 1/4 2
A
2
A = 42
m
2
f/m = (M2 - M3)/ r
= 1/4 VT
2
f/m = Cov. FS - Cov. paternal half sibs
= 1/2 2
A + 1/4 2
D - 1/4 2
A
= 1/4 2
A + 1/4 2
D
= 1/4 (2
A + 2
D )
42
f/m = 2
A + 2
D
2
D = 42
f/m - 42
m
= 4(2
f/m - 2
m )
DESIGN I
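$h^2_{(m)} = \sigma^2_A / \sigma^2_T = 4\sigma^2_m / \sigma^2_T = h^2_N$
$h^2_{(f)} = (\sigma^2_A + \sigma^2_D) / \sigma^2_T = 4\sigma^2_{f/m} / \sigma^2_T = h^2_B$
$h^2_{(m+f)} = 2(\sigma^2_m + \sigma^2_{f/m}) / \sigma^2_T$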
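The chain from mean squares to heritability in Design I can be followed step by step in code. The mean squares below (M1 = 44, M2 = 28, M3 = 20) are hypothetical values chosen for illustration; r = 2 replications and f = 4 females per male match the example layout.

```python
def design1_components(M1, M2, M3, r, f):
    """Variance components and heritabilities from Design I mean squares."""
    var_e = M3                        # error variance
    var_fm = (M2 - M3) / r            # females within males
    var_m = (M1 - M2) / (r * f)       # males
    var_T = var_m + var_fm + var_e    # total (phenotypic) variance
    var_A = 4 * var_m                 # additive variance
    var_D = 4 * (var_fm - var_m)      # dominance variance
    return {
        "var_e": var_e, "var_fm": var_fm, "var_m": var_m, "var_T": var_T,
        "var_A": var_A, "var_D": var_D,
        "h2_narrow": 4 * var_m / var_T,
        "h2_broad": 4 * var_fm / var_T,
        "h2_m_plus_f": 2 * (var_m + var_fm) / var_T,
    }

# Hypothetical mean squares for one set
res = design1_components(M1=44.0, M2=28.0, M3=20.0, r=2, f=4)
for name, value in res.items():
    print(f"{name} = {value:.4f}")
```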
DESIGN I
Selection Phase:
HS Family Selection
– based on performance of HS families in
Season 2 Yield Test. For Recombination
phase, use remnant self seeds from males in
Season 1.
• The phases involving mating, testing,
selection and recombination of selected
families are conducted in a cyclic manner.
Design II
• Uses:
– to estimate genetic components of variance
– to estimate degree of dominance
– to estimate epistatic variance
– to calculate progress from selection
• Also called Factorial Mating Design, where
every male is crossed to every female in a
factorial manner.
Design II
Example:
Male inbred = 4
1 2 3 4
5
Female inbred= 4 6
7
8  produce 16 FS families
- Population size to be tested is bigger – about twice the size that
of Design I
Example: with 4 males, 4 females, 16 crosses:
Design II
• Requires bigger population size in order to
obtain information with the same precision
as Design I,
• Although the population to be used is
much larger, the advantages of Design II
are that:
– it can estimate epistasis
– suitable to be used in situation where some
degree of inbreeding occurs in the population.
ANOVA – Design II
Source        d.f.                 EMS
Rep (2)       r-1 = 1
Males (4)     m-1 = 3              $\sigma^2_e + r\sigma^2_{MF} + rf\sigma^2_M$
Females (4)   f-1 = 3              $\sigma^2_e + r\sigma^2_{MF} + rm\sigma^2_F$
M x F         (m-1)(f-1) = 9       $\sigma^2_e + r\sigma^2_{MF}$
Error         (mf-1)(r-1) = 15     $\sigma^2_e$
Total         rmf-1 = 31
• From here, the genetic components of
variance and heritability can be computed.
Given:
$\sigma^2 = \sigma^2_e$
$\sigma^2_F = (MS_f - MS_{m \times f}) / rm$
$\sigma^2_M = (MS_m - MS_{m \times f}) / rf$
$\quad = \tfrac{1}{4}V_A$
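Solving the Design II expected mean squares for each component can be sketched as below. This is an illustration under stated assumptions: the function name and the mean squares in the usage example are hypothetical, and the divisors rf and rm are taken from the coefficients of the male and female components in the EMS column.

```python
def design2_components(ms_m, ms_f, ms_mf, ms_e, r, m, f):
    """Variance components from a Design II ANOVA (illustrative sketch).

    ms_m, ms_f, ms_mf, ms_e: mean squares for males, females, M x F, error
    r, m, f: numbers of replications, males and females
    """
    sigma2_e = ms_e
    sigma2_MF = (ms_mf - ms_e) / r       # interaction component
    sigma2_M = (ms_m - ms_mf) / (r * f)  # male component
    sigma2_F = (ms_f - ms_mf) / (r * m)  # female component
    return {
        "sigma2_e": sigma2_e,
        "sigma2_MF": sigma2_MF,
        "sigma2_M": sigma2_M,
        "sigma2_F": sigma2_F,
        "V_A": 4 * sigma2_M,             # since sigma2_M = 1/4 V_A
    }


# Hypothetical mean squares for 4 males, 4 females, 2 reps
comp = design2_components(ms_m=120, ms_f=88, ms_mf=40, ms_e=20, r=2, m=4, f=4)
for name, value in comp.items():
    print(f"{name}: {value:.2f}")
```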
Design II
Design III
• Uses:
1. More powerful in estimating the
degree of dominance,
i.e. with a lesser amount of data, it
gives a stronger estimate of the degree
of dominance.
Relative amount of data required:
Design-I = 10-12 times
Design-II = 3-4 times
Design-III = 1 time
Design-III: Uses
2. Which generation to use (F2, F4, etc.)
depends on the presence or absence
of linkage
– the stronger the linkage, the more
advanced the generation required.
♀ ♀ (Original stock P1 – e.g. Inbred line A)
Female ♀ ♀
♀ ♀
♀ ♀
Male ♂ ♂ ♂ ♂ ♂ ♂ ♂ ♂...... (any generation from the cross between the
2 parental stocks:)
♀ ♀
Female ♀ ♀
♀ ♀
♀ ♀ (Original Stock P2 – e.g. Inbred line B)
• The source populations for this design are normally the
product of a certain programme with specific objectives
• Therefore, the evaluation of the progenies of the 16
male parents (e.g. F2) in Season 2 will involve:
16 x 2 (parents) = 32 FS families/block + 2 checks/rep
x 2 reps/block
____________
72 plots/block
Design III
ANOVA – Design-III (1 block)
_______________________________________________
Source          df              EMS
_______________________________________________
Rep             r-1
Female parent   p-1             $\sigma^2_e + r\sigma^2_{MF} + rn\sigma^2_F$
Male parent     n-1             $\sigma^2_e + r\sigma^2_{MF} + rp\sigma^2_M$
M x F           (n-1)(p-1)      $\sigma^2_e + r\sigma^2_{MF}$
Error           (np-1)(r-1)     $\sigma^2_e$
_______________________________________________
Design III
Pascal’s triangle.
1 no segregating alleles
1 1
1 2 1 two alleles,
1 3 3 1
1 4 6 4 1 four alleles,
1 5 10 10 5 1
1 6 15 20 15 6 1 six alleles,
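The rows of the triangle are binomial coefficients, so any row can be generated directly. The sketch below assumes, in line with the additive model used earlier in the course, that row n gives the relative frequencies of the phenotype classes for n segregating alleles of equal effect; the function name is illustrative.

```python
from math import comb


def pascal_row(n):
    """Return row n of Pascal's triangle: C(n, 0), C(n, 1), ..., C(n, n)."""
    return [comb(n, k) for k in range(n + 1)]


for n in range(7):
    row = pascal_row(n)
    # n + 1 phenotype classes; the coefficients sum to 2**n
    print(n, row, "total =", sum(row))
```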
Line x Tester Analysis
• Kempthorne (1957)
• Broad-based Tester
• Narrow-based Testers
• Why L x T
– Cost
Line x Tester Analysis
• Uses
– Information on GCA
– Information on SCA
– Information on gene effects
– Male female relationship
– Grouping
Line x Tester Analysis
T1 T2 T3
L1
L2
L3
.
.
Line x Tester Analysis
Selection is based on performance of the hybrids in the
Season 2 Yield Test.
Select good combinations.
The phases involve crossing and testing.
Line x Tester Analysis
ANOVA – Line x Tester
_______________________________________________
Source            df            MS
_______________________________________________
Rep               r-1
Genotypes         g-1
  Parents         p-1
  P vs. C         1
  Crosses         c-1
    Lines         l-1
    Testers       t-1
    L x T         (l-1)(t-1)
Error             (r-1)(g-1)
_______________________________________________
$SS_c = \sum C_i^2 / r - C.F.$, where $C.F. = (GT_c)^2 / rc$
$SS_p = \sum p_i^2 / r - C.F.$, where $C.F. = (GT_p)^2 / rp$
$SS_{p\,vs.\,c} = SS_g - SS_c - SS_p$
Line     T1    T2    T3    Total
1        C1    C2          L1
2                          L2
3                          L3
4                          L4
5                          L5
Total    T1    T2    T3    GT
$SS_L = \sum L_i^2 / tr - C.F.(\text{crosses})$
$SS_T = \sum T_i^2 / lr - C.F.(\text{crosses})$
$SS_{L \times T} = SS_c - SS_L - SS_T$
$SS_c = SS_g - SS_p - SS_{p\,vs.\,c}$
$SS_p = SS_g - SS_c - SS_{p\,vs.\,c}$
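The partition of the crosses sum of squares into lines, testers and L x T can be sketched from a table of cross totals. This is a minimal illustration: the function name and the 2 x 2 data are hypothetical, and each entry is assumed to be the total of one cross over r replications.

```python
def line_tester_ss(cross_totals, r):
    """Sums of squares for crosses, lines, testers and L x T.

    cross_totals[i][j]: total over r replications of the cross line i x tester j
    """
    l = len(cross_totals)      # number of lines
    t = len(cross_totals[0])   # number of testers
    grand_total = sum(sum(row) for row in cross_totals)
    cf = grand_total ** 2 / (r * l * t)   # correction factor for crosses

    ss_c = sum(c ** 2 for row in cross_totals for c in row) / r - cf
    line_totals = [sum(row) for row in cross_totals]
    tester_totals = [sum(col) for col in zip(*cross_totals)]
    ss_l = sum(x ** 2 for x in line_totals) / (t * r) - cf
    ss_t = sum(x ** 2 for x in tester_totals) / (l * r) - cf
    return {"SS_c": ss_c, "SS_L": ss_l, "SS_T": ss_t,
            "SS_LxT": ss_c - ss_l - ss_t}


# Hypothetical totals: 2 lines x 2 testers, 2 replications
ss = line_tester_ss([[10, 14], [6, 10]], r=2)
print(ss)
```

The example data are purely additive (each line and tester shifts the total by a fixed amount), so the interaction sum of squares comes out as zero.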
5 lines, 3 testers, 4 reps.
Sources           d.f.
Blocks
Genotypes
  Parents, P
  P vs. C
  Crosses
    Lines
    Testers
    L x T
Error
5 lines, 3 testers, 4 reps.
Sources           d.f.    MS
Blocks            3       27.66ns
Genotypes         22      1479**
  Parents, P      7       899**
  P vs. C         1       53ns
  Crosses         14      1871**
    Lines         4       2579ns
    Testers       2       859ns
    L x T         8       1770**
Error             66      91
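The degrees of freedom in this table follow directly from the numbers of lines, testers and replications, which the short sketch below reproduces. The function name is illustrative. The F ratios test lines and testers against the L x T mean square; that choice is an assumption, though it appears consistent with the "ns" marks in the table.

```python
def line_tester_df(l, t, r):
    """Degrees of freedom for a line x tester ANOVA that includes the parents."""
    c = l * t   # crosses
    p = l + t   # parents (lines + testers)
    g = c + p   # all genotypes
    return {
        "Blocks": r - 1,
        "Genotypes": g - 1,
        "Parents": p - 1,
        "P vs. C": 1,
        "Crosses": c - 1,
        "Lines": l - 1,
        "Testers": t - 1,
        "L x T": (l - 1) * (t - 1),
        "Error": (r - 1) * (g - 1),
    }


df = line_tester_df(l=5, t=3, r=4)
print(df)

# Mean squares copied from the table above
ms = {"Lines": 2579, "Testers": 859, "L x T": 1770, "Error": 91}
f_lines = ms["Lines"] / ms["L x T"]      # assumed denominator: L x T
f_testers = ms["Testers"] / ms["L x T"]  # assumed denominator: L x T
f_lxt = ms["L x T"] / ms["Error"]
print(round(f_lines, 2), round(f_testers, 2), round(f_lxt, 2))
```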

  • 10. Development of Quantitative Genetics cont. • Wright (1932) studied coat color in guinea pigs and recognized the importance of gene interaction • Inbreeding, non-random mating and selection on the genetic composition of a population. • The two most important contribution of Wright are the concept of inbreeding coeficient and effective population size.
  • 11. Steps in experimental methods: • Define and state the problem • State objectives • Develop hypothesis • Implement the experiment • Data collection and analysis • Interpretation of results • Preparation of complete & precise report
  • 12. To produce an acceptable result • The trials must be designed properly • Data must be collected properly • Correct analytical method must be used NB: It is quite difficult to compromise at analysis stage if the design was initially wrong
  • 13. Basic Statistical Terms and Concepts
  • 14. Categories of genetic studies 1. Qualitative characters • Characters that could be grouped into , kinds, types/classes.
  • 15. Statistics… Statistics Inferential (Test hypothesis, make conclusion ) (Making decision about population based on sample) Descriptive (Describe characteristics, organize and summarize) (mean, mode, median)
  • 16.
  • 17.
  • 18. Categories of genetic studies 3. Quantitative Genetics • Quantitative genetics is a field of science involving transmission, inheritance or heredity of variation among quantitative traits of individuals, i.e. variation in traits that can only be differentiated using measurements.
  • 19.
  • 20. Characteristics of QT cont. • Although there are 27 genotypes, many of them have the same phenotype and hence there are only seven phenotypes (0, 1, 2, 3, 4, 5 and 6). • Therefore there is no strict one to one realtionship between genotype and phenotype.
  • 21. Experimental Error In the process of experimentation there are several sources of errors that may be encountered at all stages of the work. • Inaccurate equipment • Personal bias • Inadequate replication • Lack of uniformity in soil fertility • Topography or drainage • Damage by rodents, birds, insects & diseases
  • 22. Experimental Error Precision • Precision is the closeness of repeated measurements Accuracy • Accuracy is the closeness of a measured or computed value to its true value
  • 23. Hypothesis • A proposed explanation made on the basis of limited evidence • A starting point for further investigation
  • 24. Hypothesis Null hypothesis • Any hypothesis to be tested and is denoted by H0 • There is no difference between treatment Alternate hypothesis • Denoted by H1 or HA • There is at least one treatment different from other If one plant is watered with distilled water and the other with mineral water, then there is no difference in the growth of these two plants
  • 25. Characteristics of QT cont. • Many different genotypes can have the same phenotype. Considering k number of genes, all having an equal effect on a trait. If there are two alleles at each locus and that they exhibit co- dominance (neither allele is dominant), then there will be a total of 3k genotypes. For example with k = 3 the following genotypes and phenotypes can be shown, assuming each A, B and C allele adds one unit to the phenotype:
  • 26. • Type-I error • Rejection of the null hypothesis when it is true • If you get significance and you’re wrong, it’s a false-positive • The probability of finding a difference with our sample compared to population, and there really isn’t one
  • 27. • Type-II error • Acceptance of the null hypothesis when it is false • If you get non-significance and you’re wrong, it’s a false negative • The probability of not finding a difference that actually exists between our sample compared to the population.
  • 28. Characteristics of QT cont. No. Genotype Phenotype 1. AABBCC 6 units 2. AABBCc 5 units 3. AABBcc 4 units 4. AABbCC 5 units 5. AABbCc 4 units 6. AAbbCc 3 units 7. AAbbCC 4 units . . . . . . 27. aabbcc 0 units
  • 29. Characteristics of QT cont. 1 gene → 3 genotypes = 3 phenotypes 2 genes → 9 genotypes = 5 phenotypes 3 genes →27genotypes = 7 phenotypes n genes → 3n genotypes = 2n+1 pheno.
  • 30. Genotypic & Metric values • The A allele will give 4 units while the a allele will provide 2 units. At the other locus, the B allele will contribute 2 units while the b allele will provide 1 units. With two genes controlling a trait, nine different genotypes are possible. Below are the genotypes and their associated metric values:
  • 31. Genotype Ratio in F2 Metric value AABB 1 12 AABb 2 11 AAbb 1 10 AaBB 2 10 AaBb 4 9 Aabb 2 8 aaBB 1 8 aaBb 2 7
  • 32. • A factor is a procedure or condition whose effect is to be measured. • Treatment • Is the level or rate of a certain experimental factor • a treatment may be a standard ration, inoculation, and a spraying rate/spraying schedule
  • 33. Characteristics of QT cont. 2. Dominance (allelic-interaction) can obscure the true genotype effects. 3. Environmental variation and the interaction of genotype with environment obscure genetical effects. 4. Epistasis (non-allelic interaction) would impose limitation to make prediction, for example, predicted response to selection.
  • 35.
  • 36. Categories of genetic studies 2. Molecular Molecular genetics on the other hand, deals with biochemical and molecular mechanisms by which hereditary information is stored in DNA (deoxyribonucleic acid) and subsequently transmitted to proteins. DNA is the molecule that stores genetic information within the cell.
  • 37.
  • 38. • Continuous Vs Discrete variables • Continuous – Infinite values in between – eg. height of students, GPA etc • Discrete – separate categories – eg. letter grade
  • 39.
  • 40.
  • 41.
  • 42. 2. Gene and Genotype Frequencies Assuming that, in a population of diploid organisms, the composition of a population, in terms of gene A and a is as follows: AA Aa aa Total Number 2 12 26 40 Proportion 2/40 12/40 26/40 0.05 0.30 0.65 1.0 No. (A) 2(2) = 4 1(12) = 12 0(26) = 0 No. (a) 0(2) = 0 1(12) = 12 2(26) = 52 Total Alleles 4 24 52 80
  • 43. 2 x No. aa + 1 x No. Aa Freq. a = q = ------------------------------ Total No. Alleles = (2 x 26) + (1 x 2) --------------------- 80 = 52 + 12 ---------- 80 = 64 --- 80 = 0.8
  • 44. Random Mating • Random mating occurs when every individual in the population has the same probability (chance) to mate with every other individual in the population. • Random mating is also called panmixia, while the population involved is called a panmictic population. • In a panmictic population, panmixia usually only occurs in large populations - with hundreds or thousands of individuals.
  • 45. Measure of central tendency • The three most common measures of central tendency • Mean o Median o Mode
  • 46. Mean • Mean is the arithmetic average of the values. • To calculate the mean, all measurements are added and then be divided by the number of observations.
  • 47. Median • Is the value that exactly separates the upper half of the distribution from the lower half. • Median is the point located in such a way that 50% of the scores are lower than the median and the other 50% are greater than the median.
  • 48. Mode • Mode is the most frequent value. • It is categorized as a measure of central tendency, because a glance at a graph of the frequency distribution shows the grouping about a central point • Mode is the highest point in the hump or it is the most frequent score.
  • 49. Measure of dispersion • Range • Standard deviation • Variance
  • 50. Methods of Data Collection • Observation • Interview • Questionnaire
  • 51. Methods of Data Collection • Observation • Interview • Questionnaire
  • 52. 2. These traits are controlled by many genes, and greatly influenced by environmental factors. Therefore, it is important to know how much (percentage) of the variation is heritable and how much is not. Information is important in selection of traits in breeding and selection program. 3. Important in evolution studies. 4. Important in population studies. Importance of Quantitative Genetics
  • 53. Importance of Quantitative Genetics 1. Most economically important traits are categorized here. Products of: • Crops • Livestock • Micro-organisms
  • 54. • Sampling techniques  Probability (Random) Sampling  Non-probability (Non-random) Sampling
  • 55. • Probability (Random) Sampling  Simple random sampling  Systematic sampling  Stratified sampling  Clustered sampling  Multistage random sampling  Stratified multistage random sampling
  • 56. • Non-probability (Non-random) Sampling  Quota sampling  Purposive Sampling  Convenience sampling
  • 57. Sampling methods •Probability Sample • Every unit in the population has a chance (greater than zero) of being selected in the sample • Probability samples are the best to ensure representativeness and precision
  • 58. Simple random sampling • Applicable when population is small, homogeneous & readily available • This is done by assigning a number to each unit in the sampling frame. • A table of random number or lottery system is used to determine which units are to be selected.
  • 59. • Systematic sampling • Relies on arranging the target population according to some ordering scheme and then selecting elements at regular intervals through that ordered list. • Involves a random start and then proceeds with the selection of every kth element from then onwards. • A simple example would be to select every 10th name from the telephone directory
  • 60. • Stratified sampling • Where population embraces a number of distinct categories, the frame can be organized into separate "strata". • Each stratum is then sampled as an independent sub- population, out of which individual elements can be randomly selected.
  • 61. • Cluster sampling • An example of 'two-stage sampling' • First stage a sample of areas is chosen; • Second stage a sample of respondents within those areas is selected. • Population divided into clusters of homogeneous units, usually based on geographical contiguity • The most common variables used in the clustering population are the geographical area, buildings, school, etc
  • 62. • Non- probability samples – Probability of being chosen is unknown, cheaper- but unable to generalize;potential for bias • Convenience samples (ease of access) – Sample is selected from elements of a population that are easily accessible
  • 63. • Purposive sampling (judgemental) • You chose who you think should be in the study • This is used primarily when there is a limited number of people that have expertise in the area being researched
  • 64. • Quota sample • The selection is non-random • For example, interviewers might be tempted to interview those people in the street who look most helpful, or may choose to use accidental sampling to question those closest to them, to save time.
  • 65. • Quota sample • The selection is non-random • For example, interviewers might be tempted to interview those people in the street who look most helpful, or may choose to use accidental sampling to question those closest to them, to save time.
  • 66. • Quota sample • The selection is non-random • For example, interviewers might be tempted to interview those people in the street who look most helpful, or may choose to use accidental sampling to question those closest to them, to save time.
  • 67. With random mating, the frequencies of matings between genotypes AA, Aa and aa (frequencies $P$, $H$, $Q$) are:
                 AA ($P$)   Aa ($H$)   aa ($Q$)
    AA ($P$)     $P^2$      $PH$       $PQ$
    Aa ($H$)     $PH$       $H^2$      $HQ$
    aa ($Q$)     $PQ$       $HQ$       $Q^2$
  • 68. As a result of panmixia, progenies with the following proportions are obtained:
    Mating     Frequency   Progeny genotype frequency
                           AA                      Aa                                        aa
    AA x AA    $P^2$       $P^2$                   -                                         -
    AA x Aa    $2PH$       $PH$                    $PH$                                      -
    AA x aa    $2PQ$       -                       $2PQ$                                     -
    Aa x Aa    $H^2$       $\tfrac{1}{4}H^2$       $\tfrac{1}{2}H^2$                         $\tfrac{1}{4}H^2$
    Aa x aa    $2HQ$       -                       $HQ$                                      $HQ$
    aa x aa    $Q^2$       -                       -                                         $Q^2$
    Total      1           $(P + \tfrac{1}{2}H)^2$ $2(P + \tfrac{1}{2}H)(Q + \tfrac{1}{2}H)$ $(Q + \tfrac{1}{2}H)^2$
                           $= p^2$                 $= 2pq$                                   $= q^2$
  • 69. In random mating, the union of gametes is also random. Therefore, in a population with genotypes AA, Aa and aa, the gamete frequencies are A = $p$ and a = $q$, and fusion between the two kinds of gametes produces:
                               Male gametes
                               A ($p$)      a ($q$)
    Female gametes  A ($p$)    AA $p^2$     Aa $pq$
                    a ($q$)    Aa $pq$      aa $q^2$
    i.e.
    Genotype     AA       Aa       aa
    Frequency    $p^2$    $2pq$    $q^2$
  • 70. The formation of this new progeny population shows that the composition of the succeeding generation depends on the gene frequencies of the initial population. Parents of genotypes AA, Aa and aa produce gametes A ($p$) and a ($q$), which unite at random:
                 A ($p$)      a ($q$)
    A ($p$)      AA $p^2$     Aa $pq$
    a ($q$)      Aa $pq$      aa $q^2$
  • 71. The gene frequencies in this population are:
    $p = P + \tfrac{1}{2}H = p^2 + \tfrac{1}{2}(2pq) = p^2 + pq = p(p + q) = p$
    $q = Q + \tfrac{1}{2}H = q^2 + \tfrac{1}{2}(2pq) = q^2 + pq = q(p + q) = q$
    • This shows that, in a panmictic population, gene and genotype frequencies remain constant.
  • 72. Hardy-Weinberg Law of Equilibrium • In a large and panmictic population, considering one locus (unlinked gene), in the absence of migration, mutation and selection, gene and genotype frequencies in the population remain constant from one generation to another.
  • 74. The relationship between gene and genotype frequencies in a population in Hardy-Weinberg equilibrium is:
    Gene frequency      Genotype frequency
    A ($p$)   a ($q$)   AA ($p^2$)   Aa ($2pq$)   aa ($q^2$)
    1         0         1            0            0
    0.8       0.2       0.64         0.32         0.04
    0.5       0.5       0.25         0.5          0.25
    0.2       0.8       0.04         0.32         0.64
    0         1         0            0            1
  • 75. For the Hardy-Weinberg Law of Equilibrium to hold, four stages are involved, each with its own conditions:
    1. Gene frequency of parents. Conditions: gene segregation normal; parents/gametes normal; mating of gametes random (large population)
    2. Zygote genotype frequency
    3. Progeny genotype frequency. Condition: equal viability
    4. Progeny gene frequency
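The table of gene and genotype frequencies can be reproduced with a few lines of code. This is a minimal sketch; the function name `hardy_weinberg` is an illustrative choice, not something taken from the slides.

```python
def hardy_weinberg(p):
    """Return the (AA, Aa, aa) genotype frequencies for frequency p of allele A."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must lie between 0 and 1")
    q = 1.0 - p  # frequency of allele a
    return p * p, 2.0 * p * q, q * q


# Same gene frequencies as the rows of the table above
for p in (1.0, 0.8, 0.5, 0.2, 0.0):
    freq_AA, freq_Aa, freq_aa = hardy_weinberg(p)
    print(f"p={p:.1f} q={1 - p:.1f}  AA={freq_AA:.2f} Aa={freq_Aa:.2f} aa={freq_aa:.2f}")
```

The three frequencies always sum to 1, which is a quick sanity check on any hand calculation.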
  • 76. Multiple Alleles • In some situations, there are more than two alleles at a locus. In this case, the population will still reach equilibrium after one generation of random mating. This can be shown either by - random mating of gametes, or - random mating of genotypes • Assuming the case of three alleles at one locus, A, a' and a:
                 Gene                     Genotype
                 A      a'     a          AA       Aa'      Aa       a'a'     a'a      aa
    Frequency    $p$    $q$    $r$        $p^2$    $2pq$    $2pr$    $q^2$    $2qr$    $r^2$
  • 77. The proof, after random mating of gametes:
                  A ($p$)       a' ($q$)       a ($r$)
    A ($p$)       AA $p^2$      Aa' $pq$       Aa $pr$
    a' ($q$)      Aa' $pq$      a'a' $q^2$     a'a $qr$
    a ($r$)       Aa $pr$       a'a $qr$       aa $r^2$
    Inference:
    Genotype     AA       Aa'      Aa       a'a'     a'a      aa
    Frequency    $p^2$    $2pq$    $2pr$    $q^2$    $2qr$    $r^2$
    Symbol       $P$      $Q$      $R$      $S$      $T$      $U$
  • 78. After random mating of gametes:
    $p_A = \dfrac{2P + Q + R}{2(P + Q + R + S + T + U)} = \dfrac{2P + Q + R}{2} = P + \tfrac{1}{2}Q + \tfrac{1}{2}R$
    $= p^2 + \tfrac{1}{2}(2pq) + \tfrac{1}{2}(2pr) = p^2 + pq + pr = p(p + q + r) = p$
  • 79. After random mating of gametes:
    $q_{a'} = S + \tfrac{1}{2}Q + \tfrac{1}{2}T = q^2 + \tfrac{1}{2}(2pq) + \tfrac{1}{2}(2qr) = q^2 + pq + qr = q(q + p + r) = q$
    $r_a = U + \tfrac{1}{2}R + \tfrac{1}{2}T = r^2 + \tfrac{1}{2}(2pr) + \tfrac{1}{2}(2qr) = r^2 + pr + qr = r(r + p + q) = r$
  • 80. Multiple Alleles • However, sometimes not every genotype can be distinguished by phenotype, for example:
    Genotype       Aa'      AA, Aa          a'a', a'a       aa
    Blood group    AB       A               B               O
    Frequency      $2pq$    $p^2 + 2pr$     $q^2 + 2qr$     $r^2$
    The easiest way to calculate the gene frequencies is by the reverse method, as follows: $r_a = \sqrt{r^2} = \sqrt{O}$. What is $p_A$?
  • 81. The reverse method, cont.
    $B + O = q^2 + 2qr + r^2 = (q + r)^2$, but $q + r = 1 - p$, therefore $(1 - p)^2 = B + O$, so $1 - p = \sqrt{B + O}$ and $p = 1 - \sqrt{B + O}$.
    What is $q_{a'}$? $A + O = p^2 + 2pr + r^2 = (p + r)^2 = (1 - q)^2$, so $\sqrt{A + O} = 1 - q$ and $q = 1 - \sqrt{A + O}$.
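The reverse method can be sketched in code as follows. The phenotype frequencies used here are hypothetical values, generated from assumed allele frequencies p = 0.3, q = 0.1, r = 0.6 so that the result can be checked; they are not data from the course. The function name is likewise illustrative.

```python
import math


def abo_allele_frequencies(freq_a, freq_b, freq_o):
    """Estimate (p, q, r) for alleles A, a', a from blood group frequencies A, B, O."""
    r = math.sqrt(freq_o)                    # r = sqrt(O)
    p = 1.0 - math.sqrt(freq_b + freq_o)     # p = 1 - sqrt(B + O)
    q = 1.0 - math.sqrt(freq_a + freq_o)     # q = 1 - sqrt(A + O)
    return p, q, r


# Hypothetical population: A = p^2 + 2pr = 0.45, B = q^2 + 2qr = 0.13, O = r^2 = 0.36
p, q, r = abo_allele_frequencies(0.45, 0.13, 0.36)
print(f"p = {p:.3f}, q = {q:.3f}, r = {r:.3f}")
```

With real data the three estimates rarely sum exactly to 1, because the phenotype frequencies carry sampling error; with the constructed values above they do.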
  • 82. Factors affecting Equilibrium 1. Sex Linkage • Some genes are located on the sex chromosomes, so their inheritance is tied to sex. There are two combinations of sex chromosomes: homogametic (XX, female) and heterogametic (XY or XO, male). Therefore, there are more possible genotypes.
  • 83. Sex Linkage For one locus, A/a, the possible genotypes are:
                 Male (XY)            Female (XX)
    Genotype     A        a           AA          Aa          aa
    Notation     X^A Y    X^a Y       X^A X^A     X^A X^a     X^a X^a
  • 84. Sex Linkage • Assuming that the gene frequencies in the female and male populations are equal, A = $p$, a = $q$, the panmictic population will reach equilibrium. The mating frequencies are:
                        Female
                        AA ($p^2$)   Aa ($2pq$)   aa ($q^2$)
    Male   A ($p$)      $p^3$        $2p^2q$      $pq^2$
           a ($q$)      $p^2q$       $2pq^2$      $q^3$
  • 85. Progenies from each mating:
                                       Female progeny                    Male progeny
    Mating (female x male)   Freq.     AA        Aa        aa            A         a
    AA x A                   $p^3$     $p^3$     -         -             $p^3$     -
    Aa x A                   $2p^2q$   $p^2q$    $p^2q$    -             $p^2q$    $p^2q$
    aa x A                   $pq^2$    -         $pq^2$    -             -         $pq^2$
    AA x a                   $p^2q$    -         $p^2q$    -             $p^2q$    -
    Aa x a                   $2pq^2$   -         $pq^2$    $pq^2$        $pq^2$    $pq^2$
    aa x a                   $q^3$     -         -         $q^3$         -         $q^3$
    Totals:
    AA: $p^3 + p^2q = p^2(p + q) = p^2$
    Aa: $2pq^2 + 2p^2q = 2pq(q + p) = 2pq$
    aa: $pq^2 + q^3 = q^2(p + q) = q^2$
    A: $p^3 + 2p^2q + pq^2 = p(p^2 + 2pq + q^2) = p$
    a: $q^3 + 2pq^2 + p^2q = q(q^2 + 2pq + p^2) = q$
  • 86. Sex Linkage Equilibrium will only be reached if the gene frequencies in the males and females are the same, i.e. $p_f = p_m$. Example: let $p_f = p_m = 0.4$ and $q_f = q_m = 0.6$:
                 Male              Female
                 A       a         AA      Aa      aa
    Frequency    0.4     0.6       0.16    0.48    0.36
  • 87. Sex Linkage • If the gene frequencies in the males and females are not equal, equilibrium will not be reached after one generation of panmixia. This is shown below:
                 Female                  Male
                 AA     Aa     aa        A      a
    Frequency    $P$    $H$    $Q$       $R$    $S$
    $p_f = P + \tfrac{1}{2}H$, $p_m = R$, and in the whole population $p = \tfrac{1}{3}p_m + \tfrac{2}{3}p_f$
  • 88. Sex Linkage • After panmixia, male progenies receive their genes from the female parents only, while female progenies receive half of their genes from the female parents and half from the male parents. Writing a prime for the frequency in the previous generation, the gene frequencies after one generation of panmixia are:
    $p_m = p_f'$
    $p_f = \tfrac{1}{2}(p_f' + p_m')$
    $p_f - p_m = \tfrac{1}{2}(p_f' + p_m') - p_f' = -\tfrac{1}{2}p_f' + \tfrac{1}{2}p_m' = -\tfrac{1}{2}(p_f' - p_m')$
    i.e. 1. the difference in gene frequencies between the males and females is halved in every generation of panmixia; 2. the direction of the difference is reversed every generation.
  • 89. Example: Initial population:
                 Male              Female
                 A       a         AA      Aa      aa
    Frequency    0.2     0.8       0.2     0.6     0.2
    $p_m = 0.2$, $p_f = 0.2 + \tfrac{1}{2}(0.6) = 0.5$
    $p_m = p_f'$, $p_f = \tfrac{1}{2}(p_f' + p_m')$
    $p = \tfrac{1}{3}(0.2) + \tfrac{2}{3}(0.5) = 0.4$
  • 90.
    Generation   $p_m$       $p_f$        $p_f - p_m$
    0            0.2         0.5          +0.3
    1            0.5         0.35         -0.15
    2            0.35        0.425        +0.075
    3            0.425       0.3875       -0.0375
    4            0.3875      0.40625      +0.01875
    5            0.40625     0.396875     -0.009375
    6            0.396875    0.4015625    +0.0046875
    ...
    n            0.40000     0.40000      0.00000
  • 91. [Figure: $p_m$ and $p_f$ plotted against generation number (0 to n), gene frequency axis from 0 to 0.6]
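The approach to equilibrium for a sex-linked gene can be traced with a short loop. This is a sketch using the recurrences stated earlier (males take the previous female frequency, females take the mean of the previous female and male frequencies); the function name is illustrative.

```python
def sex_linked_series(pm0, pf0, generations):
    """Return rows (generation, pm, pf) under random mating for a sex-linked locus."""
    rows = [(0, pm0, pf0)]
    pm, pf = pm0, pf0
    for generation in range(1, generations + 1):
        # Males get their X from mothers; females get one X from each parent.
        pm, pf = pf, 0.5 * (pf + pm)
        rows.append((generation, pm, pf))
    return rows


for generation, pm, pf in sex_linked_series(0.2, 0.5, 6):
    print(f"{generation}  pm={pm:.7f}  pf={pf:.7f}  pf-pm={pf - pm:+.7f}")
```

The weighted mean (pm + 2 pf) / 3 stays at 0.4 in every generation, which is the value both sexes converge to.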
  • 92. 2. Two (or more) Linked Loci • Equilibrium in the population is reached after one generation of random mating if each locus is considered separately. • Equilibrium is not reached in one generation if the loci are considered together. Equilibrium is approached more slowly the more tightly the loci are linked. Assuming 2 loci, A/a and B/b, with gene frequencies:
    Gene         A      a      B      b
    Frequency    $p$    $q$    $r$    $s$
  • 93. At equilibrium, the genotype frequencies are:
    AABB        AABb        AAbb        AaBB        AaBb      Aabb        aaBB        aaBb        aabb
    $p^2r^2$    $2p^2rs$    $p^2s^2$    $2pqr^2$    $4pqrs$   $2pqs^2$    $q^2r^2$    $2q^2rs$    $q^2s^2$
    Whether equilibrium is reached depends on the gamete frequencies:
    Gamete       AB      Ab      aB      ab
    Frequency    $pr$    $ps$    $qr$    $qs$
  • 94. 2. Two (or more) Linked Loci • Equilibrium will be reached after one generation of random mating if all the gene frequencies are the same, i.e. p = q = r = s = 0.5, or pr = ps = qr = qs = 0.25. • At equilibrium, it is expected that the frequency of the repulsion phase gametes equals the frequency of the coupling phase gametes. [Diagram: homologous chromosomes carrying A B and a b, with a crossover between the two loci] AB, ab = coupling phase gametes; Ab, aB = repulsion phase gametes
  • 95. At equilibrium, (AB)(ab) = (Ab)(aB). For example, with A = B = 0.6 and a = b = 0.4:
    Gamete       AB      Ab      aB      ab
    Frequency    0.36    0.24    0.24    0.16
    (0.36 x 0.16) = (0.24 x 0.24), i.e. 0.0576 = 0.0576
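The equilibrium condition (AB)(ab) = (Ab)(aB) can be checked numerically. In the sketch below the difference between the two products is called `d`; that name, and the second set of gamete frequencies, are assumptions introduced for illustration only.

```python
def gametic_difference(freq_AB, freq_Ab, freq_aB, freq_ab):
    """Coupling product minus repulsion product; zero at equilibrium."""
    return freq_AB * freq_ab - freq_Ab * freq_aB


# Example from the text: A = B = 0.6, a = b = 0.4
d_equilibrium = gametic_difference(0.36, 0.24, 0.24, 0.16)

# Hypothetical population with the same gene frequencies but an excess of coupling gametes
d_excess = gametic_difference(0.46, 0.14, 0.14, 0.26)

print(f"equilibrium example: d = {d_equilibrium:.4f}")
print(f"hypothetical excess: d = {d_excess:.4f}")
```

Both sets of gametes give A = B = 0.6, yet only the first satisfies the condition, which illustrates why the loci must be considered together.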
  • 96. 3. Changes in Gene Frequencies in Populations • According to Hardy-Weinberg Law of Equilibrium, considering only one locus (gene), a population will be at equilibrium after one generation of random mating, in the absence of migration, mutation and selection.
  • 97. Migration Let, in a large population: m = proportion of new immigrants, 1 - m = proportion of natives. Let the frequency of a certain gene among the immigrants = $q_m$ and among the natives = $q_0$. Then the gene frequency in the combined population is:
    $q_1 = m q_m + (1 - m) q_0 = m(q_m - q_0) + q_0$
  • 98. Change in gene frequency as a result of immigration: $\Delta q = q_1 - q_0 = m(q_m - q_0)$ • It can therefore be concluded that the change in gene frequency in the new population depends on: – the migration rate, and – the difference in gene frequencies between the immigrants and the natives.
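A small numerical sketch of the migration formulas follows. The values m = 0.1, q0 = 0.3 and qm = 0.7 are hypothetical and chosen only to make the arithmetic easy to follow.

```python
def frequency_after_migration(q0, qm, m):
    """Gene frequency in the mixed population: q1 = m*qm + (1 - m)*q0."""
    return m * qm + (1.0 - m) * q0


def change_from_migration(q0, qm, m):
    """Change in gene frequency: delta q = m * (qm - q0)."""
    return m * (qm - q0)


q0, qm, m = 0.3, 0.7, 0.1  # hypothetical natives, immigrants, migration rate
q1 = frequency_after_migration(q0, qm, m)
print(f"q1 = {q1:.2f}, delta q = {change_from_migration(q0, qm, m):.2f}")
```

Doubling either the migration rate or the difference between immigrants and natives doubles the change, as the formula implies.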
  • 99. Mutation • Mutation is the sudden change of a gene (allele) in a population to a different form. The effect on the population depends on the kinds of mutation. 2 kinds of mutation:
  • 100. a. Non-Recurrent mutation (AA → Aa) • This kind involves only a rare, one-off change in a large population. It is not important as a cause of change in gene frequency, because its product has a small chance of surviving in a large population. It is normally lost and produces no change in the succeeding generations, as it usually occurs in a heterozygote.
  • 101. b. Recurrent Mutation This kind affects the gene frequency. It recurs, with a certain frequency of occurrence in the population. i. Unidirectional mutation (A → a) Let the mutation rate per generation = $\mu$ ($\mu$ = rate of gene A changing to a per generation). If the frequency of A in a population = $p_0$, the frequency of new a genes in the next generation = $\mu p_0$.
  • 102. With reverse mutation from a to A at rate $\nu$ per generation, equilibrium is reached when $\mu p_0 = \nu q_0$, i.e. $\Delta q = 0$:
    $q_0 = \dfrac{\mu p_0}{\nu} = \dfrac{\mu(1 - q_0)}{\nu} = \dfrac{\mu - \mu q_0}{\nu}$
    $\nu q_0 = \mu - \mu q_0$
    $q_0(\mu + \nu) = \mu$
    $q_0 = \dfrac{\mu}{\mu + \nu}$, so at equilibrium $\hat{q} = \dfrac{\mu}{\mu + \nu}$
    (not influenced by the initial gene frequency, but influenced by the rates of mutation).
  • 103. The effect of mutation on gene frequency: 1. Mutation rates are normally low, $10^{-5}$ to $10^{-6}$ per generation (1 in 100,000 to 1 in 1,000,000 gametes carries a newly mutated allele at a given locus). 2. Mutations are more frequent from the wild type to the mutant type than the reverse. Example: $\mu = 0.00003$, $\nu = 0.00002$. Gene frequency at equilibrium:
    $\hat{q} = \dfrac{0.00003}{0.00003 + 0.00002} = 0.6$
    [Figure: gene frequency scale from 0 to 0.9 with the opposing mutation rates $\mu = 0.00003$ and $\nu = 0.00002$]
  • 104. Number of generations needed for a certain frequency to be reached:
    $n = \dfrac{\ln\left(\dfrac{q_0 - \hat{q}}{q_n - \hat{q}}\right)}{\mu + \nu}$
    Example: $\mu = 0.00003$, $\nu = 0.00002$, $\hat{q} = 0.6$, $q_0 = 0.10$, $q_n = 0.20$:
    $n = \dfrac{\ln\left(\dfrac{0.1 - 0.6}{0.2 - 0.6}\right)}{0.00003 + 0.00002} = \dfrac{\ln 1.25}{0.00005} = 4463$ generations
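The equilibrium frequency and the number of generations can be computed directly from the two formulas. This is a minimal sketch; the parameter names `mu` (A to a) and `nu` (a to A) are labels assumed here for the two mutation rates.

```python
import math


def mutation_equilibrium(mu, nu):
    """Equilibrium frequency of allele a: q = mu / (mu + nu)."""
    return mu / (mu + nu)


def generations_to_reach(q0, qn, mu, nu):
    """Generations needed to move the frequency of a from q0 to qn under mutation alone."""
    q_hat = mutation_equilibrium(mu, nu)
    return math.log((q0 - q_hat) / (qn - q_hat)) / (mu + nu)


mu, nu = 0.00003, 0.00002
print(f"equilibrium q = {mutation_equilibrium(mu, nu):.2f}")
print(f"generations from 0.10 to 0.20 = {generations_to_reach(0.10, 0.20, mu, nu):.0f}")
```

The very large number of generations shows how weak mutation is as an agent of change compared with the other forces discussed.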
  • 105. Two factors determine fitness: a. Long life span. b. Number of offspring produced within a period. These two factors lead to a higher contribution to the succeeding generation. If the difference in fitness is associated with the presence or absence of a gene in the genotype of an individual, selection is said to act on that gene. • The gene frequency in the offspring will not be the same as it was in the parents, because individuals in the parental generation contribute genes to the next generation at different rates. • Selection results in changes in gene frequencies, and hence in genotype frequencies.
  • 106. Kinds of Selection • The kinds of selection are distinguished by the degree of dominance of the gene involved. 1. Selection Against Recessives. s = coefficient of selection; 1 = fitness (contribution) of the favoured genotype; 1 - s = fitness (contribution) of the genotype selected against.
  • 107. Degree of dominance vs. fitness:
    a. No dominance (selection against a)
       Genotype    aa        Aa                   AA
       Fitness     $1 - s$   $1 - \tfrac{1}{2}s$  1
    b. Partial dominance (selection against a; h = degree of dominance)
       Genotype    aa        Aa                   AA
       Fitness     $1 - s$   $1 - hs$             1
  • 108.
    c. Complete dominance (selection against a)
       Genotype    aa        Aa, AA
       Fitness     $1 - s$   1
    d. Overdominance (selection against both homozygotes, AA and aa)
       Genotype    aa          AA          Aa
       Fitness     $1 - s_2$   $1 - s_1$   1
  • 109. Selection Against Recessives - Complete Dominance (Partial Elimination of Recessives)
                              AA       Aa       aa             Total
    Initial frequency         $p^2$    $2pq$    $q^2$          1
    Selection coefficient     0        0        $s$
    Fitness                   1        1        $1 - s$
    Gametic contribution      $p^2$    $2pq$    $q^2(1 - s)$   $1 - sq^2$
  • 110. $q_1$ = frequency of gene 'a' in the following generation:
    $q_1 = \dfrac{q^2(1 - s) + pq}{1 - sq^2}$
    $\Delta q = q_1 - q = \dfrac{pq + (1 - s)q^2}{1 - sq^2} - q$
    $= \dfrac{pq + (1 - s)q^2 - q(1 - sq^2)}{1 - sq^2}$
    $= \dfrac{pq + q^2 - sq^2 - q + sq^3}{1 - sq^2}$
    $= \dfrac{q - q^2 + q^2 - sq^2 - q + sq^3}{1 - sq^2}$
    $\Delta q = \dfrac{-sq^2(1 - q)}{1 - sq^2}$
  • 111. Partial Elimination of Recessives • Determining factors: 1. initial gene freq. 2. selection coefficient.
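The dependence of the change on the initial gene frequency and the selection coefficient can be seen by evaluating the two expressions. This is a sketch under the complete-dominance fitness scheme given earlier; the values of q and s are hypothetical.

```python
def next_q_against_recessives(q, s):
    """Frequency of a after one generation: (q^2 (1 - s) + p q) / (1 - s q^2)."""
    p = 1.0 - q
    return (q * q * (1.0 - s) + p * q) / (1.0 - s * q * q)


def delta_q_against_recessives(q, s):
    """Closed form of the change: -s q^2 (1 - q) / (1 - s q^2)."""
    return -s * q * q * (1.0 - q) / (1.0 - s * q * q)


# Hypothetical initial frequencies and selection coefficients
for q in (0.8, 0.5, 0.1):
    for s in (0.1, 0.5):
        print(f"q={q:.1f} s={s:.1f}  delta q = {delta_q_against_recessives(q, s):+.4f}")
```

The change is always negative, and for a given s it is small when q is low because few recessive homozygotes are exposed to selection.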
  • 112. 2. Complete Elimination of Recessives
                              AA       Aa       aa
    Initial frequency         $p^2$    $2pq$    $q^2$
    Fitness                   1        1        0
    Gametic contribution      $p^2$    $2pq$    0
    $q_1 = \dfrac{q}{1 + q}$
  • 113. $\Delta q = q_1 - q_0 = \dfrac{q}{1 + q} - q = \dfrac{-q^2}{1 + q}$. $\Delta q$ depends on the initial gene frequency. The frequency decreases at a higher rate if the initial frequency is high, and at a lower rate as the gene frequency is gradually reduced.
  • 114.
    $q_2 = \dfrac{q_1}{1 + q_1} = \dfrac{q/(1 + q)}{1 + q/(1 + q)} = \dfrac{\dfrac{q}{1 + q}}{\dfrac{1 + q + q}{1 + q}} = \dfrac{q}{1 + q} \times \dfrac{1 + q}{1 + 2q}$
    $q_2 = \dfrac{q}{1 + 2q}$
    $q_3 = \dfrac{q}{1 + 3q}$
  • 115.
    $q_n = \dfrac{q}{1 + nq}$
    $q_n(1 + nq) = q$, so $q_n + n q_n q = q$
    $n = \dfrac{q - q_n}{q\,q_n}$
    $n = \dfrac{1}{q_n} - \dfrac{1}{q}$
  • 116. Example: $q_0 = 0.2$, $q_n = 0.1$. How many generations are required to reduce the frequency of the recessive gene from $q_0$ to $q_n$ through selection by elimination of all recessives?
    $n = \dfrac{1}{0.1} - \dfrac{1}{0.2} = 5$ generations
    If $q_0 = 0.02$ and $q_n = 0.01$:
    $n = \dfrac{1}{0.01} - \dfrac{1}{0.02} = 50$ generations
  • 117. Selection Against Heterozygotes
                              AA       Aa             aa       Total
    Initial frequency         $p^2$    $2pq$          $q^2$    1
    Fitness                   1        $1 - s$        1        -
    Gametic contribution      $p^2$    $2(1 - s)pq$   $q^2$    $1 - 2pqs$
    $q_1 = \dfrac{pq(1 - s) + q^2}{1 - 2pqs} = \dfrac{q - spq}{1 - 2pqs}$
  • 118.
    $\Delta q = q_1 - q = \dfrac{q - spq}{1 - 2pqs} - q = \dfrac{q - spq - q + 2pq^2s}{1 - 2pqs} = \dfrac{spq(2q - 1)}{1 - 2pqs}$
    If s is very small, 2pqs approaches 0 and $\Delta q \approx spq(2q - 1) = 2spq(q - \tfrac{1}{2})$
  • 119. If $q = \tfrac{1}{2}$, $\Delta q = 0$. If $q > \tfrac{1}{2}$, $\Delta q$ is positive and q increases every generation. If $q < \tfrac{1}{2}$, $\Delta q$ is negative and q decreases every generation. [Figure: $\Delta q$ plotted against q from 0 to 1.0, negative below q = 1/2 and positive above it]
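The generation count can be cross-checked by applying the one-generation recurrence repeatedly. This is a sketch; the function names are illustrative.

```python
def generations_needed(q0, qn):
    """Closed form for complete elimination of recessives: n = 1/qn - 1/q0."""
    return 1.0 / qn - 1.0 / q0


def frequency_after(q0, generations):
    """Apply q -> q / (1 + q) once per generation."""
    q = q0
    for _ in range(generations):
        q = q / (1.0 + q)
    return q


print(f"0.2 -> 0.1 needs {generations_needed(0.2, 0.1):.0f} generations")
print(f"0.02 -> 0.01 needs {generations_needed(0.02, 0.01):.0f} generations")
print(f"after 5 generations from 0.2: q = {frequency_after(0.2, 5):.4f}")
```

Halving a rare allele takes far longer than halving a common one, because most copies of a rare recessive are hidden in heterozygotes.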
  • 120. Selection For Heterozygotes • The normal case in natural situations • Both alleles will be maintained in the population and will not be lost. • With random mating, equilibrium is reached.
                              AA               Aa       aa               Total
    Initial frequency         $p^2$            $2pq$    $q^2$            1
    Fitness                   $1 - s_1$        1        $1 - s_2$        -
    Gametic contribution      $p^2(1 - s_1)$   $2pq$    $q^2(1 - s_2)$   $1 - s_1p^2 - s_2q^2$
  • 121.
    $q_1 = \dfrac{pq + q^2(1 - s_2)}{1 - s_1p^2 - s_2q^2} = \dfrac{q - s_2q^2}{1 - s_1p^2 - s_2q^2}$ (using $p + q = 1$)
    $\Delta q = \dfrac{q - s_2q^2}{1 - s_1p^2 - s_2q^2} - q = \dfrac{q - s_2q^2 - q + s_1p^2q + s_2q^3}{1 - s_1p^2 - s_2q^2}$
    When s is small the denominator approaches 1, so
    $\Delta q \approx s_1p^2q - s_2q^2(1 - q) = s_1p^2q - s_2pq^2$
    $\Delta q = pq(s_1p - s_2q)$
  • 122. Selection For Heterozygotes • If $\Delta q = 0$:
    $s_1p = s_2q$
    $s_1(1 - q) = s_2q$
    $s_1 - s_1q = s_2q$
    $\hat{q} = \dfrac{s_1}{s_1 + s_2}$ (q at equilibrium: balanced polymorphism)
  • 123. Conclusion 1. When selection acts against heterozygotes (H), q increases or decreases depending on the initial q and on s, and q remains constant only at q = ½. 2. With selection for H, no gene is lost or eliminated, and the rate of change in gene frequency depends on the initial gene frequency and the selection coefficients. 3. With selection against recessives, the frequency of the recessive falls very fast if its initial frequency is high, and slowly if it is low.
  • 124. 4. SMALL POPULATION SIZE • Introduction – In previous lectures, we discussed agents of change in gene and genotype frequencies where the population size is large. In a large, random-mating population, in the absence of migration, mutation or selection, gene and genotype frequencies remain constant from one generation to another; changes caused by those agents are systematic processes.
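The balanced polymorphism can be illustrated by iterating the exact one-generation recurrence from different starting points. This is a sketch; the selection coefficients s1 = 0.2 (against AA) and s2 = 0.3 (against aa) are hypothetical values.

```python
def equilibrium_q(s1, s2):
    """Equilibrium frequency of a under heterozygote advantage: s1 / (s1 + s2)."""
    return s1 / (s1 + s2)


def next_q_overdominance(q, s1, s2):
    """Exact recurrence: (q - s2 q^2) / (1 - s1 p^2 - s2 q^2)."""
    p = 1.0 - q
    return (q - s2 * q * q) / (1.0 - s1 * p * p - s2 * q * q)


s1, s2 = 0.2, 0.3  # hypothetical selection coefficients
for start in (0.05, 0.9):
    q = start
    for _ in range(200):
        q = next_q_overdominance(q, s1, s2)
    print(f"start q={start:.2f} -> q after 200 generations = {q:.4f}")
print(f"predicted equilibrium = {equilibrium_q(s1, s2):.4f}")
```

Both starting frequencies end at the same value, which is the sense in which neither allele is lost.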
  • 125. SMALL POPULATION SIZE • These features are not true in small populations. The gene frequencies are exposed to random increase and decrease which occur from gamete sampling, because small populations can be considered as samples of large populations. If the sample size is not large enough, it will not represent the large population, and thus changes of gene frequencies occur. The process of change in gene frequencies at random in a small population is called a dispersive process.
  • 126. Prevailing Situations in a Dispersive Process: 1. Random Drift (Wright's Effect) - Changes in gene frequencies at random. - Frequency changes irregularly from one generation to another, and normally does not return to its initial value. 2. Differentiation among sub-populations Drifts occur independently within the small populations which are contained in the large population. Matings are only confined within the sub-populations. No random mixing of the large population.
  • 127. Prevailing Situations in a Dispersive Process [Figure: small populations contained within a large population]
  • 128. Prevailing Situations in a Dispersive Process 3. Uniformity in small populations – Genetic variations within small populations become small. – Because of inbreeding, etc., many unfavourable effects are seen. 4. Homozygosity increases among individuals within small population. - many unfavourable effects to population. - fertility - viability, etc.
  • 129. Incomplete Block Designs • Large number of treatments to be tested • It is difficult to get uniform blocks large enough to accommodate a complete replication of all the treatments • Precision increases as the block size decreases • Smaller blocks are preferred to larger ones
  • 130. Idealised Population • A large population where mating is at random, and which is then subdivided into many sub-populations. The subdivision is due to geographical or ecological factors (natural), or to controlled mating (laboratory or controlled environment). • The initial population, which undergoes random mating, is called the base population, and the sub-populations are called lines.
  • 131. Idealised Population [Figure: a base population subdivided into lines] • Characteristics of lines can be combined to form the characteristics of the base population.
  • 132. Balanced Incomplete Block Design (BIBD) • Every pair of treatments occurs once in the same incomplete block • All pairs of treatments are compared with the same degree of precision • Each treatment occurs together with every other treatment in the same block equal number of times
  • 133. Balanced Incomplete Block Design (BIBD) • Each block contains the same number of units • Each treatment occurs the same number of times in total • Each pair of treatments occurs together the same number of times in total • A design that satisfies these conditions is called a Balanced Incomplete Block Design
  • 134. Characteristics of Idealised Population 1. Mating only occurs among individuals within a line. = No migration between lines. 2. Generations do not overlap among each other. 3. Number of individuals in each line is the same, = N 4. Random mating among individuals within lines. 5. No selection or mutation at any level.
  • 136. Balanced Incomplete Block Design (BIBD)
  • 137. Idealised Population
    Base population ($N = \infty$), subdivided into lines:
    Individuals    N     N     N     N
    Gametes        2N    2N    2N    2N
    Individuals    N     N     N     N
  • 138. Sampling in Idealised Population • For an idealised population the mean gene frequency over all lines is unchanged, $\bar{q} = q_0$. Because of sampling error,
    $\sigma^2_{\Delta q} = \dfrac{p_0q_0}{2N}$
    = variance of the change in gene frequency. This change occurs when gametes are sampled in each of the lines, so that the final gene frequency in a line is not the same as the initial gene frequency, i.e. $q \neq q_0$.
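The sampling variance can be compared with a small simulation of gamete sampling. This is a sketch under assumed values (q0 = 0.5, N = 20, 2000 lines, fixed random seed); none of these numbers come from the slides.

```python
import random
import statistics


def drift_variance(q0, n_individuals):
    """Expected variance of the one-generation change in q: p0 * q0 / (2N)."""
    return (1.0 - q0) * q0 / (2 * n_individuals)


def simulate_lines(q0, n_individuals, n_lines, seed=1):
    """Sample 2N gametes for each line and return each line's change in q."""
    rng = random.Random(seed)
    n_gametes = 2 * n_individuals
    changes = []
    for _ in range(n_lines):
        count_a = sum(1 for _ in range(n_gametes) if rng.random() < q0)
        changes.append(count_a / n_gametes - q0)
    return changes


expected = drift_variance(0.5, 20)
observed = statistics.pvariance(simulate_lines(0.5, 20, 2000))
print(f"expected {expected:.5f}, simulated {observed:.5f}")
```

The simulated value only approximates the expectation; the agreement improves as the number of lines increases.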
  • 139. Sampling in Idealised Population • sub-populations have different characteristics – random drift – some genes will be lost, while others fixed in the population
  • 140. Balanced Incomplete Block Design (BIBD) The sums of squares for total, replication, treatment and error are computed as in any other design. The sum of squares due to blocks is a new statistic to be computed in lattice designs.
    1. Correction factor: $C.F. = \dfrac{(GT)^2}{rq^2}$
    2. Total SS $= \sum\sum X_{ij(l)}^2 - C.F.$
    3. SSR $= \dfrac{\sum R_j^2}{q^2} - C.F.$
    4. SSB $= \dfrac{\sum C_{ij}^2}{qr(r - 1)} - \dfrac{\sum C_i^2}{q^2r(r - 1)}$
    5. SSt $= \dfrac{\sum T_i^2}{r} - C.F.$
    6. SSE = Total SS – SSR – SSB – SSt
  • 141. Balanced Incomplete Block Design (BIBD) Practical Example 1. A breeder would like to evaluate 16 highly advanced hybrids in a balanced lattice design, as the experimental area varies in soil acidity with an unknown direction of the gradient. He conducted the experiment and obtained the following measurements. The statistical objective of this example is to become familiar with the analysis of variance for a balanced lattice design.
  • 142. Balanced Incomplete Block Design (BIBD)
  • 143. Balanced Incomplete Block Design (BIBD) The stepwise analysis is as follows: 1. Compute Bj
    a. B1 = Bi1 + Bi5 + Bi9 + Bi13 + Bi17 = 62.4 + 63.2 + 65.1 + 81.1 + 58.1 = 329.9
    b. B2 = Bi1 + Bi6 + Bi10 + Bi14 + Bi18 = 62.4 + 58.9 + 65.7 + 74.2 + 75.0 = 336.2
    c. B3 = Bi1 + Bi7 + Bi11 + Bi15 + Bi19 = 62.4 + 72.5 + 63.7 + 69.7 + 77.9 = 346.2
    d. B4 = Bi1 + Bi8 + Bi12 + Bi16 + Bi20 = 62.4 + 76.8 + 69.0 + 69.5 + 83.5 = 361.2
    e. B5 = Bi2 + Bi5 + Bi10 + Bi15 + Bi20 = 61.6 + 63.2 + 65.7 + 69.7 + 83.5 = 343.7
    f. B6 = Bi2 + Bi6 + Bi9 + Bi16 + Bi19 = 61.6 + 58.9 + 65.1 + 69.5 + 77.9 = 333.0
    g. B7 = Bi2 + Bi7 + Bi12 + Bi13 + Bi18 = 61.6 + 72.5 + 69.0 + 81.1 + 75.0 = 359.2
    h. B8 = Bi2 + Bi8 + Bi11 + Bi14 + Bi17 = 61.6 + 76.8 + 63.7 + 74.2 + 58.1 = 334.4
    i. B9 = Bi3 + Bi5 + Bi11 + Bi16 + Bi18 = 60.9 + 63.2 + 63.7 + 69.5 + 75.0 = 332.3
  • 144. Balanced Incomplete Block Design (BIBD)
    a. B10 = Bi3 + Bi6 + Bi12 + Bi15 + Bi17 = 60.9 + 58.9 + 69.0 + 69.7 + 58.1 = 316.6
    b. B11 = Bi3 + Bi7 + Bi9 + Bi14 + Bi20 = 60.9 + 72.5 + 65.1 + 74.2 + 83.5 = 356.2
    c. B12 = Bi3 + Bi8 + Bi10 + Bi13 + Bi19 = 60.9 + 76.8 + 65.7 + 81.1 + 77.9 = 362.4
    d. B13 = Bi4 + Bi5 + Bi12 + Bi14 + Bi19 = 73.7 + 63.2 + 69.0 + 74.2 + 77.9 = 358.0
    e. B14 = Bi4 + Bi6 + Bi11 + Bi13 + Bi20 = 73.7 + 58.9 + 63.7 + 81.1 + 83.5 = 360.9
    f. B15 = Bi4 + Bi7 + Bi10 + Bi16 + Bi17 = 73.7 + 72.5 + 65.7 + 69.5 + 58.1 = 339.5
    g. B16 = Bi4 + Bi8 + Bi9 + Bi15 + Bi18 = 73.7 + 76.8 + 65.1 + 69.7 + 75.0 = 360.3
  • 145. Balanced Incomplete Block Design (BIBD) 2. Compute Wj
    a. W1 = qT1 – (q+1)B1 + GT = (4 x 80.7) – (4+1) x 329.9 + 1382.5 = 55.8
    b. W2 = qT2 – (q+1)B2 + GT = (4 x 80.4) – (4+1) x 336.2 + 1382.5 = 23.1
    c. W3 = qT3 – (q+1)B3 + GT = (4 x 91.3) – (4+1) x 346.2 + 1382.5 = 16.7
    d. W4 = qT4 – (q+1)B4 + GT = (4 x 91.5) – (4+1) x 361.2 + 1382.5 = -57.5
    e. W5 = qT5 – (q+1)B5 + GT = (4 x 81.3) – (4+1) x 343.7 + 1382.5 = -10.8
    f. W6 = qT6 – (q+1)B6 + GT = (4 x 85.3) – (4+1) x 333.0 + 1382.5 = 58.7
    g. W7 = qT7 – (q+1)B7 + GT = (4 x 87.7) – (4+1) x 359.2 + 1382.5 = -62.7
    h. W8 = qT8 – (q+1)B8 + GT = (4 x 87.7) – (4+1) x 334.4 + 1382.5 = 61.3
    i. W9 = qT9 – (q+1)B9 + GT = (4 x 80.2) – (4+1) x 332.3 + 1382.5 = 41.8
  • 146. Balanced Incomplete Block Design (BIBD)
    a. W10 = qT10 – (q+1)B10 + GT = (4 x 59.4) – (4+1) x 316.6 + 1382.5 = 37.1
    b. W11 = qT11 – (q+1)B11 + GT = (4 x 95.8) – (4+1) x 356.2 + 1382.5 = -15.3
    c. W12 = qT12 – (q+1)B12 + GT = (4 x 95.1) – (4+1) x 362.4 + 1382.5 = -49.1
    d. W13 = qT13 – (q+1)B13 + GT = (4 x 87.5) – (4+1) x 358.0 + 1382.5 = -57.5
    e. W14 = qT14 – (q+1)B14 + GT = (4 x 99.9) – (4+1) x 360.9 + 1382.5 = -22.4
    f. W15 = qT15 – (q+1)B15 + GT = (4 x 90.8) – (4+1) x 339.5 + 1382.5 = 48.2
    g. W16 = qT16 – (q+1)B16 + GT = (4 x 87.9) – (4+1) x 360.3 + 1382.5 = -67.4
  • 147. Balanced Incomplete Block Design (BIBD) 3. Compute the sums of squares for the different components. The best way is to start with the adjusted block sum of squares, because the block mean square, compared with the error mean square, decides whether the analysis continues as a lattice or as an RCBD.
    a. SS block (adjusted) $= \dfrac{\sum W_j^2}{q^3(q + 1)} = \dfrac{(55.8)^2 + (23.1)^2 + \ldots + (-67.4)^2}{4^3(4 + 1)} = 109.14$
    b. MS block $= \dfrac{SS\ block\ (adj)}{q^2 - 1} = \dfrac{109.14}{4^2 - 1} = 7.28 = E_b$
    c. Correction factor: $C.F. = \dfrac{(GT)^2}{rq^2} = \dfrac{(1382.5)^2}{5 \times 4^2} = 23891.33$
    d. Total SS $= \sum X_{ijk}^2 - CF = [(14.9)^2 + (15.2)^2 + \ldots + (22.5)^2] - 23891.33 = 566.4$
  • 148. Balanced Incomplete Block Design (BIBD)
    a. SS treatment (unadjusted): SSt (unadj.) $= \dfrac{\sum T_i^2}{r} - C.F. = \dfrac{(80.7)^2 + (80.4)^2 + \ldots + (87.9)^2}{5} - 23891.33 = 257.13$
    b. SS replication: SSR $= \dfrac{\sum R_j^2}{q^2} - C.F. = \dfrac{(258.6)^2 + (271.4)^2 + \ldots + (294.5)^2}{4^2} - 23891.33 = 72.71$
    c. SS due to error: SSE = Total SS – SSt(unadj) – SS block(adj) – SSR = 566.4 – 257.13 – 109.14 – 72.71 = 127.42
    d. Degrees of freedom for error $= (q - 1)(q^2 - 1) = (4 - 1)(4^2 - 1) = 45$
    e. MSE = SSE / d.f. for error = 127.42 / 45 = 2.83 = $E_e$
    Once the two statistics $E_b$ and $E_e$ are obtained, it is possible to check whether μ is positive or not. If it is positive, the analysis continues as a lattice; if not, as an RCBD.
  • 149. Balanced Incomplete Block Design (BIBD)
    a. $\mu = \dfrac{E_b - E_e}{q^2E_b} = \dfrac{7.28 - 2.83}{4^2 \times 7.28} = 0.04$. Since μ is positive, we proceed to adjust the treatment means as in Table 4.72. Let $T'_j = T_j + \mu W_j$, where $T_j$ is the unadjusted treatment total.
    Table 4.72. Computing adjusted treatment means
    Treatment   Tj      Bj       Wj       T'j = Tj + μWj   Adjusted mean (T'j / r)
    T1          80.7    329.9    55.8     82.93            16.59
    T2          80.4    336.2    23.1     81.32            16.26
    T3          91.3    346.2    16.7     91.97            18.39
    T4          91.5    361.2    -57.5    89.20            17.84
    T5          81.3    343.7    -10.8    80.87            16.17
    T6          85.3    333.0    58.7     87.65            17.53
    T7          87.7    359.2    -62.7    85.19            17.04
    T8          87.7    334.4    61.3     90.15            18.03
    T9          80.2    332.3    41.8     81.87            16.37
    T10         59.4    316.6    37.1     60.88            12.18
    T11         95.8    356.2    -15.3    95.19            19.04
    T12         95.1    362.4    -49.1    93.14            18.63
    T13         87.5    358.0    -57.5    85.20            17.04
    T14         99.9    360.9    -22.4    99.00            19.80
    T15         90.8    339.5    48.2     92.73            18.55
    T16         87.9    360.3    -67.4    85.20            17.04
  • 150. Balanced Incomplete Block Design (BIBD)
    1. SSt (adjusted) $= \dfrac{\sum T_i'^2}{q + 1} - CF = \dfrac{(82.93)^2 + (81.32)^2 + \ldots + (85.20)^2}{4 + 1} - 23891.33 = 223.77$
    2. MSt $= \dfrac{SSt(adj)}{q^2 - 1} = \dfrac{223.77}{4^2 - 1} = 14.92$
    3. Effective error MS $= E_e(1 + q\mu) = 2.83(1 + 4 \times 0.04) = 3.2828$
    4. F-calculated = MSt / Effective error MS = 14.92 / 3.2828 = 4.54
    5. Finally, the ANOVA table can be constructed as in Table 4.73
    Table 4.73. ANOVA for balanced lattice design
    Source of variation    d.f.   SS       MS       F-cal    F-tabulated (0.05)   F-tabulated (0.01)
    Replication            4      72.71    18.18
    Block (adj)            15     109.14   7.28     2.57*    1.90
    Treatment (adj)        15     223.77   14.92    4.54*    1.90
    Intra-block error      45     127.42   2.83
    Effective error        45              3.2828
    *, significant at the 0.05 level of probability
  • 151. Balanced Incomplete Block Design (BIBD)
    $CV = \dfrac{\sqrt{\text{Effective error MS}}}{GM} \times 100 = \dfrac{\sqrt{3.28}}{17.28} \times 100 = 10.48\%$
    1. Compute standard errors
    $SE(m) = \sqrt{\dfrac{E_e(1 + q\mu)}{r}} = \sqrt{\dfrac{2.83(1 + 4 \times 0.04)}{5}} = 0.81$
    $SE(d) = \sqrt{\dfrac{2E_e(1 + q\mu)}{r}} = \sqrt{\dfrac{2 \times 2.83(1 + 4 \times 0.04)}{5}} = 1.15$
    CD/LSD = SE(d) x t0.05 at error degrees of freedom = 1.15 x 2.019 = 2.32
    2. Compute the relative efficiency of the lattice over RCBD
    $MSE_{RCBD} = \dfrac{SS\ block(adj) + SSE}{d.f._{block} + d.f._{error}} = \dfrac{109.14 + 127.42}{15 + 45} = 3.94$
    Effective error MS = 3.2828
    RE = MSE(RCBD) / Effective error MS = 3.94 / 3.2828 = 1.20
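The adjustment of treatment totals can be retraced in code from the Tj and Bj columns of Table 4.72. This is a sketch, not the course's own script: the intra-block error mean square (2.83) is taken as given from the earlier step, and μ is rounded to two decimals as in the worked example, so that the adjusted values match the table.

```python
Q, R = 4, 5  # block size q and number of replications r

# Unadjusted treatment totals Tj and block totals Bj, from Table 4.72
T = [80.7, 80.4, 91.3, 91.5, 81.3, 85.3, 87.7, 87.7,
     80.2, 59.4, 95.8, 95.1, 87.5, 99.9, 90.8, 87.9]
B = [329.9, 336.2, 346.2, 361.2, 343.7, 333.0, 359.2, 334.4,
     332.3, 316.6, 356.2, 362.4, 358.0, 360.9, 339.5, 360.3]
E_E = 2.83  # intra-block error mean square, taken from the worked example

grand_total = sum(T)
# Wj = q*Tj - (q + 1)*Bj + GT
W = [Q * t - (Q + 1) * b + grand_total for t, b in zip(T, B)]
ss_block_adj = sum(w * w for w in W) / (Q ** 3 * (Q + 1))
e_b = ss_block_adj / (Q ** 2 - 1)
mu = (e_b - E_E) / (Q ** 2 * e_b)
mu_used = round(mu, 2)  # the worked example rounds mu before adjusting

adjusted_totals = [t + mu_used * w for t, w in zip(T, W)]
adjusted_means = [total / R for total in adjusted_totals]

print(f"GT = {grand_total:.1f}, SS block (adj) = {ss_block_adj:.2f}, Eb = {e_b:.2f}")
print(f"mu = {mu:.4f} (used as {mu_used})")
for i, (total, mean) in enumerate(zip(adjusted_totals, adjusted_means), start=1):
    print(f"T{i}: adjusted total = {total:.2f}, adjusted mean = {mean:.2f}")
```

Using the unrounded μ instead of 0.04 shifts the adjusted totals slightly, so small differences from the table are expected if `mu` is used in place of `mu_used`.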
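The summary statistics of the lattice analysis can be recomputed from the quantities already obtained. This is a sketch under stated assumptions: the CV is taken as the square root of the effective error mean square divided by the grand mean, the tabulated t value (2.019) is the one quoted in the text, and all inputs are the rounded values from the worked example.

```python
import math

Q, R = 4, 5                      # block size and number of replications
E_E, MU = 2.83, 0.04             # intra-block error MS and adjustment factor
GRAND_MEAN = 1382.5 / (R * Q ** 2)
SS_BLOCK_ADJ, SS_ERROR = 109.14, 127.42
DF_BLOCK, DF_ERROR = 15, 45
T_005 = 2.019                    # tabulated t used in the text

effective_ms = E_E * (1 + Q * MU)
cv = 100 * math.sqrt(effective_ms) / GRAND_MEAN
se_mean = math.sqrt(effective_ms / R)
se_diff = math.sqrt(2 * effective_ms / R)
lsd = se_diff * T_005
mse_rcbd = (SS_BLOCK_ADJ + SS_ERROR) / (DF_BLOCK + DF_ERROR)
relative_efficiency = mse_rcbd / effective_ms

print(f"effective error MS = {effective_ms:.4f}")
print(f"CV = {cv:.2f}%")
print(f"SE(m) = {se_mean:.2f}, SE(d) = {se_diff:.2f}, LSD = {lsd:.2f}")
print(f"MSE(RCBD) = {mse_rcbd:.2f}, RE = {relative_efficiency:.2f}")
```

The LSD computed from the unrounded SE(d) differs in the second decimal from the hand calculation, which multiplies the rounded value 1.15 by t.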
  • 152. Interpretation  Treatment effect was found to be significant  T14 had the highest grain yield (19.80 t/ha) followed by T11 and T12 (but statistically these three treatments did not show differences among themselves)  There was also a significant block effect implying that blocking helped in reducing experimental error  The relative efficiency of 1.20 indicates that the use of lattice design instead of RCBD improved precision by 20%
  • 153. Partially Balanced Design • Characteristics of partially balanced design: • Not all pairs of treatments occur together in the same block • The number of replications is not restricted • The number of treatments must be a perfect square
  • 154. Differences in genotype frequencies • Therefore, compared with the initial frequencies, the genotype frequencies become:
    AA: $\bar{p}^2 + \sigma_q^2$
    Aa: $2\bar{p}\bar{q} - 2\sigma_q^2$
    aa: $\bar{q}^2 + \sigma_q^2$
    (Wahlund's formula)
  • 155. Inbreeding • Inbreeding: the mating together of individuals more closely related than mates chosen at random from a population (mating of relatives) • Inbreeding coefficient: the probability of any diploid individual being an identical homozygote • i.e. the probability that the 2 genes at a locus in a random member of a population are identical by descent
  • 156. Inbreeding [Figure: pedigree diagram with individuals A to G]
  • 157. Inbreeding [Figure: two copies of allele a, identical by descent, combining in genotype aa] • The more genes and genotypes in a population that are identical by descent, the higher the incidence of inbreeding. The incidence of inbreeding is measured by the coefficient of inbreeding (F).
  • 158. Inbreeding
    $F = \tfrac{1}{2}\sum_{i=1}^{m}\left(\tfrac{1}{2}\right)^{n}(1 + F_A)$
    F = inbreeding coefficient of an individual or a population of individuals
    n = number of generations separating the male and female parent through the common ancestor
    m = number of pathways to the individual through common ancestors
    $F_A$ = coefficient of inbreeding of the common ancestor of the seed and pollen parents
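The path formula can be applied mechanically once each pathway is described by its n and the inbreeding coefficient of its common ancestor. This is a sketch; the two pedigrees used (half-sib and full-sib matings with non-inbred ancestors) are standard illustrations and are not taken from the slides.

```python
def inbreeding_coefficient(paths):
    """F = 1/2 * sum over paths of (1/2)^n * (1 + F_A).

    paths: iterable of (n, f_a) pairs, where n is the number of generations
    separating the two parents through a common ancestor and f_a is that
    ancestor's own inbreeding coefficient.
    """
    return 0.5 * sum(0.5 ** n * (1.0 + f_a) for n, f_a in paths)


# Offspring of half sibs: one common ancestor, one step from each parent (n = 2)
half_sib_mating = [(2, 0.0)]
# Offspring of full sibs: two common ancestors, each with n = 2
full_sib_mating = [(2, 0.0), (2, 0.0)]

print(f"half-sib mating: F = {inbreeding_coefficient(half_sib_mating):.3f}")
print(f"full-sib mating: F = {inbreeding_coefficient(full_sib_mating):.3f}")
```

An inbred common ancestor raises the contribution of its pathway through the factor (1 + F_A).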
  • 159. In any breeding program, population mean refers to both phenotypic and genotypic values, because we consider the environmental deviation = 0 in the population. To assume this, we have to have a good control of the environment, i.e. by growing or raising all individuals in the population in the area with no difference in the environmental influence.
  • 160. Alpha Lattice Design • Alpha-lattice designs are replicated designs that divide each replicate into incomplete blocks, each containing a fraction of the total number of entries • They bridge the gap between RCBD and lattice designs • The number of treatments need not be a perfect square • Genotypes are distributed among the replications so that all pairs occur in the same incomplete block with nearly equal frequency
  • 161. Alpha lattice design • Suppose a researcher is interested in evaluating 20 genotypes in alpha (0,1)-lattice design. Then, there will be t = 20, q = 4 and number of treatments in each block = 5
  • 162. Alpha Lattice Design • Used when we have a large number of genotypes and a small area • No check varieties are needed to estimate error • It reduces the effect of variation within complete blocks • It can also provide repeatability, particularly in trials • It maximizes the use of comparisons between genotypes in the same incomplete block
  • 163. Advantages of alpha lattice design • It allows the adjustment of treatment means for block effects • The small incomplete blocks create homogeneous comparisons • It provides effective control of within-replicate variability
  • 164. Augmented Design  Augmented designs also use grids or incomplete blocks to remove some field variation from the plot residuals  In an augmented design, a large set of experimental lines is divided into small incomplete blocks  In each incomplete block, a set of checks is included; every check occurs in each incomplete block
  • 165. Augmented Design  Because the design is unreplicated, the repeated checks are used to estimate the error mean square and the block effect  The block effect is estimated from the repeated check means and then removed from the means of the test varieties  This reduces error and increases precision somewhat
  • 167. Augmented Designs • Developed by Federer (1956) • Used to test a large number of lines in a limited area • Used when other designs are not appropriate due to the large number of entries • The goal is to compare existing (control) treatments with new treatments under the experimental constraint of limited replication and resources
  • 168. Augmented Design • Experimental lines are replicated once • Checks occur in each block • Checks are used to estimate block effects • Checks provide the error term • Difficult to maintain homogeneous blocks when comparing • Flexible: blocks can be of unequal size
  • 169. Disadvantage of Augmented Design • Considerable resources are spent on production and processing of control plots • Relatively few degrees of freedom for experimental error, which reduces the power to detect differences among treatments • Un-replicated experiments are inherently imprecise, no matter how sophisticated the design
  • 170. Analysis of Quantitative Traits Consider one locus with two alleles, A1/A2:
    Genotype    A2A2    (midpoint)    A1A2    A1A1
    Value       $-a$    0             $d$     $+a$
    Allele A1 increases the mean value. d depends on the degree of dominance: d = 0, no dominance; d positive, A1 dominant over A2; complete dominance, d = +a or -a; overdominance, d > +a or d < -a. Degree of dominance = d/a
  • 171. Population Mean The population mean is obtained from the above genotypic values, after the effects of all loci controlling the trait are combined, i.e. every genotypic value is multiplied by its frequency, and the products are summed over the three genotypes.
    Genotype    Frequency    Value    Frequency x Value
    A1A1        $p^2$        $+a$     $p^2a$
    A1A2        $2pq$        $d$      $2pqd$
    A2A2        $q^2$        $-a$     $-q^2a$
    Population mean $= a(p - q) + 2dpq$
  • 172. Population Mean: $M = a(p - q) + 2dpq$, where $a(p - q)$ is produced by the homozygotes and $2dpq$ by the heterozygotes
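The population mean can be computed either from the closed form or by summing frequency times value over the three genotypes; the sketch below does both. The values a = 10, d = 5 and p = 0.6 are hypothetical.

```python
def population_mean(a, d, p):
    """Closed form: M = a(p - q) + 2dpq."""
    q = 1.0 - p
    return a * (p - q) + 2.0 * d * p * q


def population_mean_from_table(a, d, p):
    """Sum of frequency x value over A1A1, A1A2 and A2A2."""
    q = 1.0 - p
    genotypes = [(p * p, a), (2.0 * p * q, d), (q * q, -a)]
    return sum(freq * value for freq, value in genotypes)


a, d, p = 10.0, 5.0, 0.6  # hypothetical genotypic values and gene frequency
print(f"closed form: M = {population_mean(a, d, p):.2f}")
print(f"from table:  M = {population_mean_from_table(a, d, p):.2f}")
```

Here the homozygotes contribute a(p - q) = 2.0 and the heterozygotes 2dpq = 2.4 to the mean.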
  • 173. Population Mean • When many loci contribute to the trait, the population mean is the combined contribution of all loci. • Assuming the loci combine additively, the single-locus contributions are summed over loci: M = Σa(p - q) + 2Σdpq.
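The single-locus mean can be checked numerically. The sketch below is only an illustration: the function names and the values p = 0.6, a = 4, d = 2 are assumptions, not taken from the slides. It computes M both from the closed form and from the frequency x value table.

```python
def population_mean(p, a, d):
    """Closed form M = a(p - q) + 2dpq for one locus with two alleles."""
    q = 1.0 - p
    return a * (p - q) + 2.0 * d * p * q


def population_mean_from_table(p, a, d):
    """Same mean, obtained by summing frequency x value over the three genotypes."""
    q = 1.0 - p
    freq = {"A1A1": p * p, "A1A2": 2.0 * p * q, "A2A2": q * q}
    value = {"A1A1": a, "A1A2": d, "A2A2": -a}
    return sum(freq[g] * value[g] for g in freq)


# Illustrative (assumed) allele frequency and genotypic values.
p, a, d = 0.6, 4.0, 2.0
M_closed = population_mean(p, a, d)
M_table = population_mean_from_table(p, a, d)
print(M_closed, M_table)
```

Both routes should agree for any p between 0 and 1, which is a quick way to confirm that a(p² - q²) reduces to a(p - q).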
  • 174. Average Effect • Average effect of a gene is the average deviation from the population mean, of individuals receiving one gene from one parent, while the other is received at random from the population. • Average effect of a gene substitution - effect on the population mean, when a gene is substituted with another with a different form. ie. A1 → A2 A1A1 → A1A2 A2 → A1 A1A2 → A1A1 1 locus, 2 alleles, Frequency of A1 = p Frequency of A2 = q
  • 175. Average Effect • Let the average effect of gene A1 be α1. • If gametes carrying A1 combine at random with gametes from the population, the resulting genotype frequencies are A1A1 = p and A1A2 = q. Genotypic value of A1A1 = a; genotypic value of A1A2 = d. Mean of these genotypes = pa + qd.
  • 176. Average Effect • The difference between this mean and the population mean is the average effect of gene A1: α1 = pa + qd - [a(p - q) + 2dpq] = q[a + d(q - p)]. For gene A2: α2 = -p[a + d(q - p)]. Gene substitution: A2 genes taken at random are found in A1A2 with frequency p and in A2A2 with frequency q. Changing A1A2 → A1A1 changes the value from d to +a, so the effect is (a - d).
  • 177. Average Effect • Changing A2A2 → A1A2 changes the value from -a to d, so the effect is (d + a). • Average effect of gene substitution: α = p(a - d) + q(d + a) = pa - pd + qd + qa = a(p + q) + d(q - p) = a + d(q - p). • Relation with α1 and α2: α = α1 - α2, with α1 = qα and α2 = -pα.
  • 178. Breeding Value • The value of an individual as judged by the mean value of the progenies is called the breeding value. • Breeding value can be measured by the value of deviation from the population mean.
  • 179. Breeding Value • If a certain individual (A) is mated with a group of individuals taken at random from a population, • Breeding value = 2 x the average deviation of the progenies from the population mean. • In the context of average effects, the breeding value of an individual = the sum of the average effects of the genes it carries, summed over the pair of genes (alleles) at each locus and over all loci involved. [Figure: individual A crossed to a random sample of the population.]
  • 180. Breeding Value • Considering one locus, the breeding values (B.V.) of the genotypes, judged from the values of their progenies, are: B.V. A1A1 = 2α1 = 2qα; B.V. A1A2 = α1 + α2 = (q - p)α; B.V. A2A2 = 2α2 = -2pα.
  • 181. Breeding Value [Figure: arbitrary (genotypic) values -a, d, +a and breeding values -2pα, (q - p)α, 2qα plotted against the number of A1 genes (0, 1, 2) for genotypes A2A2, A1A2, A1A1, which have frequencies q², 2pq, p².]
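The average effects and breeding values can be sketched in a few lines. The function names and the values p = 0.6, a = 4, d = 2 are assumptions chosen for illustration only. The sketch uses α = a + d(q - p), α1 = qα, α2 = -pα, and the breeding values 2α1, α1 + α2, 2α2.

```python
def average_effects(p, a, d):
    """Average effect of gene substitution (alpha) and of each allele."""
    q = 1.0 - p
    alpha = a + d * (q - p)
    return {"alpha": alpha, "alpha1": q * alpha, "alpha2": -p * alpha}


def breeding_values(p, a, d):
    """Breeding value of each genotype at one locus."""
    eff = average_effects(p, a, d)
    return {
        "A1A1": 2.0 * eff["alpha1"],
        "A1A2": eff["alpha1"] + eff["alpha2"],
        "A2A2": 2.0 * eff["alpha2"],
    }


# Illustrative (assumed) values.
p, a, d = 0.6, 4.0, 2.0
q = 1.0 - p
eff = average_effects(p, a, d)
bv = breeding_values(p, a, d)

# Population mean, used to check alpha1 = pa + qd - M.
M = a * (p - q) + 2.0 * d * p * q
# Breeding values are deviations, so their frequency-weighted mean is zero.
mean_bv = p * p * bv["A1A1"] + 2.0 * p * q * bv["A1A2"] + q * q * bv["A2A2"]
print(eff, bv, mean_bv)
```

The final check, that the breeding values average to zero over the population, follows from their definition as deviations from the population mean.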
  • 182. Breeding Value • So far we have discussed only one component of the genotypic value, the additive effects, i.e. the breeding value. In general G = A + D + I, where G = genotypic value, A = breeding value, D = dominance deviation, I = interaction deviation. • For one locus only: G = A + D.
  • 183. Dominance Deviation  Dominance deviation is a function of d: if d = 0, D = 0 and all genes are additive in nature. Interaction Deviation G = A + D + I • When more than one locus is involved and I ≠ 0, there is interaction between loci contributing to the genotypic value. It is called epistasis. I is also called the epistatic deviation. • If I = 0, the genes are said to act additively among loci. • If 1 locus is involved, additive action means the absence of dominance. • If more than 1 locus is involved, additive action means the absence of epistasis.
  • 184. 6. MATING DESIGNS AND ESTIMATION OF GENETIC PARAMETERS • Heterosis = hybrid vigor – the superiority of F1 over its parents – Positive traits – Negative traits
  • 185. Heterosis cont. • Superiority of the cross over its parents – A/a locus, AA x aa = Aa, position – Dominance √ – Epistasis √ – Additive X
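  • 187. Heterosis cont. • Mid-parent (average parent) heterosis, MPH = 100(F1-MP)/MP, where MP is the mean of the two parents • Better-parent heterosis, BPH = 100(F1-BP)/BP, where BP is the value of the better parent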
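A minimal sketch of the two heterosis percentages follows. The function name, the `higher_is_better` flag and the example values are assumptions made for illustration; the flag reflects the idea that the "better" parent is the higher one for positive traits (e.g. yield) and the lower one for negative traits (e.g. days to maturity).

```python
def heterosis(f1, p1, p2, higher_is_better=True):
    """Return (MPH, BPH) in percent for one cross.

    higher_is_better is an assumed convention: True for traits where a
    larger value is desirable, False where a smaller value is desirable.
    """
    mid_parent = (p1 + p2) / 2.0
    better_parent = max(p1, p2) if higher_is_better else min(p1, p2)
    mph = 100.0 * (f1 - mid_parent) / mid_parent
    bph = 100.0 * (f1 - better_parent) / better_parent
    return mph, bph


# Hypothetical yield data (t/ha): F1 = 6.0, parents = 4.0 and 5.0.
mph, bph = heterosis(6.0, 4.0, 5.0)
print(round(mph, 2), round(bph, 2))
```

For a negative trait the sign of a desirable heterosis estimate is reversed, so the flag only changes which parent is used as the reference, not how the result is interpreted.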
  • 188. Heterosis cont. • What is better parent? • For which traits • Positive traits: • Negative traits: • Measurement and score:
  • 189. Heterosis cont. • What is the basis for heterosis? – Theory of dominance – Theory of over-dominance – Theory of physiological enhancement
  • 190. Heterosis cont. • Heterotic patterns • Theory of testers – Broad-base – Narrow-base
  • 191. Combining Ability • General combining ability (GCA): average performance of a line in hybrid combinations; additive genetic variance, σ²A • Specific combining ability (SCA): performance of specific combinations; non-additive genetic variance, σ²I
  • 192. DIALLEL MATING DESIGN Introduction Diallel cross - mating design where all possible crosses are made on an individual or population (inbred, variety) to obtain all possible combinations. - Complete diallel - Partial diallel (half-diallel)
  • 193. DIALLEL MATING DESIGN Example: n inbred lines give n x n = n² combinations. Components, for n = 7: n parents = 7; ½ n (n-1) crosses = 21; ½ n (n-1) reciprocals = 21; total = 49.
  • 194. DIALLEL ANALYSIS • Diallel is difficult to construct but useful to obtain genetic information from populations. • Started by Sprague and Tatum (1942) • Uses of diallel (Hayward, 1979): Application of the diallel cross to outbreeding crop species.
  • 195. USES OF DIALLEL ANALYSIS 1. Strategic survey on populations as initial breeding materials in breeding programmes. – observe variance components – observe genetic variability – estimate heritability - Use top–cross
  • 196. USES OF DIALLEL ANALYSIS 2. Tactical assessment on genetic relationships among selected elite genotypes. – Selection can be done to select parents that have good combining ability. – Inbred lines have to be first developed and then tested.
  • 197. USES OF DIALLEL ANALYSIS Examples: - Evaluate SCA, GCA on hybrid combinations among inbred lines. Methods of estimating GCA and SCA: 1. Diallel Analysis 2. Mating Designs I, II, III 3. Test cross performance - top cross, inbreds, hybrids, full/half-sibs. 4. Self-progeny performance.
  • 198. Methods of Diallel Analysis Many methods have been proposed including, a) Hayman's (1954) b) Griffing's (1956) - 4 Methods c) Gardner and Eberhart's (1966) - Analysis II, Analysis III. - Focus on Griffing's Methods (Ref: Issues in Diallel Analysis by Baker, 1978)
  • 199. Assumptions used in diallel analysis a. Diploid segregation of the individuals involved. b. Homozygous parents. c. No reciprocal differences. d. No epistasis. e. No multiple alleles. f. Uncorrelated gene distribution in the two parents
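  • 200. a) Example of strategic survey on populations: To determine the features of a trait in terms of its genetic components, the regression of Wr on Vr is used. Let us say we test five parents, A, B, C, D and E, which were entered into a diallel to form progenies. [Figure: Wr (vertical axis) plotted against Vr (horizontal axis), with points for parents A to E lying inside the limiting parabola Wr² = Vr x Vp. A regression line cutting the Wr axis above the origin indicates partial dominance, a line through the origin indicates complete dominance, and a line cutting below the origin indicates over dominance.] Wr = covariance between the progenies in the array of parent r and their non-recurrent parents. Vr = variance of the progenies in the array of parent r.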
  • 201. Diallel Analysis • From the Wr - Vr graph above: - The line that passes through the origin (0) shows complete dominance as the main feature of the control of the trait concerned.
  • 202. Diallel Analysis cont. a. Above origin - partial dominance. b. Below origin - over dominance. c. The larger the Vr value, the higher is the interloci interaction.
  • 203. Diallel Analysis cont. d. E.g. A and E are more different from each other genetically because their points are far apart on the graph, as compared to e.g. D and E. e. All points within the parabola - parabola limits the values of the coordinates.
  • 204. Diallel Analysis cont. f. If points close to the origin - more dominant genes. g. If far from origin - more recessive genes.
  • 205. Example, 7 x 7 half diallel:
             1        2        3        4        5        6        7
      1    37.250   38.500   38.375   39.500   37.375   38.125   38.375
      2             30.500   32.125   32.750   34.875   38.750   32.625
      3                      31.000   32.625   34.875   39.000   35.125
      4                               32.250   36.375   37.500   35.375
      5                                        35.250   38.875   35.625
      6                                                 38.500   38.625
      7                                                          34.250
  • 206. Diallel Analysis Correction factor (C.F.) = (Σ parental values)²/n = (37.250 + 30.500 + ...... + 34.250)²/7 = 8160.143. Variance of the parents, Vp = [1/(n-1)] x {Σ(parental value)² - C.F.} = 1/6 x {[37.250² + 30.500² + ..... + 34.250²] - 8160.143} = 9.4345. The parental values are the diagonal entries of the diallel table.
  • 207. General ANOVA computations (example with 64 treatments x 4 replications):
      Correction factor = (Grand total)²/Total number of observations = (23231.82)²/(4 x 64) = 2108271.3301
      Total S.S. = (104.86)² + (88.66)² + .......... + (81.48)² - C.F. = 127712.5000
      Treatments S.S. = [(342.58)² + (348.05)² + .......... + (328.00)²]/4 - C.F. = 104924.1604
      Replication S.S. = [(5811.48)² + .......... + (5951.34)²]/64 - C.F. = 1037.0241
      Error S.S. = Total S.S. - Treatment S.S. - Replication S.S. = 21751.3155
  • 208. Diallel Analysis Variance of array i: Vi = [1/(n-1)] x {Σ(value of each cross with i)² - (Σ value of each cross with i)²/n}
      V1 = 1/6 x {(37.250² + 38.500² + .... + 38.375²) - (37.250 + 38.500 + .... + 38.375)²/7} = 0.57143
      V2 = 1/6 x {(38.500² + 30.500² + .... + 32.625²) - (38.500 + 30.500 + .... + 32.625)²/7} = 10.35863
      V3 = 9.47098; V4 = 7.75446; V5 = 2.21801; V6 = 0.26786; V7 = 4.61830
  • 209. Diallel Analysis Covariance of array i with the parents: Wi = [1/(n-1)] x {Σ(value of cross with i x parental value of the other parent in that cross) - (total of all crosses with i)(total of all parental values)/n}
      W1 = 1/6 x {[(37.250 x 37.250) + (38.500 x 30.500) + ..... + (38.375 x 34.250)] - [(37.250 + 38.500 + .... + 38.375)(37.250 + 30.500 + ........ + 34.250)]/7} = -1.37946
      W2 = 9.41815; W3 = 9.22173; W4 = 7.88373; W5 = 3.30878; W6 = 0.22098; W7 = 5.74033
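The Vp, Vr and Wr calculations above can be reproduced from the 7 x 7 half-diallel table. This is a minimal pure-Python sketch; the variable and function names are assumptions. Each array r consists of the seven entries involving parent r (its own parental value plus its six crosses), and Wr is the covariance of that array with the seven parental values.

```python
# Upper triangle (including the diagonal) of the half-diallel table on the slide.
upper = [
    [37.250, 38.500, 38.375, 39.500, 37.375, 38.125, 38.375],
    [30.500, 32.125, 32.750, 34.875, 38.750, 32.625],
    [31.000, 32.625, 34.875, 39.000, 35.125],
    [32.250, 36.375, 37.500, 35.375],
    [35.250, 38.875, 35.625],
    [38.500, 38.625],
    [34.250],
]
n = len(upper)

# Fill a symmetric n x n table: entry (i, j) equals entry (j, i).
X = [[0.0] * n for _ in range(n)]
for i, row in enumerate(upper):
    for k, v in enumerate(row):
        X[i][i + k] = v
        X[i + k][i] = v

parents = [X[i][i] for i in range(n)]  # diagonal = parental values


def variance(x):
    """Sample variance with divisor n - 1."""
    m = sum(x) / len(x)
    return sum((xi - m) ** 2 for xi in x) / (len(x) - 1)


def covariance(x, y):
    """Sample covariance with divisor n - 1."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    return sum((a - mx) * (b - my) for a, b in zip(x, y)) / (len(x) - 1)


Vp = variance(parents)
Vr = [variance(X[r]) for r in range(n)]
Wr = [covariance(X[r], parents) for r in range(n)]

for r in range(n):
    print(r + 1, round(Vr[r], 5), round(Wr[r], 5))
```

Plotting the (Vr, Wr) pairs gives the Wr-Vr graph; the parabola Wr² = Vr x Vp can be drawn from the same Vp.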
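  • 210. Finally, the graph of Wr vs. Vr can be constructed by plotting the point (Vr, Wr) for each of parents 1 to 7. [Figure: Wr (vertical axis, 0 to 10) against Vr (horizontal axis, 0 to 10); parents 1 and 6 lie closest to the origin and parents 2, 3 and 4 lie farthest from it.]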
  • 211. Diallel Analysis • Deductions from the graph: – Points close to each other, parents are similar. – Points far from each other, parents are different. – Generally, this trait is controlled by genes with complete dominance. – Example, 1, 6 carry more dominant genes, 2, 3, 4 carry more recessive genes. – This analysis is called graphical analysis of a diallel cross. – Convenient with the use of computers.
  • 212. Diallel Analysis b. Example of the Use of Diallel Analysis for Tactical Assessment • To test the GCA and SCA for certain hybrid combinations. – GCA – to determine the average performance of lines/inbreds in hybrid combinations. – SCA – to compare performance of one cross with the other crosses. i.e. is it better or worse than the average performance of all crosses. • Example: A x B; A x C; A x D; A x E; A x F. Average for A crosses = ? Compare with A x F, for example to determine SCA(A x F) • Griffing’s Method (1956): for an n² diallel table with ‘v’ genotypes, ‘b’ blocks and ‘c’ individuals/plot
  • 213. Diallel Analysis • Observation on performance: Xijkl = μ + νij + bk + (bν)ijk + eijkl, where μ = overall population mean, νij = effect of genotype ij, bk = kth block effect, (bν)ijk = block x genotype interaction effect, eijkl = experimental error. • Then use analysis of variance to look at the significance of differences. • Genotypes are normally chosen for specific goals, i.e. hybrids, etc. F = MSv/MSe • If the effect of genotypes is significant, look at the components of the M.S. to determine GCA, SCA and other effects.
  • 214. Diallel Analysis • Values are given for each effect. The breakdown of the genotype effects is as follows: Xij = μ + gi + gj + sij + rij + (1/bc) Σ Σ eijkl, where g = GCA, i and j = parents, s = SCA, r = reciprocal effects, b = no. of blocks, c = no. of individuals, e = effects of environmental factors, μ = overall mean. • The analysis is subject to the following restrictions: – sij = sji, Σ gi = 0 – rij = -rji, Σ sij = 0
  • 215. Diallel Analysis ANOVA table
      Source       df         SS   MS    EMS, fixed model                EMS, random model
      GCA          n-1        Sg   Mg    σ² + 2n[1/(n-1)] Σ gi²          σ² + [2(n-1)/n] σ²s + 2n σ²g
      SCA          n(n-1)/2   Ss   Ms    σ² + [2/(n(n-1))] Σ Σ sij²      σ² + [2(n²-n+1)/n²] σ²s
      Reciprocal   n(n-1)/2   Sr   Mr    σ² + [4/(n(n-1))] Σ Σ rij²      σ² + 2 σ²r
      Error                   Se   Me’   σ²                              σ²
      Me’ = Me/r, where Me = MSe and r = number of replications (observations).
  • 216. Diallel Analysis • To get a more detailed breakdown of the combinations: gi = (1/2n)(Xi. + X.i) - X../n²; sij = ½(Xij + Xji) - (1/2n)(Xi. + X.i + Xj. + X.j) + X../n²; rij = ½(Xij - Xji) • References: – Biometrical Genetics (Mather and Jinks) - Diallel.
  • 217. Example of a complete 7 x 7 diallel cross, in a tactical assessment involving 4 replications in RCBD:
             1       2       3       4       5       6       7
      1    37.25   38.50   38.25   40.00   35.75   38.75   38.25
      2    39.00   30.50   32.75   32.00   35.50   39.25   32.73
      3    38.50   31.50   31.00   32.75   34.75   38.50   34.75
      4    39.00   33.50   32.50   32.25   36.25   35.75   35.25
      5    39.00   34.25   35.00   36.50   35.25   39.25   36.25
      6    37.50   38.25   39.50   39.25   38.50   38.50   39.50
      7    38.50   32.50   35.50   35.50   35.20   37.75   34.25
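The GCA, SCA and reciprocal effects for a complete diallel table can be sketched as follows, using the standard Griffing expressions gi = (Xi. + X.i)/2n - X../n², sij = ½(Xij + Xji) - (Xi. + X.i + Xj. + X.j)/2n + X../n² and rij = ½(Xij - Xji). The variable names are assumptions. The table is entered exactly as transcribed above, so the estimates may differ slightly from values quoted elsewhere in these slides.

```python
# Complete 7 x 7 diallel table of means, as transcribed from the slide.
X = [
    [37.25, 38.50, 38.25, 40.00, 35.75, 38.75, 38.25],
    [39.00, 30.50, 32.75, 32.00, 35.50, 39.25, 32.73],
    [38.50, 31.50, 31.00, 32.75, 34.75, 38.50, 34.75],
    [39.00, 33.50, 32.50, 32.25, 36.25, 35.75, 35.25],
    [39.00, 34.25, 35.00, 36.50, 35.25, 39.25, 36.25],
    [37.50, 38.25, 39.50, 39.25, 38.50, 38.50, 39.50],
    [38.50, 32.50, 35.50, 35.50, 35.20, 37.75, 34.25],
]
n = len(X)

row = [sum(X[i]) for i in range(n)]                        # Xi.
col = [sum(X[i][j] for i in range(n)) for j in range(n)]   # X.j
total = sum(row)                                           # X..

# GCA effect of each parent.
g = [(row[i] + col[i]) / (2 * n) - total / n**2 for i in range(n)]

# SCA effect of each combination (symmetric in i and j).
s = [[0.5 * (X[i][j] + X[j][i])
      - (row[i] + col[i] + row[j] + col[j]) / (2 * n)
      + total / n**2
      for j in range(n)] for i in range(n)]

# Reciprocal effect (antisymmetric in i and j).
r = [[0.5 * (X[i][j] - X[j][i]) for j in range(n)] for i in range(n)]

for i in range(n):
    print(f"g{i + 1} = {g[i]:.4f}")
```

The restrictions Σ gi = 0, sij = sji, rij = -rji and Σj sij = 0 hold by construction, which gives a simple sanity check on any hand calculation.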
  • 218. Example of a complete 7 x 7 diallel cross ANOVA table:
      Source        df    SS          MS        F
      Reps (Blks)   3     19.1875     6.3958    2.6846
      Genotypes     48    1380.1250   28.7526   12.0689**
      Error         144   343.0625    2.3824
      Total         195   1742.3750
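The mean squares and F ratios in this table follow directly from the sums of squares and degrees of freedom. A minimal sketch (the dictionary layout and names are assumptions):

```python
# (df, SS) for each source, taken from the ANOVA table above.
anova = {
    "Reps": (3, 19.1875),
    "Genotypes": (48, 1380.1250),
    "Error": (144, 343.0625),
}

# Mean square = SS / df.
ms = {src: ss / df for src, (df, ss) in anova.items()}

# F ratio = MS of the source / MS of error.
f_ratio = {src: ms[src] / ms["Error"] for src in ("Reps", "Genotypes")}

total_df = sum(df for df, _ in anova.values())
total_ss = sum(ss for _, ss in anova.values())

for src in ("Reps", "Genotypes"):
    print(src, round(ms[src], 4), round(f_ratio[src], 4))
```

The degrees of freedom and sums of squares also add up to the Total row (195 and 1742.3750).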
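  • 219. [Worksheet: blank 7 x 7 table with parents 1 to 7 as rows and columns, row totals Yi., column totals Y.i and grand total GT.]
  • 220. [Worksheet: for each parent i = 1 to 7, columns Yi., Y.i and Yi = (Yi. + Y.i), with grand total GT.] SSGCA = (1/2n) Σ(Y1² + Y2² + … + Y7²) - (2/n²)(GT)², where Yi = Yi. + Y.i
  • 221. [Worksheet: for each entry, columns Yij, (Yij + Yji) and Yij(Yij + Yji), with grand total GT.] SSSCA = ½ ΣΣ Yij(Yij + Yji) - (1/2n) Σ(Yi. + Y.i)² + (1/n²)(GT)²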
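The sums of squares for GCA and SCA can be computed directly from a complete n x n table. The sketch below uses a small hypothetical 3 x 3 table (not data from the slides) and adds the reciprocal sum of squares, ½ Σ(i<j) (Yij - Yji)², which is the standard Griffing Method 1 expression but is not shown on the slide, so treat it as an assumption. The three parts should add up to the total sum of squares among the n² entries.

```python
def griffing_ss(Y):
    """Sums of squares for GCA, SCA and reciprocal effects of a complete diallel table."""
    n = len(Y)
    row = [sum(Y[i]) for i in range(n)]
    col = [sum(Y[i][j] for i in range(n)) for j in range(n)]
    gt = sum(row)

    sum_sq_tot = sum((row[i] + col[i]) ** 2 for i in range(n))
    ss_gca = sum_sq_tot / (2 * n) - 2 * gt**2 / n**2

    cross = sum(Y[i][j] * (Y[i][j] + Y[j][i]) for i in range(n) for j in range(n))
    ss_sca = 0.5 * cross - sum_sq_tot / (2 * n) + gt**2 / n**2

    # Assumed (standard) expression; not given on the slide.
    ss_rec = 0.5 * sum((Y[i][j] - Y[j][i]) ** 2
                       for i in range(n) for j in range(i + 1, n))

    df = {"GCA": n - 1, "SCA": n * (n - 1) // 2, "Reciprocal": n * (n - 1) // 2}
    return {"GCA": ss_gca, "SCA": ss_sca, "Reciprocal": ss_rec}, df


# Hypothetical 3 x 3 diallel table of means.
Y = [
    [10.0, 12.0, 14.0],
    [11.0, 9.0, 13.0],
    [15.0, 12.0, 8.0],
]
ss, df = griffing_ss(Y)

n = len(Y)
gt = sum(sum(r) for r in Y)
ss_total = sum(v * v for r in Y for v in r) - gt**2 / n**2
print(ss, df, ss_total)
```

Dividing each sum of squares by its degrees of freedom gives the mean squares that are tested against Me’ = MSe/r.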
  • 222. Testing for significance MSe’ for testing GCA and SCA = MSE/r. Possible outcomes: (1) MSGCA ** and MSSCA **; (2) MSGCA ns and MSSCA **; (3) MSGCA ** and MSSCA ns.
  • 223. Example of a complete 7 x 7 diallel cross Breakdown of Genotype effects:
      Source             d.f.   MS        F
      Genotypes          (48)
        GCA              6      37.8310   63.518*
        SCA              21     4.6701    7.841*
        Residual         21     0.9509    1.597
      Error              144
      MSe’ = MSe/r = 2.3824/4 = 0.5956
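The F ratios in the breakdown use the error mean square on a mean basis, MSe’ = MSe/r. A short sketch of that arithmetic (names are assumptions):

```python
mse = 2.3824          # error mean square from the RCBD ANOVA
reps = 4              # number of replications
mse_prime = mse / reps

# Mean squares from the breakdown of genotype effects.
mean_squares = {"GCA": 37.8310, "SCA": 4.6701, "Residual": 0.9509}

# Each component is tested against MSe'.
f_values = {src: value / mse_prime for src, value in mean_squares.items()}

print(round(mse_prime, 4))
for src, f in f_values.items():
    print(src, round(f, 3))
```

Using MSe’ rather than MSe is necessary because the GCA and SCA mean squares are computed from genotype means over replications.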
  • 224. Example of a complete 7 x 7 diallel cross • The significant variation among genotypes is caused by GCA and SCA effects. GCA has a larger contribution to the genotype differences. Genotypic variances are mainly due to additive gene action, with a small amount of non-additive gene action. When further subdivided: g1 = 2.0969; g2 = -1.8138; g3 = -1.3852; g4 = -0.9209; g5 = 0.0612; g6 = 2.3648; g7 = -0.4031; Σ gi = 0
  • 225. Estimates for each combination (lower triangle):
             1         2         3         4         5         6         7
      1    -3.062
      2     2.0995   -1.9898
      3     1.5459   -0.7934   -2.3469
      4     2.2066   -0.6327   -1.8620    2.0255
      5    -0.9005    0.5102    0.0816    1.1173   -0.9898
      6    -2.4541    2.0816    1.9031   -0.0612    0.3316   -2.3469
      7     0.5638   -1.2755    0.7959    0.5816   -0.1505    0.5459   -1.0612
  • 226. Conclusion in Diallel Analysis Therefore, when selecting for traits where small values are desirable, for example earliness, choose parents with large negative values; when large values are favoured, e.g. yield, choose parents with large positive values.
  • 227. North Carolina Mating Designs • Mating designs are normally termed as the North Carolina Design, because they were first introduced by the North Carolina State University, USA (by Comstock, Cockerham and Robinson).
  • 228. North Carolina Mating Designs • Mating designs are designs used in cyclic selection schemes, where progenies and families are created, and then used for the purpose of: – estimation of genetic components in the control of a trait, calculation of gain from selection, and development of new populations.
  • 229. North Carolina Mating Designs • There are many kinds and variations as well as modifications of the designs, as proposed. However, in principle, they are categorised as follows:
  • 230. Design I Uses: • to estimate genetic components of variance • to estimate degree of dominance • to calculate gain from selection.
  • 231. Design-I • Design I is a nested design, where every male is mated to a number of females in a set. This is done in Season I, i.e. at the mating nursery stage.
  • 232. Season I From the base population:
      Male   Female
      1      x x x x   → 4 half-sib (HS) families formed
      2      x x x x   → 4 HS families
      3      x x x x
      4      x x x x   → 4 x 4 = 16 HS families
      ____________________________________________
      .      . . . .
      n      . . . .
  • 233. Season 2: After the half-sib and full-sib families were formed from the crosses in Season I, the progenies were then evaluated for performance in Season 2, following the sib identities. Example: Set-I: 16 HS families + 2 check varieties = 18 entries x 2 reps, 9 entries/block. [Field layout: Block No. 1, Block No. 2, ..., n; Rep II holds plots 36 to 28 and 19 to 27; Rep I holds plots 18 to 10 and 1 to 9.]
  • 234. Stages in the cyclic selection schemes: [Figure: cycle of stages. Mating Nursery, formation of families (Season 1; Season 4) → Yield Trial on Progenies From Crosses (Season 2; Season 5) → Data Collection → Data Analysis (estimation of variance components, h² estimation, estimation of degree of dominance, estimation of predicted gain from selection from yield trial) → Selection → Recombination (Season 3) → back to Mating Nursery.]
  • 235. ANOVA – DESIGN I For 1 Block:
      Source                 d.f.                  EMS                      MS
      Rep (2)                r-1 = 1
      Male (4)               m-1 = 3               σ²e + rσ²f/m + rfσ²m     M1
      Female/Male            m(f-1) = 12           σ²e + rσ²f/m             M2
      Rep x Male             (m-1)(r-1) = 3
      Rep x Female/Male      m(f-1)(r-1) = 12      σ²e                      M3
      Total                  rmf-1 = 31
  • 236. DESIGN I Calculation of Heritability: M3 = σ²e; σ²f/m = (M2 - M3)/r; σ²m = (M1 - M2)/rf; σ²T = σ²m + σ²f/m + σ²e; σ²m = covariance of half-sibs = 1/4 VA (Falconer)
  • 237. DESIGN I σ²m = 1/4 σ²A, so σ²A = 4σ²m. σ²f/m = (M2 - M3)/r = Cov. FS - Cov. paternal half sibs = (1/2 σ²A + 1/4 σ²D) - 1/4 σ²A = 1/4 σ²A + 1/4 σ²D = 1/4 (σ²A + σ²D). Hence 4σ²f/m = σ²A + σ²D and σ²D = 4σ²f/m - 4σ²m = 4(σ²f/m - σ²m).
  • 238. DESIGN I h²(m) = σ²A/σ²T = 4σ²m/σ²T = h²N (narrow sense); h²(f) = (σ²A + σ²D)/σ²T = 4σ²f/m/σ²T = h²B (broad sense); h²(m+f) = 2(σ²m + σ²f/m)/σ²T
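The Design I calculations can be chained together in a small function. This is a sketch under stated assumptions: the function name is made up and the mean squares M1 = 60, M2 = 40, M3 = 30 are hypothetical, not results from the course data. It follows the formulas on the slides, including the definition σ²T = σ²m + σ²f/m + σ²e.

```python
def design1_components(M1, M2, M3, r, f):
    """Variance components and heritabilities from Design I mean squares.

    M1 = males MS, M2 = females-within-males MS, M3 = error MS,
    r = replications, f = females per male.
    """
    var_e = M3
    var_fm = (M2 - M3) / r            # females within males
    var_m = (M1 - M2) / (r * f)       # males
    var_t = var_m + var_fm + var_e    # total, as defined on the slide
    var_a = 4.0 * var_m               # additive variance
    var_d = 4.0 * (var_fm - var_m)    # dominance variance
    return {
        "var_e": var_e, "var_fm": var_fm, "var_m": var_m, "var_T": var_t,
        "var_A": var_a, "var_D": var_d,
        "h2_m": var_a / var_t,                       # narrow sense
        "h2_f": (var_a + var_d) / var_t,             # broad sense
        "h2_mf": 2.0 * (var_m + var_fm) / var_t,
    }


# Hypothetical mean squares for a set with r = 2 reps and f = 4 females per male.
est = design1_components(M1=60.0, M2=40.0, M3=30.0, r=2, f=4)
for name, value in est.items():
    print(name, round(value, 4))
```

With real data a negative estimate of σ²D or σ²m can occur through sampling error; how to handle that (for example by setting it to zero) is a separate decision not covered here.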
  • 239. DESIGN I Selection Phase: HS Family Selection – based on performance of HS families in Season 2 Yield Test. For Recombination phase, use remnant self seeds from males in Season 1. • The phases involving mating, testing, selection and recombination of selected families are conducted in a cyclic manner.
  • 240. Design II • Uses: – to estimate genetic components of variance – to estimate degree of dominance – to estimate epistatic variance – to calculate progress from selection • Also called Factorial Mating Design, where every male is crossed to every female in a factorial manner.
  • 241. Design II Example: Male inbreds = 4 (1, 2, 3, 4); female inbreds = 4 (5, 6, 7, 8) → produce 16 FS families - Population size to be tested is bigger, about twice the size of that of Design I. Example: with 4 males, 4 females, 16 crosses:
  • 242. Design II • Requires bigger population size in order to obtain information with the same precision as Design I, • Although the population to be used is much larger, the advantages of Design II are that: – it can estimate epistasis – suitable to be used in situation where some degree of inbreeding occurs in the population.
  • 243. ANOVA – Design II
      Source        d.f.                 EMS
      Rep (2)       r-1 = 1
      Males (4)     m-1 = 3              σ²e + rσ²MF + rfσ²M
      Females (4)   f-1 = 3              σ²e + rσ²MF + rmσ²F
      M x F         (m-1)(f-1) = 9       σ²e + rσ²MF
      Error         (mf-1)(r-1) = 15     σ²e
      Total         rmf-1 = 31
  • 244. Design II • From here, the genetic components of variance and heritability can be computed. Given: σ² = σ²e; σ²F = (MSf - MSmxf)/rm; σ²M = (MSm - MSmxf)/rf = 1/4 VA
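The Design II components follow from the expected mean squares by subtraction. The sketch below is illustrative: the function name and the mean squares are hypothetical. The divisors rm and rf come from the EMS column (females carry rmσ²F, males carry rfσ²M), and VA = 4σ²M follows the relation given on the slide.

```python
def design2_components(ms_m, ms_f, ms_mf, ms_e, r, m, f):
    """Variance components from Design II mean squares.

    ms_m, ms_f, ms_mf, ms_e = males, females, male x female and error MS;
    r = replications, m = number of males, f = number of females.
    """
    var_e = ms_e
    var_mf = (ms_mf - ms_e) / r
    var_m = (ms_m - ms_mf) / (r * f)
    var_f = (ms_f - ms_mf) / (r * m)
    return {
        "var_e": var_e,
        "var_MF": var_mf,
        "var_M": var_m,
        "var_F": var_f,
        "VA_from_males": 4.0 * var_m,   # sigma2_M = 1/4 VA, as on the slide
    }


# Hypothetical mean squares for 4 males x 4 females in 2 replications.
est2 = design2_components(ms_m=80.0, ms_f=60.0, ms_mf=30.0, ms_e=20.0,
                          r=2, m=4, f=4)
print(est2)
```

Because both parents are sampled in the same way, the male and female components give two estimates of the same quantity, which can be compared.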
  • 245. Design III • Uses: 1. more powerful in estimating the degree of dominance, i.e. with a lesser amount of data, it gives a stronger estimate of the degree of dominance. Relative amount of data needed: Design-I = 10-12 times; Design-II = 3-4 times; Design-III = 1 time
  • 246. Design-III: Uses 2. The choice of generation to use, i.e. F2, F4, etc., depends on the presence or absence of linkage: the stronger the linkage, the more advanced the generation required.
  • 247. [Figure: Design III crossing scheme. Males (♂) taken from any generation of the cross between the 2 parental stocks are each crossed to females (♀) of both original stocks: Original stock P1 (e.g. Inbred line A) and Original Stock P2 (e.g. Inbred line B).]
  • 248. Design III • The source populations for this design are normally the product of a certain programme with specific objectives • Therefore, the evaluation of the progenies of the 16 male parents (e.g. F2) in Season 2 will involve: 16 x 2 (parents) = 32 FS families/block, + 2 checks/rep, x 2 reps/block; total given on the slide: 72 plots/block
  • 249. Design III ANOVA – Design-III (1 block)
      Source          df                 EMS
      Rep             r-1
      Female parent   p-1                σ²e + rσ²MF + rnσ²F
      Male parent     n-1                σ²e + rσ²MF + rpσ²M
      M x F           (n-1)(p-1)         σ²e + rσ²MF
      Error           (np-1)(r-1)        σ²e
  • 250. Pascal’s triangle.
      1                        no segregating alleles
      1 1
      1 2 1                    two alleles
      1 3 3 1
      1 4 6 4 1                four alleles
      1 5 10 10 5 1
      1 6 15 20 15 6 1         six alleles
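The rows of the triangle are binomial coefficients, so they can be generated directly. In the sketch below the function name is an assumption, and the genetic reading (row k gives the relative frequencies of classes carrying 0 to k increasing alleles when the segregating alleles have equal additive effects and frequency ½) is an interpretation added here for illustration.

```python
from math import comb


def pascal_row(k):
    """Row k of Pascal's triangle: binomial coefficients C(k, 0) ... C(k, k)."""
    return [comb(k, i) for i in range(k + 1)]


# Rows for 0 to 6 segregating alleles, matching the slide.
for k in range(7):
    print(k, pascal_row(k), "total =", sum(pascal_row(k)))
```

For six segregating alleles (three loci) the row sums to 64, so only 1 class in 64 carries none of the increasing alleles, which is consistent with a 63:1 segregation for three additive genes.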
  • 251. Line x Tester Analysis • Kempthorne (1957) • Broad-based Tester • Narrow-based Testers • Why L x T – Cost
  • 252. Line x Tester Analysis • Uses – Information on GCA – Information on SCA – Information on gene effects – Male female relationship – Grouping
  • 253. Line x Tester Analysis [Crossing layout: lines L1, L2, L3, . . . as rows, each crossed to testers T1, T2, T3 as columns.]
  • 254. Line x Tester Analysis Selection is based on the performance of the hybrids in the Season 2 Yield Test. Select good combinations. The phases involve crossing and testing.
  • 255. Line x Tester Analysis ANOVA
      Source        df
      Rep           r-1
      Genotypes     g-1
        Parents     p-1
        P vs. C     1
        Crosses     c-1
          Lines     l-1
          Testers   t-1
          L x T     (l-1)(t-1)
      Error         (r-1)(g-1)
  • 256. SSc = ΣCi²/r - C.F., with C.F. = (GTc)²/rc; SSp = ΣPi²/r - C.F., with C.F. = (GTp)²/rp; SSp vs. c = SSg - SSc - SSp
  • 257. [Worksheet: two-way table of cross totals with lines L1 to L5 as rows and testers T1, T2, T3 as columns, line totals, tester totals and grand total GT.] SSL = ΣLi²/tr - C.F.(crosses); SST = ΣTi²/lr - C.F.(crosses); SSLxT = SSc - SSL - SST
  • 258. SSc =SSg-SSp-SS p vs. c SSp =SSg-SSc-SS p vs. c
  • 259. 5 lines, 3 testers, 4 reps. Blocks df Genotypes Parents, P P vs. C Crosses Lines Testers L x T Error
  • 260. 5 lines, 3 testers, 4 reps.
      Sources       d.f.   MS
      Blocks        3      27.66ns
      Genotypes     22     1479**
        Parents, P  7      899**
        P vs. C     1      53ns
        Crosses     14     1871**
          Lines     4      2579ns
          Testers   2      859ns
          L x T     8      1770**
      Error         66     91
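The degrees of freedom in this table can be generated from the numbers of lines, testers and replications. In the sketch below the function name is an assumption. The F ratios assume that lines and testers are tested against the L x T mean square and L x T against error; this is a common convention and is consistent with lines and testers being marked ns here, but the slides do not state it.

```python
def line_tester_df(l, t, r):
    """Degrees of freedom for a line x tester ANOVA with parents included."""
    crosses = l * t
    parents = l + t
    genotypes = crosses + parents
    return {
        "Blocks": r - 1,
        "Genotypes": genotypes - 1,
        "Parents": parents - 1,
        "P vs. C": 1,
        "Crosses": crosses - 1,
        "Lines": l - 1,
        "Testers": t - 1,
        "L x T": (l - 1) * (t - 1),
        "Error": (r - 1) * (genotypes - 1),
    }


df = line_tester_df(l=5, t=3, r=4)

# Mean squares from the table above.
ms = {"Lines": 2579.0, "Testers": 859.0, "L x T": 1770.0, "Error": 91.0}

# Assumed testing convention: main effects against L x T, L x T against error.
f_lines = ms["Lines"] / ms["L x T"]
f_testers = ms["Testers"] / ms["L x T"]
f_lxt = ms["L x T"] / ms["Error"]

print(df)
print(round(f_lines, 3), round(f_testers, 3), round(f_lxt, 3))
```

The degrees of freedom for parents, P vs. C and crosses add up to those for genotypes, and those for lines, testers and L x T add up to those for crosses.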