Common Statistical Methods Used In Transgenic Fish Research

Session #6:
Common Statistical Methods Used In
Transgenic Fish Research
M.Afifi
M.Sc., Biostatistics(Joint Supervision with ISSR, Cairo University)
Ph.D., Candidate (AVC, UPEI, Canada)
E-mail: Afifi-stat6@hotmail.com
Tel: +201060658185

Statistics role: before gene transfer and after gene transfer

Before gene transfer
 Experimental design: completely randomized design (CRD), complete block design (CBD)
 Experimental unit:
 Single fish or fish tank
 Replicates
 Homogeneous units kept under the same experimental conditions
 Sample size: 3, 6, 8, 12?
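
A completely randomized design can be sketched in a few lines. This is a minimal illustration only: the tank labels, group names and seed are hypothetical and do not come from any study cited in these slides.

```python
import random

def allocate_crd(unit_ids, treatments, seed=0):
    """Randomly assign experimental units to treatments in equal numbers."""
    if len(unit_ids) % len(treatments) != 0:
        raise ValueError("units must divide evenly among treatments")
    rng = random.Random(seed)  # fixed seed so the allocation can be reproduced
    shuffled = list(unit_ids)
    rng.shuffle(shuffled)
    per_group = len(shuffled) // len(treatments)
    return {
        trt: sorted(shuffled[i * per_group:(i + 1) * per_group])
        for i, trt in enumerate(treatments)
    }

# Hypothetical example: 12 tanks, two groups, 6 replicate tanks per group.
tanks = [f"tank{i:02d}" for i in range(1, 13)]
allocation = allocate_crd(tanks, ["transgenic", "control"], seed=42)
for treatment, units in allocation.items():
    print(treatment, units)
```

Here the tank is treated as the experimental unit; if single fish are the unit, the same function can be given fish IDs instead.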

Basic Experimental Design for Transgenic Fish Research:
1. Setting experimental questions >>>> statistical questions
2. Setting hypotheses and then statistical null hypotheses
4. Statistical considerations (treatment groups, sample size, true replication, confounding factors, etc.)
5. Sampling design (independent, random, samples)
6. Data collection & measurement (Quality Control and Quality Assurance Procedures)
7. Data analysis
–Too few data: cannot obtain reliable conclusions
–Too many data: extra effort (time and money) in data collection
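
The trade-off between too few and too many data is usually settled with a power calculation. The sketch below uses the common normal-approximation formula for comparing two means, n per group = 2 × ((z(1−α/2) + z(power)) × SD / difference)². The function name and default values are assumptions for illustration, and the approximation slightly underestimates n when samples are small.

```python
import math
from statistics import NormalDist

def n_per_group(delta, sd, alpha=0.05, power=0.80):
    """Approximate sample size per group for a two-sided two-sample comparison.

    delta: smallest difference in means worth detecting
    sd:    expected standard deviation within each group
    """
    z = NormalDist()  # standard normal distribution
    z_alpha = z.inv_cdf(1 - alpha / 2)
    z_power = z.inv_cdf(power)
    return math.ceil(2 * ((z_alpha + z_power) * sd / delta) ** 2)

# Hypothetical planning values: detect a difference of one SD, or half an SD.
print(n_per_group(delta=1.0, sd=1.0))
print(n_per_group(delta=0.5, sd=1.0))
```

Halving the detectable difference roughly quadruples the required sample size, which is why very small group sizes can only detect very large effects.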

Fish
The transgenic fish used in this experiment were produced and raised in a biosecure
facility at the DFO/UBC Centre for Aquaculture and Environmental Research
(CAER) in West Vancouver, B.C., Canada.
Because of differences in growth rate, which produce fish of very different sizes at
each age, the control fish used were 1 year older than the transgenic fish in order to
match fish to the same developmental stage and size.
Fish were cultured in filtered, aerated, flow-through well water at approximately 10
°C prior to and during the experiment. The two types of salmon reach smolt
size at different times of the year (the normal time of May/June in their second year
for non-transgenic fish, and August/September of their first year for the growth
accelerated fish).

 After gene transfer
Qualitative traits: cold tolerance, salinity tolerance
Quantitative traits: growth, reproductive traits, mass, biomass, food consumed,
specific growth rate (SGR), protein efficiency ratio (PE), food conversion efficiency (EC), Q-PCR
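
The slides do not define the growth indices, so the sketch below assumes the definitions most often used in fish nutrition: SGR as percent body weight gained per day on a log scale, and food conversion efficiency as weight gained per unit of food consumed, in percent. Definitions vary between papers, so check the methods section of the study being read.

```python
from math import log

def specific_growth_rate(w_initial, w_final, days):
    """SGR in % per day: 100 * (ln(final weight) - ln(initial weight)) / days."""
    return 100.0 * (log(w_final) - log(w_initial)) / days

def food_conversion_efficiency(w_initial, w_final, food_consumed):
    """Weight gain as a percentage of food consumed (same mass units)."""
    return 100.0 * (w_final - w_initial) / food_consumed

# Hypothetical fish: grows from 10 g to 20 g in 10 days while eating 20 g of feed.
print(round(specific_growth_rate(10.0, 20.0, 10), 2))
print(food_conversion_efficiency(10.0, 20.0, 20.0))
```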

Comparative Biochemistry and Physiology journal
Impact Factor: 1.551

Introduction to Two Basic Statistical Techniques:
 Group comparison methods and selection of appropriate tests
 Correlation-based methods

Flowcharts for selection of appropriate tests

 Basic rules of any statistical test
Assumptions and hypothesis testing

 Basic rules of hypothesis testing
 Hypothesis:
• Null hypothesis, H0:
• The difference in (means, proportions, medians) is not real (non-significant),
• The difference is not due to a treatment effect but to other causes (chance, error)
• Alternative hypothesis, HA: the claim tested against H0
 Test statistic value: a value calculated from the data (an algebraic expression particular to the
hypothesis we are testing),
 t-test >>>> t-value
 F-test >>>> F-value
 χ²-test >>>> χ² value
 P-value (Sig.): a probability between 0 and 1 attached to each value of the test statistic. It is
the probability of getting the observed effect (or one more extreme) if the null hypothesis is true
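
The definition of the P-value can be made concrete with an exact permutation test. This is a different technique from the t-, F- and χ²-tests listed above, used here only because it computes "the probability of an effect at least as extreme under H0" by direct counting. The data are hypothetical.

```python
from itertools import combinations
from statistics import mean

def exact_permutation_p(group_a, group_b):
    """Two-sided exact permutation P-value for a difference in means."""
    pooled = list(group_a) + list(group_b)
    n_a, n_total = len(group_a), len(pooled)
    total = sum(pooled)
    observed = abs(mean(group_a) - mean(group_b))
    extreme = 0
    n_splits = 0
    # Under H0 the group labels are arbitrary, so every split is equally likely.
    for idx in combinations(range(n_total), n_a):
        sum_a = sum(pooled[i] for i in idx)
        diff = abs(sum_a / n_a - (total - sum_a) / (n_total - n_a))
        n_splits += 1
        if diff >= observed - 1e-12:  # tolerance for floating-point ties
            extreme += 1
    return extreme / n_splits

# Hypothetical weights (g): the groups do not overlap at all.
p_value = exact_permutation_p([1, 2, 3], [4, 5, 6])
print(p_value)
```

With only three fish per group there are 20 possible splits, so the smallest two-sided P-value that can ever occur is 2/20 = 0.1, which shows why very small samples cannot reach P < 0.05 with this approach.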

 Two-sample t-test (unpaired t-test)
 Compares the means of two independent groups of observations using representative
samples.
 Assumptions
 The two samples must be independent (unrelated)
 Normality: a small departure from Normality is not crucial and leads to only a marginal loss in power
 Homoscedasticity (equal variances) >>>> checked by Levene’s test
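
As a sketch of what the two-sample t-test computes under the equal-variance assumption, the function below returns the t-value and its degrees of freedom from a pooled variance. The data are hypothetical, and the P-value step (looking t up in the t distribution, as statistical packages do) is left out to keep the example to the standard library.

```python
from math import sqrt
from statistics import mean, variance

def student_t(x, y):
    """Pooled-variance two-sample t statistic and degrees of freedom."""
    n1, n2 = len(x), len(y)
    df = n1 + n2 - 2
    # Pooled variance: weighted average of the two sample variances.
    pooled_var = ((n1 - 1) * variance(x) + (n2 - 1) * variance(y)) / df
    standard_error = sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (mean(x) - mean(y)) / standard_error, df

# Hypothetical measurements for two independent groups.
t_value, df = student_t([1, 2, 3, 4, 5], [2, 3, 4, 5, 6])
print(f"t({df}) = {t_value:.2f}")
```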

Aquaculture International Journal, IF:1.878

 Figure 2. Growth performance in F1 transgenic and full sibling non-transgenic zebrafish. Fifty-four zebrafish of
F1 fry were randomly selected and grown individually under similar conditions. At the beginning of the
experiment, the fish were four weeks old. Zebrafish were weighed weekly for 6 weeks to monitor growth
performance. In the course of the experiment, fin DNA was extracted and assayed for transgene identification.
Weight of transgenic and non-transgenic full siblings was compared employing a Student t-Test (*, P < 0.05).

 Welch's t-test
(Unequal variances t-test)
 widely used modification of the t-test,
 adjusts the number of degrees of freedom when the variances are not equal to
each other.
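
The degrees-of-freedom adjustment can be sketched with the Welch–Satterthwaite formula. The data below are hypothetical, and as before only the test statistic and degrees of freedom are computed, not the P-value.

```python
from math import sqrt
from statistics import mean, variance

def welch_t(x, y):
    """Welch's t statistic and Welch-Satterthwaite degrees of freedom."""
    n1, n2 = len(x), len(y)
    a = variance(x) / n1  # squared standard error of mean(x)
    b = variance(y) / n2  # squared standard error of mean(y)
    t = (mean(x) - mean(y)) / sqrt(a + b)
    df = (a + b) ** 2 / (a ** 2 / (n1 - 1) + b ** 2 / (n2 - 1))
    return t, df

# Hypothetical groups with clearly different spread.
narrow = [10.0, 10.5, 9.5, 10.2, 9.8]
wide = [8.0, 14.0, 11.0, 6.0, 16.0]
t_welch, df_welch = welch_t(narrow, wide)
print(f"t = {t_welch:.2f}, df = {df_welch:.1f}")
```

When the two variances and sample sizes are equal, the adjusted degrees of freedom equal n1 + n2 − 2 and the result matches the ordinary t-test; the more unequal the variances, the fewer degrees of freedom remain.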

If the sample sizes are not large and
equal variances cannot be assumed,
use a non-parametric method:
the Mann–Whitney U test
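
The Mann–Whitney U statistic only uses the ordering of the observations. A minimal sketch, with hypothetical data, counts for every pair of fish (one from each group) which value is larger; ties count as one half.

```python
def mann_whitney_u(x, y):
    """Return (U for x, U for y) by direct pairwise counting."""
    u_x = 0.0
    for xi in x:
        for yj in y:
            if xi > yj:
                u_x += 1.0
            elif xi == yj:
                u_x += 0.5
    u_y = len(x) * len(y) - u_x  # the two U values always sum to n1 * n2
    return u_x, u_y

# Hypothetical data: every value in the first group is below the second group.
u_x, u_y = mann_whitney_u([1, 2, 3], [4, 5, 6])
print(min(u_x, u_y))
```

The smaller of the two U values is the one usually compared with tabulated critical values; a value of 0 means the groups do not overlap at all.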

 Methods of pairing:
 Self-pairing: each animal used as its own control (Before and After)
 Natural pairing: each pair of animals is biologically related (e.g. litter mates).
 Artificial (matched) pairing: each animal is paired with an animal matched with
respect to one or more factors that affect response.
 To avoid allocation bias in an experiment when there is self-pairing, each animal is
randomly allocated to receive one of the two treatments initially; it then receives
the other treatment later.
 If there is natural or matched pairing, one member of the pair is randomly allocated
to one of the two treatments and the other member receives the second treatment.

 Paired Vs. Independent Test
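
The difference between the two approaches is that a paired test works on the within-pair differences rather than on the two groups separately. A minimal sketch of the paired t statistic, using hypothetical before/after measurements on the same four animals:

```python
from math import sqrt
from statistics import mean, stdev

def paired_t(first, second):
    """Paired t statistic and degrees of freedom from within-pair differences."""
    if len(first) != len(second):
        raise ValueError("paired data need the same number of observations")
    diffs = [b - a for a, b in zip(first, second)]
    n = len(diffs)
    t = mean(diffs) / (stdev(diffs) / sqrt(n))
    return t, n - 1

# Hypothetical self-pairing: each animal measured before and after treatment.
before = [10, 12, 14, 16]
after = [11, 14, 15, 18]
t_paired, df_paired = paired_t(before, after)
print(f"t({df_paired}) = {t_paired:.2f}")
```

Because the large animal-to-animal variation cancels out in the differences, the paired statistic here is large even though the two sets of values overlap heavily.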

Wilcoxon signed-rank test

F-test
ANOVA
Comparing more than two means

 Suppose, for example, we have four groups >>>>> comparing every pair of groups
with a two-sample t-test >>> six possible t-tests
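
The count of six, and the reason it is a problem, can be checked with a short calculation. The error-rate formula assumes the tests are independent, which pairwise tests on shared groups are not, so the figure is a rough guide only.

```python
from math import comb

def pairwise_tests(n_groups):
    """Number of distinct pairs of groups."""
    return comb(n_groups, 2)

def familywise_error(n_tests, alpha=0.05):
    """Chance of at least one false positive across independent tests."""
    return 1 - (1 - alpha) ** n_tests

n_tests = pairwise_tests(4)
print(n_tests, round(familywise_error(n_tests), 3))
```

With four groups, the chance of at least one spurious "significant" result is roughly one in four rather than one in twenty, which is the motivation for a single overall test.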

 Principle
 Total variability in a data set is partitioned into different sources of variation.
 The sources of variation comprise one or more factors, each explained by the
levels or categories of that factor (e.g. the two levels, ‘male’ and ‘female’, defining
the factor ‘sex’, or three dose levels for a given drug factor), and also unexplained
or residual variation which results from uncontrolled biological variation and
technical error.
 We can assess the contribution of the different factors to the total variation by
making the appropriate comparisons of these variances.
 The variation is expressed by its variance

 The analysis of variance encompasses a broad spectrum of experimental
designs ranging from the simple to the complex.

 One-way analysis of variance
 Single factor with several levels or categories where each level comprises a group
of observations.
 For example, the levels may be:
 Feed formula for dogs: a dry feed formula, a tinned feed and raw meat
 Different treatment dose levels of a drug, one of which is a placebo representing
simply the drug vehicle, while the others are, say, 50%, 100% and 200% of the
presumed effective dose.
 Consider the simple case >>> only one factor
 2 sources of variation:
 Between the group means
 Within the groups
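
The partition into between-group and within-group variation can be sketched directly. The groups below are hypothetical; the function returns the F-value and both degrees of freedom, leaving the P-value lookup to a statistical package.

```python
from statistics import mean

def one_way_anova(groups):
    """F statistic with (between, within) degrees of freedom."""
    all_values = [v for g in groups for v in g]
    grand_mean = mean(all_values)
    # Between groups: how far each group mean lies from the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Within groups: residual scatter of observations around their own mean.
    ss_within = sum((v - mean(g)) ** 2 for g in groups for v in g)
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f_value = (ss_between / df_between) / (ss_within / df_within)
    return f_value, df_between, df_within

# Hypothetical data: three levels of one factor, three animals per level.
f_value, df_b, df_w = one_way_anova([[1, 2, 3], [2, 3, 4], [3, 4, 5]])
print(f"F({df_b}, {df_w}) = {f_value:.2f}")
```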

 In the experimental situation, the animals should be randomly allocated to one of
the levels of the factor, i.e. to one of the groups, in order to avoid allocation bias

 Assumptions:
 results are reliable only if the assumptions on which it is based are satisfied
 samples representing the levels are independent
 Observations in each sample come from a Normally distributed population with
variance σ²; this implies that the group variances are the same. Approximate
Normality may be established by drawing a histogram; moderate departures
from Normality have little effect on the result.
 Constant variance, the more important assumption, may be checked with
Levene’s test

Post-hoc test
Multiple Comparisons of Means

Which group means differ?

 Multiple comparisons
 The more tests we perform, the more likely it is that we will obtain a
significant P-value on the basis of chance alone.
We have to approach this problem of multiple comparisons in such a way that
we avoid spurious P-values.
 With Bonferroni’s correction, adjusted p-values are simply the unadjusted p-values
multiplied by the number of possible comparisons (six in this case);
 If multiplying a p-value by the number of comparisons produces a value greater
than one, the probability is given as 1.00.
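
The multiply-and-cap rule described above takes one line of code. The raw p-values below are hypothetical.

```python
def bonferroni_adjust(p_values, n_comparisons=None):
    """Multiply each p-value by the number of comparisons, capped at 1.0."""
    m = len(p_values) if n_comparisons is None else n_comparisons
    return [min(1.0, p * m) for p in p_values]

# Hypothetical raw p-values from three of the six pairwise comparisons.
adjusted = bonferroni_adjust([0.001, 0.02, 0.3], n_comparisons=6)
print([round(p, 3) for p in adjusted])
```

A comparison that looked significant at 0.02 is no longer significant at the 0.05 level once the six comparisons are taken into account.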

Most Common Multiple comparisons
 Least significant difference (LSD)
 Duncan’s multiple range test, (DMRT)
 Tukey’s (HSD)
 Newman–Keuls tests,
 Bonferroni’s correction
 Scheffé’s test
 Be aware: they often produce slightly different results!

 Example
 Fig. 1. Growth rates and hormone profiles of wild-type (W),
domesticated (D), and GH transgenic (T) salmon. (A)
Specific growth rates (SGR).
 (B) Plasma IGF1 levels. n = 10 per genotype.
 Letters above bars denote significant differences among
groups (1-way ANOVA, P < 0.05).
 Error bars represent SEM.

 Example 2
 .
 Example
 Fig. 1. Plasma concentrations of growth hormone (A) in non-transgenic
and transgenic salmon fed full rations and in ration-restricted
transgenic coho salmon (pair fed with controls).
 GH values (A) are pooled from samples (N = 23) taken on Sept.
11, 2002 and Oct. 11, 2002, which did not differ significantly.
 Statistical relationships between groups are indicated by letters
where significant differences occur.
 Bars are means ± SE, letters denote significant differences.

 TABLE 2.—Sample size (n), mean body weight, mean fork length, and mean condition factor (CF) for
all fish sampled.
 Different lowercase letters indicate statistically significant differences between populations (ANOVA).
The letters H, T, and N represent hatchery, transgenic, and cultured nontransgenic fish, respectively.

 Detailed statistical analysis by category of analyzed data and sex of coho salmon.
 Asterisks indicate statistically significant values.
Abbreviations are as follows: GSI = gonadosomatic index, H = hatchery fish,
T = transgenic fish, and N = cultured nontransgenic fish.

 Enzyme activities were measured before (pre-diet treatment, n=8) and after a 12-week feeding trial (post-
diet treatment, n=3 replicates, n=4 fish/replicate). Differences between C and T pre-diet treatment (p<0.05)
are indicated by ⁎ on the larger value, and differences between fish (F) and diet (D) groups post-diet
treatment are indicated by differing letters (a, b, c).

 Reporting results in table:

Kruskal–Wallis test

 Correlation (r)
 Measures the degree of association by calculating Pearson’s product moment
correlation coefficient, usually just called the correlation coefficient or, sometimes, the
linear correlation coefficient.
 r can take any value from −1 to +1.

 Correlation (r)
 (a) perfect positive association, r = +1;
 (b) perfect negative association, r = −1;
 (c) positive association, r = +0.86;
 (d) negative association, r = −0.85;
 (e) no association, r = 0;
 (f) no linear association, r = 0.
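
Cases (a), (b) and (f) can be reproduced with a short calculation of Pearson's r. The data are hypothetical; case (f) uses a perfect curved relationship to show that r = 0 rules out only a linear association.

```python
from math import sqrt
from statistics import mean

def pearson_r(x, y):
    """Pearson's product moment correlation coefficient."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

r_positive = pearson_r([1, 2, 3, 4], [2, 4, 6, 8])      # case (a)
r_negative = pearson_r([1, 2, 3, 4], [8, 6, 4, 2])      # case (b)
xs = [-2, -1, 0, 1, 2]
r_curved = pearson_r(xs, [v ** 2 for v in xs])          # case (f): y = x squared
print(r_positive, r_negative, r_curved)
```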

 Cells above the hatched cells include analyses with all groups combined (non-transgenic, full-ration
transgenic and ration-restricted transgenic fish).
 Cells below the hatched cells display correlations for non-transgenic and full-ration transgenic fish
only.
 Correlation coefficients are shown for significant correlations only.
 aLiver GH correlations do not include non-transgenic fish in which expression was

Regression
 If there is a linear relationship between two numerical variables, with a change in one variable
being associated with a change in the other, we may be interested in determining the
strength of that relationship.
 Are the points in the scatter diagram close to this line or are they widely dispersed
around it? Provided a linear relationship exists between the two variables, the closer
the points are to the line, the stronger the linear association between the two variables.

 Linear regression lines
 Body composition and energy content in relation to wet body weight of growth
enhanced transgenic Atlantic salmon (open triangles) and controls (solid circles)
fed to satiation three times/day on a commercial diet.
 Each data point represents a subsample of five fish. Data are presented with fitted
regression lines (solid lines) surrounded by 95% confidence intervals (dashed lines).

Regression coefficients for the relation between body composition and energy content per fish wet
weight of growth enhanced transgenic Atlantic salmon and controls fed to satiation three
times/day on a commercial diet: Y = b0 + b1 × BW
where ‘Y’ is absolute nutrient or energy content,
‘b0’ and ‘b1’ are regression coefficients,
‘BW’ is wet body weight
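
A minimal least-squares sketch of how b0 and b1 in Y = b0 + b1 × BW are estimated. The body weights and nutrient values are made up so that the answer is easy to verify; they are not data from the study.

```python
from statistics import mean

def fit_line(bw, y):
    """Ordinary least-squares intercept b0 and slope b1 for y = b0 + b1 * bw."""
    mean_bw, mean_y = mean(bw), mean(y)
    sxy = sum((x - mean_bw) * (v - mean_y) for x, v in zip(bw, y))
    sxx = sum((x - mean_bw) ** 2 for x in bw)
    b1 = sxy / sxx
    b0 = mean_y - b1 * mean_bw
    return b0, b1

# Hypothetical points lying exactly on Y = 1 + 2 * BW.
b0, b1 = fit_line([1, 2, 3, 4], [3, 5, 7, 9])
print(f"Y = {b0:.1f} + {b1:.1f} x BW")
```

Fitting the line separately for transgenic and control fish, as in the figure, gives one pair of coefficients per group that can then be compared.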

 Nonlinear regression
 Second degree polynomial

 International Journal of Molecular Sciences
 The genotype frequencies in HWE
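
A Hardy–Weinberg equilibrium (HWE) check for one biallelic locus can be sketched as a chi-square goodness-of-fit test. The genotype counts are hypothetical, the function assumes both alleles are present in the sample, and 3.841 is the standard 5% critical value for 1 degree of freedom.

```python
CHI2_CRITICAL_DF1 = 3.841  # 5% critical value of chi-square with 1 df

def hwe_chi_square(n_aa, n_ab, n_bb):
    """Chi-square statistic comparing genotype counts with HWE expectations."""
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)  # frequency of allele A
    q = 1 - p                        # frequency of allele B
    expected = (p * p * n, 2 * p * q * n, q * q * n)
    observed = (n_aa, n_ab, n_bb)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    return chi2, expected

# Hypothetical sample of 100 fish with too few heterozygotes.
chi2, expected = hwe_chi_square(40, 20, 40)
print(round(chi2, 2), "deviates from HWE" if chi2 > CHI2_CRITICAL_DF1 else "consistent with HWE")
```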

 Statistical analysis
 The genotype frequencies were calculated and HWE was tested using a chi-
square test of
 The population genetic indexes including He, Ho, effective allele numbers (Ne)
and PIC were calculated by Nei’s method [25]. Generally, polymorphism
information content (PIC) is classified into the following three types: low
polymorphism (PIC value < 0.25), moderate polymorphism (0.25 < PIC value <
0.5) and high polymorphism (PIC value > 0.5). The LD structure, measured by
D’ and r², was analysed with the HAPLOVIEW software (Ver.3.32) [26].
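
The indexes above can be sketched from allele frequencies using their usual textbook definitions: He = 1 − Σp², Ne = 1/Σp², and PIC = 1 − Σp² − ΣΣ 2pᵢ²pⱼ². These are assumed here rather than taken from the cited paper, no small-sample correction is applied, and the treatment of the exact boundary values 0.25 and 0.5 is also an assumption.

```python
from itertools import combinations

def diversity_indices(freqs):
    """Return (He, Ne, PIC) for one locus from its allele frequencies."""
    if abs(sum(freqs) - 1.0) > 1e-9:
        raise ValueError("allele frequencies must sum to 1")
    homozygosity = sum(p * p for p in freqs)
    he = 1 - homozygosity   # expected heterozygosity
    ne = 1 / homozygosity   # effective number of alleles
    pic = he - sum(2 * (pi ** 2) * (pj ** 2) for pi, pj in combinations(freqs, 2))
    return he, ne, pic

def classify_pic(pic):
    """Three-way classification; boundary values go to the middle class (assumption)."""
    if pic < 0.25:
        return "low"
    if pic <= 0.5:
        return "moderate"
    return "high"

# Hypothetical SNP with two equally common alleles.
he, ne, pic = diversity_indices([0.5, 0.5])
print(he, ne, pic, classify_pic(pic))
```

A biallelic SNP can never exceed PIC = 0.375, so SNP markers fall in the low or middle class by construction.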

 Association analyses between genotypes or haplotypes of GH gene and four growth
traits were performed using general linear model (GLM) procedure with SPSS 17.0
software (IBM, Armonk, NY, USA). We used the following statistical model:
 Y = u + G + e
 where Y is the phenotypic value of each trait;
 u is the population mean value of each of the four growth traits,
 G is the fixed genotype effect of each SNP, and
 e is the random error effect.
 Multiple comparisons between different genotypes were tested using the LSD
method with Bonferroni correction [27].

How would these results be reported in a scientific journal article?

 Mean ± SD or SEM with Both t-value and P-value
 Mean ± SD or SEM with only P-value

 Representing P-values with asterisks
 Representing P-values with superscripts

 Your formal sentence must include:
 The dependent and independent variables
 The exact p-value, unless the p-value is less than .001; then report p < 0.001 or p < 0.0001 (never p < 0.000)
 The direction of the effect as evidenced by the reported means, as well as a
statement about statistical significance,
 The symbol of the test (t), the degrees of freedom (6), the statistical value (2.95)