The document discusses common statistical methods used in transgenic fish research. It begins with an overview of experimental design considerations, such as sample size and replication before gene transfer and the measurement of traits after gene transfer. Key statistical techniques covered include t-tests, ANOVA, regression, and chi-square tests. Results are typically reported as means with standard errors or standard deviations, with significance indicated by letters or asterisks. Graphs such as bar plots and box plots are also used to present results visually.
Statistical analysis of biological data (comparison of proportions) - Mohamed Afifi
This document discusses statistical methods for comparing proportions, including comparing two independent proportions using chi-squared tests and McNemar's test for comparing two paired proportions. It provides examples comparing the incidence of diabetes between castrated and non-castrated mice, comparing pregnancy rates between different cattle insemination training methods, and comparing two detection methods for Tritrichomonas foetus.
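As a minimal sketch of the two tests named in this summary, the following Python computes the Pearson chi-squared statistic for a 2x2 table of independent groups and the McNemar statistic for paired data. The counts are hypothetical placeholders, not data from the summarized document, and the function names are illustrative. No continuity correction is applied, and the p-value uses the exact identity for a chi-squared distribution with 1 degree of freedom.

```python
import math


def chi2_2x2(a, b, c, d):
    """Pearson chi-squared (no continuity correction) for the table [[a, b], [c, d]]."""
    n = a + b + c + d
    stat = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # Upper tail of a chi-squared distribution with 1 df: P(X > x) = erfc(sqrt(x / 2))
    p_value = math.erfc(math.sqrt(stat / 2))
    return stat, p_value


def mcnemar(b, c):
    """McNemar statistic from the two discordant-pair counts b and c."""
    stat = (b - c) ** 2 / (b + c)
    p_value = math.erfc(math.sqrt(stat / 2))
    return stat, p_value


# Hypothetical counts: rows are two independent groups, columns are (event, no event).
stat_ind, p_ind = chi2_2x2(10, 20, 20, 10)

# Hypothetical paired design: 15 pairs positive only by method A, 5 only by method B.
stat_pair, p_pair = mcnemar(15, 5)

print(f"independent proportions: chi2 = {stat_ind:.3f}, p = {p_ind:.4f}")
print(f"paired proportions:      chi2 = {stat_pair:.3f}, p = {p_pair:.4f}")
```

McNemar's test uses only the discordant pairs, which is why the concordant counts never enter the calculation.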
Experimental design and statistical power in swine experimentation: A present... - Kareem Damilola
This document discusses experimental design and statistical power considerations for swine experimentation. It begins by explaining the importance of valid and precise animal experiments for developing new feeds and feeding standards. It then describes common experimental designs like completely randomized design and randomized complete block design. The document emphasizes that sufficient replication is needed within experiments to maximize statistical power. It also stresses the importance of performing a power analysis to determine adequate sample sizes and avoid type II errors. The document concludes by noting that power analyses are vital for efficient experimental planning and limiting unnecessary replications in swine research.
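The sample-size side of a power analysis can be sketched with the usual normal approximation for comparing two group means: n per group is about 2 (z_{1-alpha/2} + z_{power})^2 sigma^2 / delta^2. The function name and the example numbers are assumptions for illustration; the summarized document may use a different method (for example, one based on the t distribution).

```python
import math
from statistics import NormalDist


def n_per_group(delta, sigma, alpha=0.05, power=0.80):
    """Approximate animals per group needed to detect a mean difference `delta`.

    Assumes two equal-sized groups, a common standard deviation `sigma`,
    a two-sided test, and the normal approximation.
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    return math.ceil(2 * ((z_alpha + z_beta) * sigma / delta) ** 2)


# Hypothetical effect sizes, expressed in units of the standard deviation.
print(n_per_group(delta=1.0, sigma=1.0))  # one-SD difference
print(n_per_group(delta=0.5, sigma=1.0))  # half-SD difference
```

Halving the detectable difference roughly quadruples the required replication, which is the trade-off a power analysis makes explicit before animals are committed.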
Female Barbary macaques implanted with the contraceptive Implanon exhibited behaviors indicating higher anxiety levels, such as increased self-scratching and self-grooming, compared to females without implants. They also showed more aggression and spent more time traveling while spending less time resting and giving grooming. There were no significant differences in foraging behavior between groups. These results suggest Implanon implantation had multiple effects on behavior in female Barbary macaques.
Standard error is used in place of standard deviation. It shows how the variation among samples relates to sampling error. The document lists the standard error formulas used for different statistics and the applications of tests of significance in the biological sciences.
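Two of the most common standard error formulas can be sketched as follows: for a sample mean, SE = s / sqrt(n), and for a sample proportion, SE = sqrt(p(1 - p) / n). The function names and data are hypothetical; the summarized document's own list of formulas is not reproduced here.

```python
import math
from statistics import stdev


def se_mean(values):
    """Standard error of the mean: sample standard deviation divided by sqrt(n)."""
    return stdev(values) / math.sqrt(len(values))


def se_proportion(p, n):
    """Standard error of a sample proportion p observed in n trials."""
    return math.sqrt(p * (1 - p) / n)


# Hypothetical measurements.
sample = [2, 4, 4, 4, 5, 5, 7, 9]
print(f"SD = {stdev(sample):.3f}, SE of mean = {se_mean(sample):.3f}")
print(f"SE of proportion = {se_proportion(0.5, 100):.3f}")
```

The standard deviation describes spread among individual observations, while the standard error describes the uncertainty of the estimate and shrinks as n grows.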
ANOVA is a collection of statistical models and associated procedures used to analyze differences between group means. It was developed by Ronald Fisher and partitions variance into components that can be attributed to different sources. In its simplest form, ANOVA provides a statistical test of whether the population means of several groups are equal, generalizing the t-test to more than two groups. It is useful for testing three or more group means for statistically significant differences.
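The partition of variance described above can be sketched for the one-way case: the total sum of squares splits into a between-group part and a within-group part, and the F statistic is the ratio of their mean squares. The data are hypothetical and the function name is illustrative.

```python
from statistics import mean


def one_way_anova(groups):
    """Return (SS_between, SS_within, F) for a list of groups of observations."""
    all_values = [x for g in groups for x in g]
    grand_mean = mean(all_values)
    # Variation of group means around the grand mean, weighted by group size.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of observations around their own group mean.
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return ss_between, ss_within, f_stat


# Hypothetical measurements for three treatment groups.
ss_b, ss_w, f_stat = one_way_anova([[1, 2, 3], [2, 3, 4], [5, 6, 7]])
print(f"SS_between = {ss_b:.1f}, SS_within = {ss_w:.1f}, F = {f_stat:.1f}")
```

A large F means the group means differ by more than the within-group scatter would suggest; converting F to a p-value requires the F distribution with (df_between, df_within) degrees of freedom, which is not included in this sketch.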
Systematic review of animal studies. The case of statins. Pecoraro, Moja - vpecoraro
Animal experiments provide a basis for decisions on the design and conduct of subsequent clinical trials. Our work evaluates the efficacy of statins in animal studies to provide an overall synthesis that facilitates the interpretation of aggregated data from the basic research conducted so far. What is the evidence from laboratory animals on the effects of statins in decreasing cholesterol levels and preventing or ameliorating cardiovascular diseases?
Statistical analysis of correlated data using generalized estimating equation... - Angelina Lessa
- The document describes the statistical method of generalized estimating equations (GEE) which can be used to analyze correlated data, such as longitudinal or clustered data.
- GEE uses weighted combinations of observations to extract the appropriate amount of statistical information from correlated data, rather than treating observations as independent.
- Small examples are provided to illustrate how GEE calculates variances of statistics derived from correlated data by accounting for both the weights given to observations and the correlations between observations. This allows all data points to be used rather than discarding some related observations.
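The variance calculation mentioned in the last point can be illustrated with the general formula for a weighted combination of correlated observations, Var(sum of w_i y_i) = sum over i, j of w_i w_j sd_i sd_j rho_ij. This is only a sketch of that building block with hypothetical numbers, not a GEE fit and not the summarized document's own example.

```python
def variance_of_weighted_sum(weights, sds, corr):
    """Variance of sum(w_i * y_i) given standard deviations and a correlation matrix."""
    n = len(weights)
    return sum(
        weights[i] * weights[j] * sds[i] * sds[j] * corr[i][j]
        for i in range(n)
        for j in range(n)
    )


# Two measurements on the same animal, averaged with equal weights (hypothetical).
weights = [0.5, 0.5]
sds = [1.0, 1.0]

independent = [[1.0, 0.0], [0.0, 1.0]]
correlated = [[1.0, 0.5], [0.5, 1.0]]

print(variance_of_weighted_sum(weights, sds, independent))  # treats the pair as independent
print(variance_of_weighted_sum(weights, sds, correlated))   # accounts for correlation 0.5
```

Treating positively correlated observations as independent understates the variance of the average, which is the error that methods for correlated data are designed to avoid.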
Statistical analysis of biological data (comaprison of means) - Mohamed Afifi
The document provides information about statistical analysis methods for comparing means from biological data, including t-tests, ANOVA, and post-hoc tests. It discusses two-sample t-tests, paired t-tests, one-way ANOVA, post-hoc multiple comparisons tests like Tukey's and Bonferroni tests, and assumptions and examples for each method. Flowcharts are also provided to help select the appropriate statistical test based on the study design and data.
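Of the post-hoc procedures named above, the Bonferroni correction is the simplest to sketch: each of m pairwise p-values is multiplied by m (capped at 1) before being compared with the significance level. The p-values below are hypothetical, and the function name is illustrative; Tukey's test needs the studentized range distribution and is not shown.

```python
def bonferroni(p_values, alpha=0.05):
    """Return (adjusted p-value, significant?) pairs using the Bonferroni correction."""
    m = len(p_values)
    adjusted = [min(1.0, p * m) for p in p_values]
    return [(p_adj, p_adj < alpha) for p_adj in adjusted]


# Hypothetical p-values from three pairwise comparisons after a one-way ANOVA.
for p_adj, significant in bonferroni([0.01, 0.02, 0.04]):
    print(f"adjusted p = {p_adj:.2f}, significant: {significant}")
```

Bonferroni controls the family-wise error rate but becomes conservative as the number of comparisons grows.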
This document provides an overview of parametric statistical tests, including the t-test, ANOVA, Pearson's correlation coefficient, and Z-test. It describes the assumptions, calculations, and procedures for each test. The t-test is used to compare means of small samples and can be used for one sample, two independent samples, or paired samples. ANOVA allows comparison of multiple population means and is used when more than two groups are involved. Pearson's correlation measures the strength of association between two continuous variables. The Z-test, which is used for larger samples, can be applied to compare means or proportions.
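Pearson's correlation coefficient, mentioned above, can be sketched directly from its definition: the sum of cross-products of deviations divided by the square root of the product of the two sums of squared deviations. The data and function name are hypothetical.

```python
import math
from statistics import mean


def pearson_r(xs, ys):
    """Pearson's correlation coefficient between two equal-length sequences."""
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical paired measurements of two continuous variables.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
print(f"r = {pearson_r(xs, ys):.3f}")
```

The coefficient ranges from -1 to 1 and measures only linear association, so a scatter plot is still worth inspecting before interpreting it.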
The document provides information about contact details for Hemant Trivedi and then discusses analysis of variance (ANOVA) techniques, including one-way, two-way, and three or more way ANOVA. It provides an example of using ANOVA to determine the best type of packaging among three options. The document also includes information about chi square tests, including their applications and a formula. It provides two examples for chi square test exercises. Finally, it discusses fundamentals of variables, types of tests, hypotheses, and choosing a significance level.
The document discusses different types of t-tests including one sample t-test, independent two sample t-test, and paired t-test. It provides examples and steps to perform each test. The one sample t-test examines if a population mean is equal to a hypothesized value. The independent two sample t-test determines if two independent population means are equal. The paired t-test evaluates if the mean difference between paired observations is zero. Assumptions for each test include independent observations, approximately normal distribution, and equal variances between samples.
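The three t statistics described above can be sketched as follows. The independent two-sample version shown here is the pooled (equal-variance) form, matching the equal-variance assumption in the summary. Function names and data are hypothetical, and the p-value lookup against the t distribution is left out.

```python
import math
from statistics import mean, stdev, variance


def one_sample_t(values, mu0):
    """t statistic for H0: population mean equals mu0 (df = n - 1)."""
    return (mean(values) - mu0) / (stdev(values) / math.sqrt(len(values)))


def paired_t(before, after):
    """t statistic for H0: mean of the paired differences is zero (df = pairs - 1)."""
    diffs = [a - b for a, b in zip(after, before)]
    return one_sample_t(diffs, 0.0)


def pooled_two_sample_t(xs, ys):
    """t statistic for H0: two independent population means are equal (df = nx + ny - 2)."""
    nx, ny = len(xs), len(ys)
    pooled_var = ((nx - 1) * variance(xs) + (ny - 1) * variance(ys)) / (nx + ny - 2)
    return (mean(xs) - mean(ys)) / math.sqrt(pooled_var * (1 / nx + 1 / ny))


# Hypothetical data.
print(f"one-sample: t = {one_sample_t([1, 2, 3, 4, 5], 2):.3f}")
print(f"paired:     t = {paired_t([1, 2, 3], [2, 4, 3]):.3f}")
print(f"two-sample: t = {pooled_two_sample_t([1, 2, 3], [2, 3, 4]):.3f}")
```

The paired test is a one-sample test on the differences, which is why it reuses `one_sample_t` rather than comparing the two columns as independent groups.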
This document discusses statistical tests and data analysis. It defines statistical power and describes two broad classifications of data: qualitative and quantitative. It then discusses different measures of central tendency like mean, median and mode. The document outlines different types of data distribution such as normal, binomial and Poisson distributions. It compares parametric and non-parametric statistical tests and their strengths. Finally, it discusses some commonly used parametric tests like t-tests and ANOVA to compare means between two or more samples.
This document discusses various parametric statistical tests including t-tests, ANOVAs, and regression. It provides information on when to use each test and their assumptions. T-tests can be used to compare the means of two samples and determine if any differences are statistically significant. There are one-sample, two-sample independent, and paired t-tests. ANOVA is used to compare the means of three or more samples and can identify which means are significantly different. Factorial and repeated measures ANOVAs are discussed.
Experimental design cartoon part 5 sample size - Kevin Hamill
Part 5 of 5 - Experimental design lecture series. This one focuses on sample size calculations and introduces some of the commonly used statistical tests (for normally distributed data). Toward the end it covers type I and II errors, alpha/beta and reducing variability.
This document provides an overview of different types of statistical tests used for data analysis and interpretation. It discusses scales of measurement, parametric vs nonparametric tests, formulating hypotheses, types of statistical errors, establishing decision rules, and choosing the appropriate statistical test based on the number and types of variables. Key statistical tests covered include t-tests, ANOVA, chi-square tests, and correlations. Examples are provided to illustrate how to interpret and report the results of these common statistical analyses.
The document discusses parametric hypothesis testing concepts like directional vs non-directional hypotheses, p-values, critical values, and types of parametric tests including t-tests and ANOVA, and when each should be used. It provides examples of one-way and two-way ANOVA, describing how one-way ANOVA is used when groups differ on one factor and two-way ANOVA is used when groups differ on two factors. Key assumptions for parametric tests like normality and sample size are also outlined.
Assessment 3 Context You will review the theory, logic, and a.docx - galerussel59292
Assessment 3 Context
You will review the theory, logic, and application of t-tests. The t-test is a basic inferential statistic often reported in psychological research. You will discover that t-tests, as well as analysis of variance (ANOVA), compare group means on some quantitative outcome variable.
Recall that null hypothesis tests are of two types: (1) differences between group means and (2) association between variables. In both cases there is a null hypothesis and an alternative hypothesis. In the group means test, the null hypothesis is that the two groups have equal means, and the alternative hypothesis is that the two groups do not have equal means. In the association between variables type of test, the null hypothesis is that the correlation coefficient between the two variables is zero, and the alternative hypothesis is that the correlation coefficient is not zero.
Notice in each case that the hypotheses are mutually exclusive. If the null is false, the alternative must be true. The purpose of null hypothesis statistical tests is generally to show that data at least as extreme as those observed would be unlikely if the null were true (the p value is less than .05), unlikely enough that the researcher can legitimately reject the null. The reason this is done is to support the claim that the alternative hypothesis is true.
In this context you will be studying the details of the first type of test. This is the test of difference between group means. In variations on this model, the two groups can actually be the same people under different conditions, or one of the groups may be assigned a fixed theoretical value. The main idea is that two mean values are being compared. The two groups each have an average score or mean on some variable. The null hypothesis is that the difference between the means is zero. The alternative hypothesis is that the difference between the means is not zero. Notice that if the null is false, the alternative must be true. It is first instructive to consider some of the details of groups, means, and the differences between them.
Null Hypothesis Significance Test
The most common forms of the Null Hypothesis Significance Test (NHST) are three types of t tests, and the test of significance of a correlation. The NHST also extends to more complex tests, such as ANOVA, which will be discussed separately. Below, the null hypothesis and the alternative hypothesis are given for each of the following tests. It would be a valuable use of your time to commit the information below to memory. Once this is done, then when we refer to the tests later, you will have some structure to make sense of the more detailed explanations.
1. One-sample t test: The question in this test is whether a single sample group mean is significantly different from some stated or fixed theoretical value - the fixed value is called a parameter.
· Null Hypothesis: The difference between the sample group mean and the fixed value is zero in the population.
· Alternative hypothesis: T.
This document discusses statistical tests and hypotheses. It begins by introducing statistical significance and some key contributors to the field like Fisher, Student, and Gauss. It then discusses characteristics that can affect the range of estimates from samples like variability, mean, and sample size. Common tests are discussed like t-tests, z-tests, ANOVA, and nonparametric tests. Steps for performing t-tests are outlined as well as uses for paired t-tests. Formulas for t-tests are provided and differences between one-tailed and two-tailed tests are explained. Overall the document provides an overview of important statistical concepts and common tests used for hypotheses testing.
Parametric tests such as ANOVA allow researchers to compare means across multiple groups and determine if differences are statistically significant. ANOVA specifically compares variability between groups to variability within groups to assess if group means differ. If the ANOVA results in a p-value less than the significance level, it indicates that at least one group mean is significantly different from the others.
This document discusses various parametric tests used for hypothesis testing with quantitative data, including:
- One-sample t-test to compare a sample mean to a predefined value
- Two-sample t-test to compare means of two independent groups
- Paired t-test to compare means of two related/matched groups
- ANOVA tests to compare means of three or more groups, including one-way and two-way ANOVA
- Assumptions of parametric tests like normal distribution and additive effects are also outlined.
Parmetric and non parametric statistical test in clinical trails - Vinod Pagidipalli
The document discusses parametric and non-parametric statistical tests used in clinical trials. Parametric tests like the z-test, t-test, ANOVA, and correlation tests are used when data follows a normal distribution. Non-parametric tests like the chi-square test, Fisher's exact test, and binomial test are used when data cannot be assumed to be normally distributed. Several statistical tests are described, including how to apply them in clinical trials to compare treatment groups, analyze associations between variables, and test hypotheses about population proportions.
Statistical inference: Statistical Power, ANOVA, and Post Hoc tests - Eugene Yan Ziyou
This document provides an overview of statistical power, analysis of variance (ANOVA), and post hoc tests. It defines statistical power and explains how to calculate power and minimum sample size. It then describes ANOVA, comparing it to t-tests. ANOVA partitions variability between and within groups. The document interprets ANOVA tables and explains F distributions. Conditions for ANOVA and post hoc tests like Bonferroni corrections are also covered. Finally, it briefly mentions different types of ANOVA like one-way and factorial.
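The power calculation mentioned above can be sketched for the simplest case, a two-sided z test comparing two group means with a known common standard deviation. The function name and inputs are assumptions for illustration; the summarized document may calculate power differently.

```python
import math
from statistics import NormalDist


def approx_power(delta, sigma, n_per_group, alpha=0.05):
    """Power of a two-sided two-sample z test to detect a true mean difference `delta`."""
    z = NormalDist()
    z_crit = z.inv_cdf(1 - alpha / 2)
    # Expected value of the test statistic under the alternative hypothesis.
    shift = abs(delta) / (sigma * math.sqrt(2 / n_per_group))
    # Probability of landing in either rejection region.
    return z.cdf(shift - z_crit) + z.cdf(-shift - z_crit)


# Hypothetical scenario: one-SD difference with 16 observations per group.
print(f"power = {approx_power(delta=1.0, sigma=1.0, n_per_group=16):.3f}")
# With no true difference, the rejection probability equals alpha.
print(f"type I error rate = {approx_power(delta=0.0, sigma=1.0, n_per_group=16):.3f}")
```

Power is one minus the type II error rate, so the same function shows how increasing the sample size or the effect size lowers the chance of missing a real difference.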
This document discusses inferential statistics and hypothesis testing. It provides examples of researchers formulating hypotheses and collecting data to test them. Researchers take random samples from populations to test if there are meaningful differences between groups. Hypothesis testing involves comparing experimental and control groups after exposing them to different levels of an independent variable. The goal is to determine if the independent variable caused a detectable change in the dependent variable. Inferential statistics are used to test if sample means differ significantly, which would suggest the hypothesis is supported or not supported. Proper sampling and estimating sampling distributions, standard errors, and variability are important concepts for accurately testing hypotheses about populations based on sample data.
In this document, I have tried to illustrate most of the hypothesis tests, such as one-sample and two-sample tests, that I have used to analyze machine learning algorithms. I have focused on independent statistical testing.
Now the question is: why do we use statistical testing? The answer is that we use statistical testing for significance analysis of our results, which I am going to deliver
1. Statistical tests are used in fisheries science to test hypotheses and make quantitative decisions about fisheries processes. Common statistical tests include correlation tests, comparison of means tests, regression analyses, and hypothesis tests.
2. The appropriate statistical test to use depends on the research design, data distribution, and variable type. Parametric tests are used for normally distributed data, while non-parametric tests are used when assumptions are not met.
3. Accuracy of statistical tests relies on quality survey data. Both fishery-dependent and fishery-independent data are important, though confounding factors must be considered with dependent data. Proper study design and use of statistics allows prediction of fish production.
Build applications with generative AI on Google Cloud - Márton Kodok
We will explore Vertex AI Model Garden powered experiences and learn more about the integration of these generative AI APIs. We will see in action what the Gemini family of generative models offers developers for building and deploying AI-driven applications. Vertex AI includes a suite of foundation models, referred to as the PaLM and Gemini families of generative AI models, which come in different versions. We will cover how to use the API to:
- execute prompts in text and chat
- cover multimodal use cases with image prompts
- fine-tune and distill to improve knowledge domains
- run function calls with foundation models to optimize them for specific tasks
At the end of the session, developers will understand how to innovate with generative AI and develop apps using generative AI industry trends.
End-to-end pipeline agility - Berlin Buzzwords 2024 - Lars Albertsson
We describe how we achieve high change agility in data engineering by eliminating the fear of breaking downstream data pipelines through end-to-end pipeline testing, and by using schema metaprogramming to safely eliminate boilerplate involved in changes that affect whole pipelines.
A quick poll on agility in changing pipelines from end to end indicated a huge span in capabilities. For the question "How long time does it take for all downstream pipelines to be adapted to an upstream change," the median response was 6 months, but some respondents could do it in less than a day. When quantitative data engineering differences between the best and worst are measured, the span is often 100x-1000x, sometimes even more.
A long time ago, we suffered at Spotify from fear of changing pipelines due to not knowing what the impact might be downstream. We made plans for a technical solution to test pipelines end-to-end to mitigate that fear, but the effort failed for cultural reasons. We eventually solved this challenge, but in a different context. In this presentation we will describe how we test full pipelines effectively by manipulating workflow orchestration, which enables us to make changes in pipelines without fear of breaking downstream.
Making schema changes that affect many jobs also involves a lot of toil and boilerplate. Using schema-on-read mitigates some of it, but has drawbacks since it makes it more difficult to detect errors early. We will describe how we have rejected this tradeoff by applying schema metaprogramming, eliminating boilerplate but keeping the protection of static typing, thereby further improving agility to quickly modify data pipelines without fear.
This document discusses inferential statistics and hypothesis testing. It provides examples of researchers formulating hypotheses and collecting data to test them. Researchers take random samples from populations to test if there are meaningful differences between groups. Hypothesis testing involves comparing experimental and control groups after exposing them to different levels of an independent variable. The goal is to determine if the independent variable caused a detectable change in the dependent variable. Inferential statistics are used to test if sample means differ significantly, which would suggest the hypothesis is supported or not supported. Proper sampling and estimating sampling distributions, standard errors, and variability are important concepts for accurately testing hypotheses about populations based on sample data.
In this document, I have tried to illustrate most of the hypothesis testing like 1 sample,2 samples, etc, which I have covered to analyze the machine learning algorithms. I have focused on Independent statistical testing.
Now the question is why we use statistical testing? the answer is that we use statistical testing for significance analysis of our results, which I am going to deliver
1. Statistical tests are used in fisheries science to test hypotheses and make quantitative decisions about fisheries processes. Common statistical tests include correlation tests, comparison of means tests, regression analyses, and hypothesis tests.
2. The appropriate statistical test to use depends on the research design, data distribution, and variable type. Parametric tests are used for normally distributed data, while non-parametric tests are used when assumptions are not met.
3. Accuracy of statistical tests relies on quality survey data. Both fishery-dependent and fishery-independent data are important, though confounding factors must be considered with dependent data. Proper study design and use of statistics allows prediction of fish production.
2. Session #6:
Common Statistical Methods Used In
Transgenic Fish Research
M.Afifi
M.Sc., Biostatistics(Joint Supervision with ISSR, Cairo University)
Ph.D., Candidate (AVC, UPEI, Canada)
E-mail: Afifi-stat6@hotmail.com
Tel: +201060658185
5. Before gene transfer
Experimental Design: CRD, CBD
Experimental unit:
Single Fish or Fish tank,
Replicates
homogeneous, same experimental conditions
Sample Size: 3, 6, 8, 12????
6. Basic Experimental Design for Transgenic Fish Research:
1. Setting experimental questions >>>> statistical questions
2. Setting hypotheses and then statistical null hypotheses
4. Statistical consideration (treatment groups, sample size, true replication, confounding
factors etc.)
5. Sampling design (independent, random, samples)
6. Data collection & measurement (Quality Control and Quality Assurance Procedures)
7. Data analysis
–Too few data: cannot obtain reliable conclusions
–Too many data: extra effort (time and money) in data collection
11. Fish
The transgenic fish used in this experiment were produced and raised in a biosecure
facility at the DFO/UBC Centre for Aquaculture and Environmental Research
(CAER) in West Vancouver, B.C., Canada.
Due to differences in growth rate, which produces fish of large size differences at
each age, control fish used were 1 year older than transgenic fish in order to
match fish to the same developmental stage and size.
Fish were cultured in filtered, aerated, flow-through well water at approximately 10
°C prior to and during the experiment. Since the two types of salmon reach smolt
size at different times of the year (the normal time of May/June in their second year
for non-transgenic fish, and in August/September of their first year for the growth
accelerated fish).
12. After gene transfer
Qualitative:
Cold-tolerance
Salinity-tolerance
Quantitative:
Growth, reproductive traits
Mass
Biomass
Food consumed
Specific growth rate (SGR)
Protein efficiency ratio (PE)
Food conversion efficiency (EC)
Q-PCR
18. Basic rules of any statistical test
Assumptions
Hypothesis testing
19. Basic rules of hypothesis testing
Hypothesis:
• Null hypothesis, H0:
• The difference in (means, proportions, medians) is not real (non-significant),
• The difference is not due to a treatment effect but to other causes (chance, error)
• Alternative hypothesis, HA: the statement tested against H0
Test statistic: a value calculated from the data (an algebraic expression particular to the
hypothesis we are testing),
t-test >>>> t-value
F-test >>>> F-value
χ2-test >>>> χ2-value
P-value: a probability (0-1) (Sig.) attached to each value of the test statistic. It is
the probability of getting the observed effect (or one more extreme) if the null hypothesis is true.
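The definition of the P-value above can be made concrete with a minimal sketch. It uses an exact permutation test rather than a t-test, because that lets the probability be counted directly with the standard library. The function name and the body weights are hypothetical and serve only as illustration.

```python
from itertools import combinations
from statistics import mean


def permutation_p_value(group_a, group_b):
    """Exact two-sided permutation p-value for a difference in means."""
    pooled = list(group_a) + list(group_b)
    n_a, n_total = len(group_a), len(pooled)
    observed = abs(mean(group_a) - mean(group_b))
    total = sum(pooled)
    as_extreme = 0
    n_splits = 0
    # Under H0 the group labels are arbitrary, so every relabelling is equally likely.
    for idx in combinations(range(n_total), n_a):
        sum_a = sum(pooled[i] for i in idx)
        diff = abs(sum_a / n_a - (total - sum_a) / (n_total - n_a))
        n_splits += 1
        if diff >= observed - 1e-12:  # small tolerance for floating-point rounding
            as_extreme += 1
    return as_extreme / n_splits


# Hypothetical body weights (g), for illustration only.
transgenic = [5.1, 5.4, 5.8, 6.0]
control = [3.9, 4.2, 4.4, 4.6]
p_value = permutation_p_value(transgenic, control)
print(f"p = {p_value:.4f}")  # 2 of the 70 possible splits are this extreme
```

The P-value here is simply the share of relabellings that give a difference at least as large as the one observed, which is the "observed effect or one more extreme" idea in the slide.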
22. Two-sample t-test (unpaired t-test)
Compare the means in two independent groups of observations using representative
samples.
Assumptions
The two samples must be independent (unrelated)
Normality: a small departure from Normality is not crucial and leads to only a marginal loss in power
Homoscedasticity (equal variances) >>>> checked by Levene’s test
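As a rough sketch of how the t-value is obtained under the equal-variance assumption, the pooled two-sample t statistic can be computed by hand. The function name and the data are hypothetical, and the P-value lookup (which needs the t distribution) is left to statistical software.

```python
from math import sqrt
from statistics import mean, variance


def pooled_t(group_a, group_b):
    """Two-sample t statistic with a pooled (equal) variance estimate."""
    n1, n2 = len(group_a), len(group_b)
    df = n1 + n2 - 2
    # Pooled variance: weighted average of the two sample variances.
    pooled_var = ((n1 - 1) * variance(group_a) + (n2 - 1) * variance(group_b)) / df
    standard_error = sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (mean(group_a) - mean(group_b)) / standard_error, df


# Hypothetical measurements, for illustration only.
t_value, df = pooled_t([2, 4, 6], [1, 2, 3])
print(f"t({df}) = {t_value:.3f}")
```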
25. Figure 2. Growth performance in F1 transgenic and full sibling non-transgenic zebrafish. Fifty-four zebrafish of
F1 fry were randomly selected and grown individually under similar conditions. At the beginning of the
experiment, they were four weeks old. Zebrafish were weighed weekly for 6 weeks to monitor growth
performance. In the course of the experiment, fin DNA was extracted and assayed for transgene identification.
Weight of transgenic and non-transgenic full siblings was compared employing a Student t-Test (*, P < 0.05).
26. Welch's t-test
(Unequal variances t-test)
A widely used modification of the t-test that
adjusts the number of degrees of freedom when the variances are not equal to
each other.
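The degrees-of-freedom adjustment can be sketched as follows, assuming the usual Welch–Satterthwaite approximation. The function name and data are hypothetical.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(group_a, group_b):
    """Welch's t statistic and its approximate degrees of freedom."""
    n1, n2 = len(group_a), len(group_b)
    # Each group contributes its own variance; no pooling is done.
    se1_sq = variance(group_a) / n1
    se2_sq = variance(group_b) / n2
    t_value = (mean(group_a) - mean(group_b)) / sqrt(se1_sq + se2_sq)
    # Welch-Satterthwaite approximation, usually a non-integer value.
    df = (se1_sq + se2_sq) ** 2 / (se1_sq ** 2 / (n1 - 1) + se2_sq ** 2 / (n2 - 1))
    return t_value, df


# Hypothetical measurements, for illustration only.
welch_value, welch_df = welch_t([2, 4, 6], [1, 2, 3])
print(f"t = {welch_value:.3f}, df = {welch_df:.2f}")
```

With equal group sizes the t-value matches the pooled version; only the degrees of freedom shrink (here from 4 to about 2.94), which makes the test more conservative.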
27. If the sample sizes are not large and equal variances cannot be assumed,
use a non-parametric method:
the Mann–Whitney U test
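A minimal sketch of the Mann–Whitney U statistic, computed by direct pair counting (fine for small samples). The function name and data are hypothetical, and the significance lookup is left to tables or software.

```python
def mann_whitney_u(group_a, group_b):
    """Return (U, U_a, U_b), counting ties as half a win."""
    u_a = 0.0
    for x in group_a:
        for y in group_b:
            if x > y:
                u_a += 1.0
            elif x == y:
                u_a += 0.5
    u_b = len(group_a) * len(group_b) - u_a
    # The test statistic is conventionally the smaller of the two counts.
    return min(u_a, u_b), u_a, u_b


# Hypothetical measurements, for illustration only.
u_stat, u_a, u_b = mann_whitney_u([1.2, 3.4, 5.6], [2.0, 2.5, 6.0, 7.1])
print(u_stat, u_a, u_b)
```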
31. Methods of pairing:
Self-pairing: each animal used as its own control (Before and After)
Natural pairing: each pair of animals is biologically related (e.g. litter mates).
Artificial (matched) pairing: each animal is paired with an animal matched with
respect to one or more factors that affect response.
To avoid allocation bias in an experiment when there is self-pairing, each animal is
randomly allocated to receive one of the two treatments initially; it then receives
the other treatment later.
If there is natural or matched pairing, one member of the pair is randomly allocated
to one of the two treatments and the other member receives the second treatment.
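The random allocation within pairs described above might be sketched like this. The function name, the treatment labels and the fish tags are all hypothetical.

```python
import random


def allocate_pairs(pairs, treatments=("A", "B"), seed=None):
    """Randomly assign the two treatments within each pair."""
    rng = random.Random(seed)
    allocation = []
    for first, second in pairs:
        order = list(treatments)
        rng.shuffle(order)  # the random order decides who gets which treatment
        allocation.append({first: order[0], second: order[1]})
    return allocation


# Hypothetical matched pairs identified by tag, for illustration only.
pairs = [("fish01", "fish02"), ("fish03", "fish04"), ("fish05", "fish06")]
allocation = allocate_pairs(pairs, seed=1)
for pair_allocation in allocation:
    print(pair_allocation)
```

For self-pairing, the same idea applies with the two "members" being the first and second treatment periods of one animal.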
37. Suppose, for example, we have four groups. Comparing them with a two-sample
t-test for every combination of pairs of groups requires six possible t-tests.
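The count of six tests, and why running them all is a problem, can be checked with a short calculation. The family-wise error figure assumes the tests are independent, which is only an approximation for pairwise comparisons on the same data.

```python
from math import comb

n_groups = 4
alpha = 0.05

# Number of distinct pairs of groups.
n_tests = comb(n_groups, 2)

# Chance of at least one false positive across all tests,
# ASSUMING the tests are independent (an approximation).
family_wise_error = 1 - (1 - alpha) ** n_tests

print(n_tests, round(family_wise_error, 3))
```

Under that assumption the chance of at least one spurious significant result rises from 5% to roughly 26%.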
38. Principle
Total variability in a data set is partitioned into different sources of variation.
The sources of variation comprise one or more factors, each explained by the
levels or categories of that factor (e.g. the two levels, ‘male’ and ‘female’, defining
the factor ‘sex’, or three dose levels for a given drug factor), and also unexplained
or residual variation which results from uncontrolled biological variation and
technical error.
We can assess the contribution of the different factors to the total variation by
making the appropriate comparisons of these variances.
The variation is expressed by its variance
40. The analysis of variance encompasses a broad spectrum of experimental
designs ranging from the simple to the complex.
44. One-way analysis of variance
Single factor with several levels or categories where each level comprises a group
of observations.
For example, the levels may be:
Feed formula for dogs: a dry feed formula, a tinned feed and raw meat
Different treatment dose levels of a drug, one of which is a placebo representing
simply the drug vehicle, while the others are, say, 50%, 100% and 200% of the
presumed effective dose.
Consider the simple case with only one factor. There are
2 sources of variation:
Between the group means
Within the groups
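The split into between-group and within-group variation can be sketched as a small calculation of the F-value. The function name and data are hypothetical, and the P-value lookup from the F distribution is left to software.

```python
from statistics import mean


def one_way_anova(groups):
    """Return (F, df_between, df_within) for a one-way layout."""
    all_values = [x for group in groups for x in group]
    grand_mean = mean(all_values)
    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of the observations around their own group mean.
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f_value = (ss_between / df_between) / (ss_within / df_within)
    return f_value, df_between, df_within


# Hypothetical responses for three levels of one factor, for illustration only.
f_value, df_between, df_within = one_way_anova([[1, 2, 3], [2, 3, 4], [5, 6, 7]])
print(f"F({df_between}, {df_within}) = {f_value:.2f}")
```

A large F means the group means differ by more than the within-group scatter would suggest.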
45. In the experimental situation, the animals should be randomly allocated to one of
the levels of the factor, i.e. to one of the groups, in order to avoid allocation bias
46. Assumptions:
results are reliable only if the assumptions on which it is based are satisfied
samples representing the levels are independent
Observations in each sample come from a Normally distributed population with
variance σ²; this implies that the group variances are the same. Approximate
Normality may be established by drawing a histogram; moderate departures
from Normality have little effect on the result.
Constant variance, the more important assumption, may be established by
Levene’s test
49. Multiple comparisons
The more tests that we perform, the more
likely it is that we will obtain a significant P-value on the basis of chance alone.
We have to approach this problem of multiple comparisons in such a way that
we avoid spurious P-values.
With Bonferroni’s correction, adjusted p-values are simply the unadjusted p-values multiplied by the number
of possible comparisons (six in this case);
if multiplying a p-value by the number of comparisons produces a value greater
than one, the probability is given as 1.00.
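The multiply-and-cap rule described above is the Bonferroni adjustment, and can be sketched in a few lines. The function name and the p-values are hypothetical.

```python
def bonferroni_adjust(p_values, n_comparisons=None):
    """Multiply each p-value by the number of comparisons, capping at 1.0."""
    m = len(p_values) if n_comparisons is None else n_comparisons
    return [min(1.0, p * m) for p in p_values]


# Hypothetical unadjusted p-values from three of six pairwise comparisons.
adjusted = bonferroni_adjust([0.001, 0.02, 0.3], n_comparisons=6)
print([round(p, 3) for p in adjusted])
```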
50. Most Common Multiple comparisons
Least significant difference (LSD)
Duncan’s multiple range test, (DMRT)
Tukey’s (HSD)
Newman–Keuls tests,
Bonferroni’s correction
Scheffe’s
Be aware: they often produce slightly different results!
52. Example
Fig. 1. Growth rates and hormone profiles of wild-type (W),
domesticated (D), and GH transgenic (T) salmon. (A)
Specific growth rates (SGR).
(B) Plasma IGF1 levels. n = 10 per genotype.
Letters above bars denote significant differences among
groups (1-way ANOVA, P < 0.05).
Error bars represent SEM.
53. Example 2
Fig. 1. Plasma concentrations of growth hormone (A) in non-transgenic
and transgenic salmon fed full rations and in ration-restricted
transgenic coho salmon (pair fed with controls).
GH values (A) are pooled from samples (N = 23) taken on Sept.
11, 2002 and Oct. 11, 2002, which did not differ significantly.
Statistical relationships between groups are indicated by letters
where significant differences occur.
Bars are means ± SE, letters denote significant differences.
54. TABLE 2.—Sample size (n), mean body weight, mean fork length, and mean condition factor (CF) for
all fish sampled.
Different lowercase letters indicate statistically significant differences between populations (ANOVA).
The letters H, T, and N represent hatchery, transgenic, and cultured nontransgenic fish, respectively.
55. Detailed statistical analysis by
category of analyzed data and
sex of coho salmon.
Asterisks indicate statistically
significant values.
Abbreviations are as follows:
GSI = gonadosomatic index,
H = hatchery fish, T
= transgenic fish, and N =
cultured nontransgenic fish.
58. Enzyme activities were measured before (pre-diet treatment, n=8) and after a 12-week feeding trial (post-
diet treatment, n=3 replicates, n=4 fish/replicate). Differences between C and T pre-diet treatment (p<0.05)
are indicated by ⁎ on the larger value, and differences between fish (F) and diet (D) groups post-diet
treatment are indicated by differing letters (a, b, c).
62. Correlation (r)
Measures the degree of association by calculating Pearson’s product moment
correlation coefficient, usually just called the correlation coefficient or, sometimes, the
linear correlation coefficient.
It can take any value from −1 to +1.
63. Correlation (r)
(a) perfect positive association, r = +1;
(b) perfect negative association, r = −1;
(c) positive association, r = +0.86;
(d) negative association, r = −0.85;
(e) no association, r = 0;
(f) no linear association, r = 0.
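Cases (a) and (f) can be reproduced with a hand-rolled Pearson coefficient. This is a minimal sketch; the function name and the data are hypothetical.

```python
from math import sqrt
from statistics import mean


def pearson_r(x, y):
    """Pearson product moment correlation coefficient."""
    x_bar, y_bar = mean(x), mean(y)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    syy = sum((yi - y_bar) ** 2 for yi in y)
    return sxy / sqrt(sxx * syy)


# (a) perfect positive association: the points lie on a rising straight line.
r_linear = pearson_r([1, 2, 3, 4], [2, 4, 6, 8])

# (f) a clear relationship (y = x squared) that is not linear.
r_curved = pearson_r([-2, -1, 0, 1, 2], [4, 1, 0, 1, 4])

print(round(r_linear, 3), round(r_curved, 3))
```

The second case shows why r = 0 means only "no linear association": the variables are strongly related, just not along a straight line.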
64. Above hatched cells includes analysis with all groups combined (non-transgenic, full-ration
transgenic and ration-restricted transgenic fish).
Below hatched cells displays correlations for non-transgenic and full-ration transgenic fish
only.
Correlation coefficients are shown for significant correlations only.
aLiver GH correlations do not include non-transgenic fish in which expression was
65. Regression
When there is a linear relationship between two numerical variables, with a change in one variable
being associated with a change in the other, we may be interested in determining the
strength of that relationship.
Are the points in the scatter diagram close to this line or are they widely dispersed
around it? Provided a linear relationship exists between the two variables, the closer
the points are to the line, the stronger the linear association between the two variables.
66. Linear regression lines
Body composition and energy content in
relation to wet body weight of growth
enhanced transgenic
67. Linear regression lines
Atlantic salmon (open triangles) and controls (solid circles) fed to
satiation three times/day on a commercial diet.
Each data point represents a subsample of five fish. Data are
presented with fitted regression lines (solid lines) surrounded by
95% confidence intervals (dashed lines).
68. Regression coefficients for the relation between body composition and energy content per fish wet
weight of growth enhanced transgenic Atlantic salmon and controls fed to satiation three
times/day on a commercial diet: Y = b0 + b1 × BW
where ‘Y’ is absolute nutrient or energy content,
‘b0’ and ‘b1’ are regression coefficients,
‘BW’ is wet body weight
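A least-squares fit of the line Y = b0 + b1 × BW can be sketched as follows. Ordinary least squares is assumed here; the source does not state the fitting method, and the function name and data are hypothetical.

```python
from statistics import mean


def fit_line(bw, y):
    """Ordinary least-squares estimates (b0, b1) for y = b0 + b1 * bw."""
    bw_bar, y_bar = mean(bw), mean(y)
    sxx = sum((x - bw_bar) ** 2 for x in bw)
    sxy = sum((x - bw_bar) * (yi - y_bar) for x, yi in zip(bw, y))
    b1 = sxy / sxx            # slope: change in Y per unit of body weight
    b0 = y_bar - b1 * bw_bar  # intercept: the line passes through the means
    return b0, b1


# Hypothetical wet body weights (g) and nutrient content, for illustration only.
b0, b1 = fit_line([10, 20, 30, 40], [25, 45, 65, 85])
print(f"Y = {b0:.1f} + {b1:.1f} x BW")
```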
75. Statistical analysis
The genotype frequencies were calculated and HWE was tested using a
chi-square test of
The population genetic indexes including He, Ho, effective allele numbers (Ne)
and PIC were calculated by Nei’s method [25]. Generally, polymorphism
information content (PIC) is classified into the following three types: low
polymorphism (PIC value < 0.25), median polymorphism (0.25 < PIC value <
0.5) and high polymorphism (PIC value > 0.5). The LD structure measured by
D’ and r² was performed with the HAPLOVIEW software (Ver.3.32) [26].
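A chi-square test of Hardy-Weinberg equilibrium (HWE) for a single SNP with two alleles might look like the sketch below. It is not the cited study's procedure; the function name and genotype counts are hypothetical, and both alleles are assumed to be present in the sample.

```python
from math import erfc, sqrt


def hwe_chi_square(n_aa, n_ab, n_bb):
    """Chi-square statistic and p-value (1 df) for a biallelic locus."""
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)  # frequency of allele A
    q = 1 - p
    observed = (n_aa, n_ab, n_bb)
    expected = (p * p * n, 2 * p * q * n, q * q * n)  # HWE expectations
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # Upper tail of the chi-square distribution with 1 degree of freedom
    # (3 genotype classes - 1 - 1 estimated allele frequency).
    p_value = erfc(sqrt(chi2 / 2))
    return chi2, p_value


# Hypothetical genotype counts (AA, AB, BB), for illustration only.
chi2, hwe_p = hwe_chi_square(30, 40, 30)
print(f"chi2 = {chi2:.2f}, p = {hwe_p:.4f}")
```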
76. Association analyses between genotypes or haplotypes of GH gene and four growth
traits were performed using general linear model (GLM) procedure with SPSS 17.0
software (IBM, Armonk, NY, USA). We used the following statistical model:
Y = u + G + e
where Y is the phenotypic value of each trait;
u is population mean value of 4 growth traits,
G is the fixed genotype effect of each SNP, and
e is the random error effect.
Multiple comparisons between different genotypes were tested using the LSD
method with Bonferroni correction adjustment [27].
79. How would these results be reported in a scientific journal article?
86. Your formal sentence must include:
The dependent and independent variables
The exact p-value, unless it is less than .001, in which case report p < 0.001 (or p < 0.0001), never p < 0.000
The direction of the effect as evidenced by the reported means, as well as a
statement about statistical significance
The symbol of the test (t), the degrees of freedom (6), the statistical value (2.95)
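The reporting pattern above can be sketched as a small formatting helper. The function names are hypothetical, and the p-value passed in is a placeholder for illustration, not a result from the slides.

```python
def format_p(p):
    """Report the exact p-value unless it falls below 0.001."""
    if p < 0.001:
        return "p < 0.001"
    return f"p = {p:.3f}"


def report_t(t_value, df, p):
    """Test symbol, degrees of freedom, statistic, then the p-value."""
    return f"t({df}) = {t_value:.2f}, {format_p(p)}"


# The t-value and df follow the slide; the p-value is a hypothetical placeholder.
sentence = report_t(2.95, 6, 0.026)
print(sentence)
```

The surrounding sentence still has to name the dependent and independent variables and state the direction of the difference between the means.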