This document provides an introduction to biostatistics for health science students at Debre Tabor University in Ethiopia. It defines biostatistics as the application of statistical methods to medical and public health problems. The introduction outlines topics that will be covered, including defining key statistical concepts, classifying variables, and discussing the importance and limitations of biostatistics. Contact information is provided for the lecturer, Asaye Alamneh.
This document provides an introduction to biostatistics. It defines biostatistics as the application of statistical methods to biological and health sciences. Biostatistics is divided into descriptive and inferential biostatistics. Descriptive biostatistics summarizes and analyzes data, while inferential biostatistics draws conclusions about populations from samples. The document also defines key biostatistics terms and concepts, including populations, samples, parameters, statistics, and different types of data and variables.
Statistics is the study of collecting, organizing, summarizing, and interpreting data. Medical statistics applies statistical methods to medical data and research. Biostatistics specifically applies statistical methods to biological data. Statistics is essential for medical research, updating medical knowledge, data management, describing research findings, and evaluating health programs. It allows comparison of populations, risks, treatments, and more.
This document provides an introduction to basic biostatistics. It defines biostatistics as the application of statistical methods to biological and health sciences. Biostatistics has two main branches: descriptive biostatistics, which summarizes data, and inferential biostatistics, which draws conclusions about populations from samples. The document discusses different types of data, variables, and measurement scales, including nominal, ordinal, interval, and ratio scales. It also covers topics like population and sample, parameters and statistics, and the stages of a statistical investigation.
This document provides an introduction to statistics, including definitions, reasons for studying statistics, and the scope and importance of statistics. It discusses how statistics is used in fields like insurance, medicine, administration, banking, agriculture, business, and sciences. It also outlines the main functions of statistics and its branches, including theoretical, descriptive, inferential, and applied statistics. Finally, it covers topics related to data representation, including methods of presenting data through tables, graphs, and diagrams.
This document provides an overview of basic health statistics and survey methods for health extension workers. It covers key learning objectives like preparing for and undertaking data collection, compiling and interpreting health data, and preparing reports. Various health data types, variables, and measurement scales are defined. Methods for collecting, analyzing and using health statistics like rates, ratios, proportions are explained. Measures of health like fertility, mortality and morbidity are also described. The document aims to equip health extension workers with essential statistical knowledge and skills for their work.
1. The document discusses a lecture on biostatistics including topics like introduction to statistics, exploratory tools for univariate data, probabilities and distribution curves, and sampling distribution of estimates.
2. It provides examples of different types of data like qualitative vs quantitative and discrete vs continuous data. It also discusses different scales of measurement.
3. Biostatistics is defined as the application of statistical methods to biological and health-related studies and it is widely used in areas like epidemiology, public health, and clinical research.
This document provides an overview of biostatistics and medical statistics. It defines key terms like statistics, biostatistics, medical statistics, and vital statistics. It describes sources of data for vital events and the uses of biostatistics in collecting and analyzing medical data to inform decisions. The document outlines different types of data, variables, scales of measurement, and methods for data collection and presentation. It also defines important statistical measures like mean, median, mode, range, and standard deviation used to summarize central tendencies and dispersion in data.
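The summary measures named above (mean, median, mode, range, and standard deviation) can be computed with Python's standard library. This is a minimal sketch: the function name `summarize` and the sample readings are hypothetical and do not come from the summarized document.

```python
import statistics


def summarize(values):
    """Return common measures of central tendency and dispersion."""
    return {
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "mode": statistics.mode(values),
        "range": max(values) - min(values),
        # Sample standard deviation (divides by n - 1).
        "sd": statistics.stdev(values),
    }


# Hypothetical readings, used only for illustration.
readings = [4, 8, 6, 5, 3, 8, 9]
summary = summarize(readings)
for name, value in summary.items():
    print(f"{name}: {value:.2f}")
```

The sample standard deviation is used here on the assumption that the readings are a sample; `statistics.pstdev` would be the choice if they were the whole population.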
Biostatistics: Type and presentation of data (naresh gill)
The document provides an overview of different types of data and methods for presenting data. It discusses qualitative vs quantitative data, primary vs secondary data, and different ways to present data visually including bar charts, histograms, frequency polygons, scatter diagrams, line diagrams and pie charts. Guidelines are provided for tabular presentation of data to make it clear, concise and easy to understand.
This document provides an introduction to biostatistics. It defines biostatistics as the application of statistical tools and concepts to data from biological sciences and medicine. The two main branches of statistics are described as descriptive statistics, which involves organizing and summarizing sample data, and inferential statistics, which involves generalizing from samples to populations. Several key statistical concepts are also defined, including populations, samples, variables, data types, levels of measurement, and common sampling methods. The objectives are to demonstrate knowledge of these fundamental statistical terms and concepts.
This document provides an introduction to statistics and data visualization. It discusses key topics including descriptive and inferential statistics, variables and types of data, sampling techniques, organizing and graphing data, measures of central tendency and variation, and random variables. Specifically, it defines statistics as collecting, organizing, summarizing, analyzing and making decisions from data. It also outlines the main differences between descriptive statistics, which describes data, and inferential statistics, which uses samples to make estimations about populations.
This document discusses key concepts in statistics including:
- Descriptive statistics involves collecting, organizing and presenting data to describe a situation. Inferential statistics involves making inferences about populations based on samples.
- There are different types of variables (qualitative, quantitative) and levels of measurement (nominal, ordinal, interval, ratio).
- Common data collection methods include surveys conducted by telephone, mail, or in-person interviews. Random sampling and stratified sampling are techniques for selecting samples from populations.
Analysis of statistical data in health information management (Saleh Ahmed)
This document discusses analysis of statistical data in health information management. It defines key terms like statistics, descriptive statistics, inferential statistics. It describes the different types of health statistics including vital statistics, morbidity statistics, and health service statistics. It also discusses how to calculate rates like crude rates and specific rates that are important measures for analyzing health data. Finally, it covers different methods for presenting statistical data, including tables, graphs, pie charts and histograms. The overall aim is to emphasize the importance of properly collecting, analyzing and presenting health statistics for effective healthcare planning and decision making.
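As a rough illustration of the crude and specific rates mentioned above: a crude rate divides all events by the whole population, while a specific rate does the same within one subgroup (here, age groups). The function name `rate`, the age bands, and all counts below are hypothetical assumptions made for this sketch.

```python
def rate(events, population, per=1000):
    """Events per `per` people in the given population."""
    return events / population * per


# Hypothetical (deaths, mid-year population) by age group.
groups = {
    "0-14": (12, 4000),
    "15-64": (30, 10000),
    "65+": (58, 1000),
}

total_deaths = sum(deaths for deaths, _ in groups.values())
total_population = sum(pop for _, pop in groups.values())

# Crude rate: all deaths over the whole population.
crude_rate = rate(total_deaths, total_population)

# Age-specific rates: deaths in a group over that group's population.
specific_rates = {g: rate(d, p) for g, (d, p) in groups.items()}

print(f"crude: {crude_rate:.2f} per 1000")
for group, value in specific_rates.items():
    print(f"{group}: {value:.2f} per 1000")
```

In this made-up data the crude rate sits well below the rate for the oldest group, which is why specific rates are usually reported alongside the crude figure.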
The document contains an outline of the table of contents for a textbook on general statistics. It covers topics such as preliminary concepts, data collection and presentation, measures of central tendency, measures of dispersion and skewness, and permutations and combinations. Sample chapters discuss introduction to statistics, variables and data, methods of presenting data through tables, graphs and diagrams, computing the mean, median and mode, and other statistical measures.
1. Statistics is the collection, analysis, interpretation and presentation of numerical data. It has evolved from meaning information useful to the state to being a field that uses methods and techniques to analyze data and make decisions under uncertainty.
2. There are three main categories of statistics: numerical facts systematically arranged, statistics as a subject dealing with methods for analyzing data, and statistics as plural of statistic which refers to values computed from sample data.
3. Statistics is used in pharmaceutical sciences to design clinical studies, summarize and analyze collected data to answer research questions, and interpret and communicate results to regulatory agencies and scientific communities.
BIOSTATISTICS FUNDAMENTALS FOR BIOTECHNOLOGY (GauravBoruah)
1. This document discusses various methods for representing and summarizing data in statistics, including frequency distribution tables, graphical representations, and measures of central tendency and dispersion.
2. Frequency distribution tables organize raw data into a table by counting the frequency of observations in categories. Graphical representations like charts further simplify the data for analysis and interpretation.
3. Measures of central tendency and dispersion numerically summarize data by indicating typical values and the spread or variation in the data set. Together, these descriptive statistics techniques reduce large data sets into more manageable forms for analysis, interpretation, and drawing conclusions.
A frequency distribution summarizes data by organizing it into intervals and counting the frequency of observations within each interval. It presents the data distribution in a table or chart. To create one, you first collect data, identify the range of values, create intervals, count frequencies within each interval, and construct a table or chart showing the intervals and frequencies. Frequency distributions are useful for understanding central tendency, dispersion, patterns and making comparisons. They have many applications across fields like descriptive statistics, data analysis, business, economics, manufacturing, healthcare and education.
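The steps described above (find the range of values, create intervals, count the observations in each) can be sketched as follows. The function name `frequency_distribution`, the choice of equal-width intervals, and the sample ages are assumptions made for illustration only.

```python
from collections import Counter


def frequency_distribution(values, width):
    """Count values in equal-width intervals [low, low + width)."""
    if width <= 0:
        raise ValueError("width must be positive")
    # Align the first interval to a multiple of the width.
    start = (min(values) // width) * width
    counts = Counter(int((v - start) // width) for v in values)
    n_bins = max(counts) + 1
    return [
        (start + i * width, start + (i + 1) * width, counts.get(i, 0))
        for i in range(n_bins)
    ]


# Hypothetical ages, used only for illustration.
ages = [23, 25, 31, 34, 35, 41, 42, 47, 52, 58]
table = frequency_distribution(ages, 10)
for low, high, count in table:
    print(f"{low}-{high}: {count}")
```

Each interval includes its lower bound and excludes its upper bound, so a value such as 30 would be counted in 30-40 rather than 20-30.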
This document provides an introduction to quantitative methods and statistics. It defines statistics as the science of collecting, organizing, presenting, analyzing and interpreting data to assist in decision making. It outlines descriptive and inferential statistics, and describes variables, levels of measurement, characteristics of statistical data, uses of statistics, and limitations of statistics. It also discusses topics such as frequency distributions, measures of central tendency including the mean, median and mode, and measures of dispersion.
This document provides an overview of biostatistics. It defines biostatistics and discusses topics like data collection, presentation through tables and charts, measures of central tendency and dispersion, sampling, tests of significance, and applications of biostatistics in various medical fields. The document aims to introduce students to important biostatistical concepts and their use in research, clinical trials, epidemiology and other areas of medicine.
This document provides an introduction to statistics and biostatistics in healthcare. It defines statistics and biostatistics, outlines the basic steps of statistical work, and describes different types of variables and methods for collecting data. The document also discusses different types of descriptive and inferential statistics, including measures of central tendency, dispersion, frequency, t-tests, ANOVA, regression, and different types of plots/graphs. It explains how statistics is used in healthcare for areas like disease burden assessment, intervention effectiveness, cost considerations, evaluation frameworks, health care utilization, resource allocation, needs assessment, quality improvement, and product development.
This document provides an overview of biostatistics. It defines biostatistics and discusses topics like data collection, presentation through tables and charts, measures of central tendency and dispersion, sampling, tests of significance, and applications in various medical fields. The key areas covered include defining variables and parameters, common statistical terms, sources of data collection, methods of presenting data through tabulation and diagrams, analyzing data through measures like mean, median, mode, range and standard deviation, sampling and related errors, significance tests, and uses of biostatistics in areas like epidemiology and clinical trials.
1. Introduction to Statistics in Curriculum and Instruction
1.1 The definition of statistics and other related terms
1.2 Descriptive statistics
1.3 Inferential statistics
1.4 Function and significance of statistics in education
1.5 Types and levels of measurement scale
2. Introduction to SPSS Software
3. Frequency Distribution
4. Normal Curve and Standard Score
5. Confidence Interval for the Mean, Proportions, and Variances
6. Hypothesis Testing with One and Two Samples
7. Two-way Analysis of Variance
8. Correlation and Simple Linear Regression
9. CHI-SQUARE
This document provides an introduction to statistics, including defining what statistics is, the different types of variables and scales of measurement, and why statistics is important in dentistry. It discusses how statistics can be used for research, understanding medical literature, and informing clinical decision making. Descriptive statistics are used to summarize and describe data, while inferential statistics allow generalizing beyond the sample data to the overall population. Nominal, ordinal, interval, and ratio scales of measurement are explained along with examples. The importance of understanding the scale of measurement is that it determines which statistical tests can appropriately be used for analysis.
This document provides an introduction to statistics, including defining what statistics is, the different types of variables and scales of measurement, and why statistics is important in dentistry. It discusses how statistics can be used for research, understanding medical literature, and informing clinical decision making. Descriptive statistics are used to summarize and describe data, while inferential statistics allow generalizing beyond the sample data to the overall population. Nominal scales name categories, ordinal scales rank order items, interval scales have equal intervals but an arbitrary zero point, and ratio scales have a true zero point where the absence of a trait can be measured.
This document provides an overview of descriptive statistics as taught in a statistics course (STS 102) at Crescent University, Nigeria. It covers topics like statistical data collection methods, presentation of data through tables and graphs, measures of central tendency and dispersion. The key objectives of descriptive statistics are to summarize and describe characteristics of data through measures, charts and diagrams. Inferential statistics is also introduced as a way to make inferences about populations based on samples.
Statistics as a subject (field of study):
Statistics is defined as the science of collecting, organizing, presenting, analyzing, and interpreting numerical data in order to make decisions on the basis of such analysis. (Singular sense)
Statistics as numerical data:
Statistics is defined as aggregates of numerically expressed facts (figures) collected in a systematic manner for a predetermined purpose. (Plural sense) In this course, we shall be mainly concerned with statistics as a subject, that is, as a field of study.
1) Statistics is the study of collecting, organizing, analyzing, and drawing conclusions from data. It involves sampling, hypothesis testing, and using statistical tests tailored to measurement scales and hypothesis types.
2) Descriptive statistics describe and summarize data quantitatively, while inferential statistics allow generalizing from samples to populations through statistical testing and other methods.
3) The document discusses differences between statistics and statistical data, types of data, levels of measurement, sampling techniques, and uses of statistics.
This document provides an introduction to biostatistics. It defines statistics as the collection, organization, and analysis of data to draw inferences about a population from a sample. Biostatistics applies statistical methods to biological and medical data. The document discusses why biostatistics is studied, including that more aspects of medicine and public health are now quantified and that biological processes have inherent variation. It also covers types of data, methods of data collection such as questionnaires and observation, and considerations for designing questionnaires and conducting interviews.
Regression analysis can be used to analyze the relationship between variables. A scatter plot should first be created to determine if the variables have a linear relationship required for regression analysis. A regression line is fitted to best describe the linear relationship between the variables, with an R-squared value indicating how well it fits the data. Multiple regression allows for analysis of the relationship between a dependent variable and multiple independent variables and their individual contributions to explaining the variance in the dependent variable.
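A minimal sketch of fitting a regression line by ordinary least squares and reporting R-squared, the fit measure mentioned above. The function name `fit_line` and the data points are hypothetical; the summarized document does not specify an implementation, and a scatter plot of the data would still be the first check for linearity.

```python
def fit_line(xs, ys):
    """Least-squares line y = intercept + slope * x, with R-squared."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # R-squared: share of the variance in y explained by the line.
    ss_res = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    r_squared = 1 - ss_res / ss_tot
    return slope, intercept, r_squared


# Hypothetical paired observations, used only for illustration.
xs = [1, 2, 3, 4, 5]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]
slope, intercept, r_squared = fit_line(xs, ys)
print(f"slope={slope:.3f} intercept={intercept:.3f} R^2={r_squared:.3f}")
```

The sketch assumes the x values are not all identical and the y values are not all identical; otherwise the divisions by `sxx` or `ss_tot` would fail.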
This document provides an introduction to biostatistics. It defines biostatistics as the application of statistical tools and concepts to data from biological sciences and medicine. The two main branches of statistics are described as descriptive statistics, which involves organizing and summarizing sample data, and inferential statistics, which involves generalizing from samples to populations. Several key statistical concepts are also defined, including populations, samples, variables, data types, levels of measurement, and common sampling methods. The objectives are to demonstrate knowledge of these fundamental statistical terms and concepts.
This document provides an introduction to statistics and data visualization. It discusses key topics including descriptive and inferential statistics, variables and types of data, sampling techniques, organizing and graphing data, measures of central tendency and variation, and random variables. Specifically, it defines statistics as collecting, organizing, summarizing, analyzing and making decisions from data. It also outlines the main differences between descriptive statistics, which describes data, and inferential statistics, which uses samples to make estimations about populations.
This document discusses key concepts in statistics including:
- Descriptive statistics involves collecting, organizing and presenting data to describe a situation. Inferential statistics involves making inferences about populations based on samples.
- There are different types of variables (qualitative, quantitative) and levels of measurement (nominal, ordinal, interval, ratio).
- Common data collection methods include surveys conducted by telephone, mail, or in-person interviews. Random sampling and stratified sampling are techniques for selecting samples from populations.
Analysis of statistical data in heath information managementSaleh Ahmed
This document discusses analysis of statistical data in health information management. It defines key terms like statistics, descriptive statistics, inferential statistics. It describes the different types of health statistics including vital statistics, morbidity statistics, and health service statistics. It also discusses how to calculate rates like crude rates and specific rates that are important measures for analyzing health data. Finally, it covers different methods for presenting statistical data, including tables, graphs, pie charts and histograms. The overall aim is to emphasize the importance of properly collecting, analyzing and presenting health statistics for effective healthcare planning and decision making.
The document contains an outline of the table of contents for a textbook on general statistics. It covers topics such as preliminary concepts, data collection and presentation, measures of central tendency, measures of dispersion and skewness, and permutations and combinations. Sample chapters discuss introduction to statistics, variables and data, methods of presenting data through tables, graphs and diagrams, computing the mean, median and mode, and other statistical measures.
1. Statistics is the collection, analysis, interpretation and presentation of numerical data. It has evolved from meaning information useful to the state to being a field that uses methods and techniques to analyze data and make decisions under uncertainty.
2. There are three main categories of statistics: numerical facts systematically arranged, statistics as a subject dealing with methods for analyzing data, and statistics as plural of statistic which refers to values computed from sample data.
3. Statistics is used in pharmaceutical sciences to design clinical studies, summarize and analyze collected data to answer research questions, and interpret and communicate results to regulatory agencies and scientific communities.
BIOSTATISTICS FUNDAMENTALS FOR BIOTECHNOLOGYGauravBoruah
1. This document discusses various methods for representing and summarizing data in statistics, including frequency distribution tables, graphical representations, and measures of central tendency and dispersion.
2. Frequency distribution tables organize raw data into a table by counting the frequency of observations in categories. Graphical representations like charts further simplify the data for analysis and interpretation.
3. Measures of central tendency and dispersion numerically summarize data by indicating typical values and the spread or variation in the data set. Together, these descriptive statistics techniques reduce large data sets into more manageable forms for analysis, interpretation, and drawing conclusions.
A frequency distribution summarizes data by organizing it into intervals and counting the frequency of observations within each interval. It presents the data distribution in a table or chart. To create one, you first collect data, identify the range of values, create intervals, count frequencies within each interval, and construct a table or chart showing the intervals and frequencies. Frequency distributions are useful for understanding central tendency, dispersion, patterns and making comparisons. They have many applications across fields like descriptive statistics, data analysis, business, economics, manufacturing, healthcare and education.
This document provides an introduction to quantitative methods and statistics. It defines statistics as the science of collecting, organizing, presenting, analyzing and interpreting data to assist in decision making. It outlines descriptive and inferential statistics, and describes variables, levels of measurement, characteristics of statistical data, uses of statistics, and limitations of statistics. It also discusses topics such as frequency distributions, measures of central tendency including the mean, median and mode, and measures of dispersion.
This document provides an overview of biostatistics. It defines biostatistics and discusses topics like data collection, presentation through tables and charts, measures of central tendency and dispersion, sampling, tests of significance, and applications of biostatistics in various medical fields. The document aims to introduce students to important biostatistical concepts and their use in research, clinical trials, epidemiology and other areas of medicine.
This document provides an introduction to statistics and biostatistics in healthcare. It defines statistics and biostatistics, outlines the basic steps of statistical work, and describes different types of variables and methods for collecting data. The document also discusses different types of descriptive and inferential statistics, including measures of central tendency, dispersion, frequency, t-tests, ANOVA, regression, and different types of plots/graphs. It explains how statistics is used in healthcare for areas like disease burden assessment, intervention effectiveness, cost considerations, evaluation frameworks, health care utilization, resource allocation, needs assessment, quality improvement, and product development.
This document provides an overview of biostatistics. It defines biostatistics and discusses topics like data collection, presentation through tables and charts, measures of central tendency and dispersion, sampling, tests of significance, and applications in various medical fields. The key areas covered include defining variables and parameters, common statistical terms, sources of data collection, methods of presenting data through tabulation and diagrams, analyzing data through measures like mean, median, mode, range and standard deviation, sampling and related errors, significance tests, and uses of biostatistics in areas like epidemiology and clinical trials.
1. Introduction to statistics in curriculum and Instruction
1 The definition of statistics and other related terms
1.2 Descriptive statistics
3 Inferential statistics
1.4 Function and significance of statistics in education
5 Types and levels of measurement scale
2. Introduction to SPSS Software
3. Frequency Distribution
4. Normal Curve and Standard Score
5. Confidence Interval for the Mean, Proportions, and Variances
6. Hypothesis Testing with One and two Sample
7. Two-way Analysis of Variance
8. Correlation and Simple Linear Regression
9. CHI-SQUARE
This document provides an introduction to statistics, including defining what statistics is, the different types of variables and scales of measurement, and why statistics is important in dentistry. It discusses how statistics can be used for research, understanding medical literature, and informing clinical decision making. Descriptive statistics are used to summarize and describe data, while inferential statistics allow generalizing beyond the sample data to the overall population. Nominal, ordinal, interval, and ratio scales of measurement are explained along with examples. The importance of understanding the scale of measurement is that it determines which statistical tests can appropriately be used for analysis.
This document provides an introduction to statistics, including defining what statistics is, the different types of variables and scales of measurement, and why statistics is important in dentistry. It discusses how statistics can be used for research, understanding medical literature, and informing clinical decision making. Descriptive statistics are used to summarize and describe data, while inferential statistics allow generalizing beyond the sample data to the overall population. Nominal scales name categories, ordinal scales rank order items, interval scales have equal intervals but an arbitrary zero point, and ratio scales have a true zero point where the absence of a trait can be measured.
This document provides an overview of descriptive statistics as taught in a statistics course (STS 102) at Crescent University, Nigeria. It covers topics like statistical data collection methods, presentation of data through tables and graphs, measures of central tendency and dispersion. The key objectives of descriptive statistics are to summarize and describe characteristics of data through measures, charts and diagrams. Inferential statistics is also introduced as a way to make inferences about populations based on samples.
Statistics as a subject (field of study):
Statistics is defined as the science of collecting, organizing, presenting, analyzing and interpreting numerical data to make decisions on the basis of such analysis. (Singular sense)
Statistics as numerical data:
Statistics is defined as aggregates of numerically expressed facts (figures) collected in a systematic manner for a predetermined purpose. (Plural sense) In this course, we shall be mainly concerned with statistics as a subject, that is, as a field of study.
1) Statistics is the study of collecting, organizing, analyzing, and drawing conclusions from data. It involves sampling, hypothesis testing, and using statistical tests tailored to measurement scales and hypothesis types.
2) Descriptive statistics describe and summarize data quantitatively, while inferential statistics allow generalizing from samples to populations through statistical testing and other methods.
3) The document discusses differences between statistics and statistical data, types of data, levels of measurement, sampling techniques, and uses of statistics.
This document provides an introduction to biostatistics. It defines statistics as the collection, organization, and analysis of data to draw inferences about a population from a sample. Biostatistics applies statistical methods to biological and medical data. The document discusses why biostatistics is studied, including that more aspects of medicine and public health are now quantified and that biological processes have inherent variation. It also covers types of data, methods of data collection like questionnaires and observation, and considerations for designing questionnaires and conducting interviews.
Regression analysis can be used to analyze the relationship between variables. A scatter plot should first be created to determine if the variables have a linear relationship required for regression analysis. A regression line is fitted to best describe the linear relationship between the variables, with an R-squared value indicating how well it fits the data. Multiple regression allows for analysis of the relationship between a dependent variable and multiple independent variables and their individual contributions to explaining the variance in the dependent variable.
The document provides a summary of topics related to conditional probability, Bayes' theorem, and independent events. It includes examples and formulas for conditional probability, multiplication rule of probability, total probability rule, Bayes' rule, and independent events. It also discusses pairwise and mutually independent events. The document concludes with examples demonstrating applications of conditional probability, Bayes' theorem, multiplication rule, total probability rule, and independent events.
This document discusses statistical inference concepts including parameter estimation, hypothesis testing, sampling distributions, and confidence intervals. It provides examples of how to calculate point estimates, construct sampling distributions for sample means and proportions, and determine confidence intervals for population parameters using normal and t-distributions. The key concepts of statistical inference covered include parameter vs statistic, point vs interval estimation, properties of sampling distributions, and the components and calculation of confidence intervals.
The standard normal distribution, also called the Z-distribution, is a normal distribution with a mean of 0 and standard deviation of 1. To convert a random variable X with mean μ and standard deviation σ to the standard normal form Z, we calculate (X - μ)/σ. The normal distribution is widely used in statistics because many sampling distributions and transformations of variables tend toward normality for large samples. It also finds applications in approximating other distributions and in statistical quality control.
The document discusses three probability applications: 1) The probability of having 0, 1, 2, etc. boys before the first girl for a couple planning children. 2) The probability that the first, second, etc. anti-depressant drug tried is effective for a newly diagnosed patient, given a 60% effectiveness rate. 3) The expected number of donors that need to be tested to find a matching kidney donor for transplant, given a 10% probability of a random donor being a match.
The chi-square distribution is related to the normal distribution, as it is the distribution of the sum of squared normal random variables. The F distribution is the ratio of two chi-square random variables, each divided by its degrees of freedom. Both the chi-square and F distributions are used to test hypotheses about variances and compare variance estimates. To test if two samples have equal variances, the F test compares the ratio of the two sample variance estimates to the critical values of the F distribution with the degrees of freedom of each sample.
This document discusses measures of central tendency. It defines measures of central tendency as summary statistics that represent the center point of a distribution. The three main measures discussed are the mean, median, and mode. The mean is the sum of all values divided by the total number of values. There are different types of means including the arithmetic mean, weighted mean, and geometric mean. The document provides formulas for calculating each type of mean and discusses their properties and applications.
This document provides an introduction to the course "Design and Analysis of Clinical Trials". It discusses how clinical research uses statistics to investigate medical treatments and assess benefits of therapies. Statistics allow for reasonable inferences from collected data despite variability in patient responses. The course covers fundamental concepts of clinical trial design and analysis including phases of trials, randomization, sample size, treatment allocation, and ethical considerations. It aims to teach students how to generalize trial results to populations and combine empirical evidence with medical theory using statistical methods.
This document provides an overview of the R programming language and environment. It discusses why R is useful, outlines its interface and workspace, describes how to access help and tutorials, install packages, and input/output data. The interactive nature of R is highlighted, where results from one function can be used as input for another.
This document outlines a study on jointly modeling multivariate longitudinal measures of hypertension (blood pressure and pulse rate) and time to develop cardiovascular disease among hypertensive outpatients in Ethiopia. The study aims to identify factors affecting changes in blood pressure and pulse rate over time as well as time to develop cardiovascular complications. The study will collect longitudinal data on blood pressure, pulse rate and cardiovascular events from 178 hypertensive patients and analyze it using joint longitudinal-survival models. Preliminary results show changes in blood pressure and pulse rate over time differ between patients who did and did not develop cardiovascular events. Key factors like diabetes, family history of hypertension and clinical stage of hypertension affect both longitudinal outcomes and survival.
This document provides an introduction to common statistical terms and concepts used in biostatistics. It defines key terms like data, variables, independent and dependent variables. It also discusses populations and samples, and how random samples and random assignment are used in research. The document outlines descriptive statistics and different levels of measurement. It also explains concepts like measures of central tendency, frequency distributions, normal distributions, and skewed distributions. Finally, it discusses properties of normal curves and what the standard deviation represents.
1. intro_biostatistics.pptx
1. Debre Tabor University is new and different
7/4/2023
Asaye.A
1
Debre Tabor University
College of Health Science
Social and Public Health
Biostatistics Course for Health Science Students
Debre Tabor, Ethiopia
2. Contact detail
Asaye Alamneh (Lecturer of Biostatistics at DTU)
Debre Tabor University
College of Health Science
Department: Social and Public Health
Qualifications:
BSc in Statistics, MPH in Biostatistics
Contacts:
Email: asaye2127stat@gmail.com
Location: Debre Tabor University
4. Outlines of presentation
Definition of statistics and biostatistics
Basic statistical concepts
Classification of statistics
Types of variables
Application and limitation of biostatistics
5. Objectives
After completing this chapter, the student will be able to:
Define Statistics and Biostatistics
List some basic terms
Define and identify the different types of data and
understand why we need to classify variables
Describe the importance and limitations of statistics
Identify sources of data
6. Definition of statistics
The word statistics comes from the Latin “status”, which refers to a
political state or government.
Statistics can be defined in two ways: in the plural sense and in the
singular sense.
1. Plural sense: statistics are aggregates of facts and figures that are
expressed in numerical form.
For example: statistics on industrial production, population growth
in the country in different years, etc.
7. Definition of statistics
2. Singular sense: statistics refers to the science of collection,
organization, presentation, analysis, and interpretation of numerical
data.
It makes data simple and easy for the entire population to
understand.
It helps us to use numbers to communicate ideas.
For example: a study of the distribution of weights of the health
science students at DTU.
8. Biostatistics
Biostatistics: the application of statistical methods to medical, biological and
public health related problems.
When the data being analyzed are derived from the biological sciences and
medicine, we use the term biostatistics. It helps to manage medical uncertainties.
9. Biostatistics….
It is the scientific treatment given to medical data derived from a
group of individuals or patients. It involves:
Collection of data.
Presentation of the collected data.
Analysis and interpretation of the results.
Making decisions on the basis of such analysis.
10. Types of biostatistics
Based on how the data can be used, biostatistics can be classified into
two main categories.
1. Descriptive statistics:
Ways of collecting, organizing, summarizing, and presenting the data at
hand in a concise manner to get an impression of the data.
Used to organize and describe the sample/population and to simplify large
amounts of data in sensible ways.
It also shows the final results in the form of tables and graphs.
11. Types……..
2. Inferential statistics: methods for using sample data to make general
conclusions (inferences) about populations.
The conclusions apply to the population, which goes beyond the available data.
For example:
Probability distribution,
Estimation,
Confidence interval,
Hypothesis testing,
Regression analysis, etc.
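The difference between the two branches can be sketched in Python. This is only an illustration: the weights are invented values, not data from the course, and the 95% interval uses a normal approximation (z of about 1.96), whereas a t-based interval would normally be preferred for a sample this small.

```python
import math
import statistics

# Hypothetical sample: weights (kg) of 10 students; values are invented for illustration.
weights = [52.0, 61.5, 58.0, 70.2, 65.4, 55.1, 63.3, 59.8, 68.0, 57.7]

# Descriptive statistics: summarize the data at hand.
sample_mean = statistics.mean(weights)
sample_sd = statistics.stdev(weights)  # sample SD (n - 1 denominator)

# Inferential statistics: use the sample to say something about the population mean.
z = statistics.NormalDist().inv_cdf(0.975)  # about 1.96 (normal approximation)
standard_error = sample_sd / math.sqrt(len(weights))
ci_low = sample_mean - z * standard_error
ci_high = sample_mean + z * standard_error

print(f"mean = {sample_mean:.2f} kg, SD = {sample_sd:.2f} kg")
print(f"approximate 95% CI for the population mean: ({ci_low:.2f}, {ci_high:.2f})")
```

The first two quantities only describe the ten observed weights; the interval is a statement about the unobserved population mean, which is what makes it inferential.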
13. Stages of statistical investigation
A) Collection of data: measuring or gathering numerical data.
B) Organization of data: organizing and classifying the collected data.
C) Presentation of data: overview of the data in the form of tables, graphs
and charts.
D) Analysis of data: extracting relevant information from the
summarized data.
E) Interpretation of data: making generalizations to the target
population.
14. Definitions of some basic terms
Population: A large group possessing a given characteristic or set of
characteristics.
A population may be finite or infinite.
Parameter: a characteristic obtained from the population; a single
value that summarizes the population.
Example: population mean (μ), population standard deviation (σ)
Statistic: a characteristic obtained from a sample.
Example: sample mean, mode, median, standard deviation (SD), variance, etc.
15. Cont….
Sampling: The technique of selecting a sample from the entire population
Sample: A subset of the population selected by some sampling
technique
Census: Complete enumeration of the population
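The terms population, parameter, sample and statistic can be illustrated with a short Python sketch. The population here is simulated (5,000 invented blood pressure values), and simple random sampling is assumed as the sampling technique; none of these numbers come from the course.

```python
import random
import statistics

random.seed(1)  # fixed seed so the sketch is repeatable

# Hypothetical finite population: systolic blood pressure (mmHg) of 5,000 people.
population = [random.gauss(120, 15) for _ in range(5000)]

# Parameters describe the population (known exactly only if a census is taken).
mu = statistics.fmean(population)
sigma = statistics.pstdev(population)  # population SD (n denominator)

# Sampling: simple random sampling without replacement.
sample = random.sample(population, 50)

# Statistics describe the sample and serve as estimates of the parameters.
x_bar = statistics.fmean(sample)
s = statistics.stdev(sample)  # sample SD (n - 1 denominator)

print(f"parameter mu = {mu:.1f}, statistic x_bar = {x_bar:.1f}")
print(f"parameter sigma = {sigma:.1f}, statistic s = {s:.1f}")
```

The statistics are close to, but not equal to, the parameters; that gap is what inferential methods try to quantify.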
16. Cont….
Data: raw, unorganized facts that need to be processed.
When data are processed, organized, structured or presented in a given context
so as to make them useful, they are called information.
Figure 1: Relation between data and information
18. Variable
Variable: a characteristic which takes different values in different
persons, places or things, or any aspect of a population unit that is
measured or recorded.
e.g. height, weight, marital status, etc.
Random variables: variables whose values are determined by
chance.
Data: sets of values of one or more variables.
They are numbers which can be measurements or can be obtained by
counting.
Data set: a collection of observations on a variable.
20. Types of Variables (1)
Depending on the characteristic of the measurement, variables can
be classified into two types.
1. Quantitative (numerical) variable: one that can be measured
and expressed quantitatively or numerically.
It is the result of measuring or counting attributes of the population.
Quantitative variables are also subdivided into two types:
A. Discrete variable
B. Continuous variable
21. Types …..
A. Discrete variable:
A variable whose values are countable and are assigned whole
numbers.
It takes no decimal values.
E.g. the number of daily hospital admissions, the number of live
births per 1000 women, the number of motor vehicle accidents in Debre
Tabor town.
22. Types …..
B. Continuous variable: one that does not have gaps or interruptions.
A variable that can assume any value, including decimals, over a certain
interval.
For example;
Serum cholesterol level of a patient,
Weight,
Age,
Laboratory result,
Time, Arm circumference
23. Types …..
2. Qualitative (categorical) variable: it cannot be quantified or
measured numerically, but is measured by assigning names to
items (events).
E.g. sex, marital status, race or ethnic group, occupational status,
eye color, etc.
A. Nominal variables: variables with no inherent order or ranking
sequence.
B. Ordinal variables: variables with an ordered series.
24. Types of Variables (2)
Dependent variable: the outcome of interest, which should change in
response to some intervention.
Sometimes called the outcome or response variable.
Independent variable: the intervention, or what is being
manipulated.
A variable that you believe might influence your outcome
measure.
An independent variable is a hypothesized cause of, or influence on, a
dependent variable.
25. Type of scales of measurement
Based on the nature of the variable, variables can be measured
at four different levels of measurement.
Measurement is defined as the assignment of numbers, symbols
and/or names to objects or events.
26. Type of scales….
Each scale of measurement has certain properties which in turn
determine the appropriateness for use of certain statistical analyses.
The scale assigned to data depends on three properties of
measurement: order, distance and a fixed (true) zero.
The four scales/levels of measurement are nominal, ordinal, interval
and ratio.
27. 1. Nominal scale
It is the lowest level of measurement.
It simply consists of "naming" observations or classifying them into various
mutually exclusive, all-inclusive categories in which no order or
ranking can be imposed on the data.
When numbers are assigned to categories, they are used only for coding
purposes and do not provide a sense of size.
28. Cont…
No arithmetic or relational (greater than, less than) operations can be applied.
Nominal measurements have none of the three properties (order, distance, true zero).
For example;
Sex of a person (M, F),
Eye color (e.g. brown, blue),
Religion (Muslim, Christian),
Place of residence (urban, rural),
Race (e.g. black, white).
29. 2. Ordinal scale
Level of measurement which classifies data into categories that
can be ranked.
Relational operations of greater than and less than are applicable,
but the real differences between the ranks cannot be determined.
30. Cont…
Examples:
Socio-economic status (very low, low, medium, high, very high)
Patient status (unimproved, improved, much improved),
Height of patients (very short, short, tall, very tall),
Blood pressure (very low, low, high, very high),
Job satisfaction level (highly dissatisfied, dissatisfied, satisfied,
highly satisfied), etc
31. 3. Interval Scale
It is possible to rank or order and tell the real distance between any two
measurements.
However, there is no meaningful zero, so ratios are meaningless.
All arithmetic operations except division are applicable.
Relational operations are also possible.
32. Cont…
The selected zero point is not necessarily a true zero; it does not
have to indicate a total absence of the quantity being measured.
Note that zero degrees Celsius is arbitrary, so it does not make sense to
say that 20 degrees Celsius is twice as hot as 10 degrees Celsius.
Examples:
Body temperature in °F or °C, time of the day, days of the year, test
score, IQ…
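The point about ratios on an interval scale can be checked with a small worked example in Python. The Kelvin conversion is not part of the slides; it is brought in here only because Kelvin has a true zero, so ratios on that scale are meaningful.

```python
def celsius_to_kelvin(t_celsius: float) -> float:
    """Convert Celsius to Kelvin, a temperature scale with a true zero."""
    return t_celsius + 273.15

a, b = 20.0, 10.0  # the two temperatures from the example, in degrees Celsius

difference = a - b   # meaningful on an interval scale: 10 degrees
naive_ratio = a / b  # 2.0, but this depends entirely on the arbitrary zero
kelvin_ratio = celsius_to_kelvin(a) / celsius_to_kelvin(b)  # about 1.035

print(difference, naive_ratio, round(kelvin_ratio, 3))
```

The difference of 10 degrees is the same on both scales, but the ratio changes from 2.0 to about 1.035 once the zero is no longer arbitrary, which is why ratios of Celsius values carry no physical meaning.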
33. 4. Ratio scale
It is the highest level of measurement.
It classifies data that can be ranked, differences are meaningful, and
there is a true zero. True ratios exist between the different units of
measure.
There is always a true zero point, which shows the absence of the
quantity being measured.
All arithmetic and relational operations are applicable.
Example: volume, height, weight, length, number of items, etc.
35. Summary of levels of measurement
Level of measurement | Put data in categories | Arrange data in order | Subtract data values | Determine if one data value is a multiple of another
Nominal              | Yes                    | No                    | No                   | No
Ordinal              | Yes                    | Yes                   | No                   | No
Interval             | Yes                    | Yes                   | Yes                  | No
Ratio                | Yes                    | Yes                   | Yes                  | Yes
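The summary of the four levels can also be written as a small lookup in Python. This is a sketch only: the names `Scale`, `MINIMUM_SCALE` and `is_allowed` are hypothetical, chosen for illustration, and the operation labels are shorthand for the four columns of the summary.

```python
from enum import IntEnum

class Scale(IntEnum):
    """Levels of measurement, ordered from lowest to highest."""
    NOMINAL = 1
    ORDINAL = 2
    INTERVAL = 3
    RATIO = 4

# Lowest scale at which each operation becomes meaningful.
MINIMUM_SCALE = {
    "categorize": Scale.NOMINAL,  # put data in categories
    "order": Scale.ORDINAL,       # arrange data in order
    "subtract": Scale.INTERVAL,   # subtract data values
    "ratio": Scale.RATIO,         # one value is a multiple of another
}

def is_allowed(operation: str, scale: Scale) -> bool:
    """Return True if the operation is meaningful for data on the given scale."""
    return scale >= MINIMUM_SCALE[operation]

for scale in Scale:
    allowed = [op for op in MINIMUM_SCALE if is_allowed(op, scale)]
    print(f"{scale.name:<8} {', '.join(allowed)}")
```

Because each level keeps every property of the levels below it, a single ordered comparison is enough to decide whether an operation is appropriate.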
37. Why we need Biostatistics?
The main theory of statistics lies in the term variability.
1. Handling variation.
a) Biological variation: variation among individuals as well as within
individuals over time.
For example: height, weight, blood pressure, etc.
b) Sampling variation: biomedical research projects are usually carried out on
small numbers of study subjects.
We can also have instrumental variability and observer variability.
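Sampling variation can be made concrete with a short simulation in Python. The population of heights below is invented, and simple random sampling is assumed; the sketch only shows that different samples from the same population give different means, and that the means vary less when the samples are larger.

```python
import random
import statistics

random.seed(7)  # fixed seed so the sketch is repeatable

# Hypothetical population of heights (cm); the mean and SD are invented.
population = [random.gauss(165, 8) for _ in range(10_000)]

def sample_means(n: int, repeats: int = 200) -> list[float]:
    """Draw `repeats` simple random samples of size n and return their means."""
    return [statistics.fmean(random.sample(population, n)) for _ in range(repeats)]

for n in (10, 100):
    means = sample_means(n)
    spread = statistics.stdev(means)
    print(f"n = {n:>3}: sample means vary with an SD of about {spread:.2f} cm")
```

With small numbers of study subjects the sample mean can land noticeably away from the population mean, which is one reason statistical methods are needed to judge how far a result can be trusted.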
38. Why we need Biostatistics?
2. Essential for scientific methods of investigation.
Formulate a hypothesis.
Design a study to test the hypothesis objectively.
Collect reliable and unbiased data.
Process and evaluate the data rigorously.
Interpret the results and draw appropriate conclusions.
These statistical methods are designed to contribute to the process of
making scientific judgment in the face of uncertainties and variation.
39. Why we need Biostatistics?
It helps the researcher to arrive at a scientific judgment about a
hypothesis.
It studies the association between two or more attributes.
It is used to evaluate the efficacy of drugs.
It is used to determine the success or failure of a health care program.
It is used to define and measure the extent of disease.
Statistical methods help us to understand public health issues and
disease, and to quantify the uncertainties present in the basic medical
sciences.
40. Limitations of statistics
It deals only with those subjects of investigation that are capable of being
quantitatively measured and numerically expressed.
It deals only with aggregates of facts and gives no importance to individual items.
Statistical data are only approximately, not mathematically, correct.
Statistics can be easily misused and therefore should be used by experts.
41. Sources of data
There are two basic sources of statistical data:
1. Primary data: first-hand data collected directly from the items or
individual respondents by the researcher, primarily for the purpose of
a certain study.
42. Primary data…
The major primary sources of data are :-
Surveys,
Surveillance,
Census,
Observation and,
Experimental studies.
43. Secondary data
2. Secondary data: data that were collected and statistically treated by other people or
agencies, and whose information is then used for another purpose.
For example: hospital records, magazines, CSA, DHS, and vital statistics:
Birth reports,
Death reports,
Epidemic reports
Reports of laboratory utilization (including laboratory test results)
44. Exercises
For each of the following variables, indicate whether it is quantitative or
qualitative and specify its measurement scale:
1. Blood Pressure (mmHg)
2. Cholesterol (mmol/l)
3. Diabetes (Yes/No)
4. Body Mass Index (kg/m²)
5. Age (years)
6. Employment (paid work/retired/housewife)
7. Smoking Status (smokers/non-smokers/ex-smokers)
8. Exercise (hours per week)
9. Drink alcohol (units per week)
10. Level of pain (mild/moderate/severe)
46. Methods of data collection and presentation….
At the end of this chapter, you should be able to:
– Understand methods of data collection
– Identify methods of data presentation
Tabular Presentation
Diagrammatic presentation
Graphic presentation
47. 1. Methods of data collection
Data collection techniques allow us to systematically collect
data about our objects of study (people, objects, and
phenomena) and about the setting in which they occur.
Data can be obtained in a variety of ways. One of the most
common is through the use of surveys.
Surveys can be done by using a variety of data collection
methods.
49. Methods……
1. Observation: a technique that involves systematically
selecting, watching and recording the behaviors of people or other
phenomena for the purpose of gaining specified
information.
It includes all methods from simple visual observations to the
use of high level machines and measurements, sophisticated
equipment or facilities
50. Methods……
Advantage: it gives relatively more accurate data on behavior and
activities.
Disadvantages:
Investigator's (observer's) bias.
It requires more resources, and
skilled human power when high-level machines are used.
52. Methods…
I. Direct personal interview
The investigator presents himself/herself personally before the
informant and questions him/her personally.
Best suited to situations where problems are not completely
understood, where questions cannot be formulated beforehand,
and where one question leads to another.
Disadvantage
It is time consuming.
It is not suited to large groups of informants.
53. Methods…
II. Interviewing using a questionnaire
One drafts a detailed questionnaire.
The investigator appoints agents known as enumerators, who go to the respondents
personally with the questionnaire,
ask them the questions given therein, and
record their replies.
The interviews can be
face-to-face or
by telephone.
54. Methods…
Face-to-face interviews
Advantage
The interviewer knows exactly who is responding to the questionnaire.
The interviewer can help the respondent if he/she has difficulty in
understanding the questions e.g. language, concentration.
There is more flexibility in presenting the items; they can range from
closed to open.
Observations can be made as well.
55. Methods…
Disadvantage
Untrained interviewer may distort the meaning of questions.
Attributes of the interviewer may affect the responses given due to
bias of the interviewer and his/ her social or ethnic characteristics.
More cost in terms of time and money (training and salary of
interviewers).
56. Methods…
Telephone interviews
Advantage
Less expensive in time and money compared with face to face interviews
The interviewer is able to help the respondent if he/she doesn’t understand
the question
Broad representative samples can be obtained for those who have
telephone lines
May ensure uniformity if the interviewer is the same.
58. 3. Questionnaires
Self-administered questionnaires: the respondents read the questions and
fill in the answers by themselves.
Advantage
It is simpler and cheaper.
Can be administered to many persons simultaneously (e.g. to a class
of students).
Disadvantage
They demand a certain level of education and skill on the part of the
respondents.
59. Methods…
Postal questionnaire
The questionnaires are sent by post to the informants together with
a polite covering letter that:
Explains the detailed information required,
Explains the aims and objectives of collecting the information, and
Requests the respondents to cooperate by furnishing the
correct replies and returning the questionnaire duly filled in.
The return postage expenses are usually covered by the investigator.
60. Methods…
The main problems with postal questionnaires are:
Response rates tend to be relatively low, and
There may be under-representation of less literate subjects.
61. Methods…
Mailed Questionnaire
The questionnaire is mailed to respondents to be filled.
Sometimes known as self-enumeration.
Advantage
Cheap
No need for trained interviewers.
No interviewer bias.
They can be coordinated from one central location.
62. Methods…
Disadvantage
Low response rate.
Uncompleted questionnaires due to omission or invalid
response.
No assurance that the questionnaire was answered by the right
person.
Needs intense follow up to get a high response rate.
63. Methods……
3. Extraction of data from records
Clinical and other personal records, death certificates, published mortality
statistics, census publications, etc.
Examples;
1. Official publications of Central Statistical Authority
2. Publication of Ministry of Health and Other Ministries
3. Newspapers and journals.
64. Methods…
4. International Publications like Publications by WHO, World Bank,
UNICEF.
5. Records of hospitals or any Health Institutions.
Although data from documents are less time consuming to collect and
relatively low in cost, take care over the quality and
completeness of the data.
65. Problems in gathering data
Common problems might include:
Language barriers
Lack of adequate time
Expense
Inadequately trained and experienced staff
Bias
Cultural norms
66. Choosing method of data collection
To choose a better data collection method, we have to focus on the
relevance, timeliness, accuracy and usability of the information.
Some methods pay attention to timeliness and reduction in cost.
Others pay attention to accuracy and the strength of the method
in using scientific approaches.
67. Cont…
The selection of the method of data collection is also based on
practical considerations, such as:
The need for personnel, skills, equipment, etc. in relation to what
is available.
The acceptability of the procedures to the subjects.
The probability that the method will provide a good coverage.
i.e. will supply the required information about all or almost all
members of the population.
68. Types of questions
Before looking at the steps in questionnaire design, we need to review
the types of questions.
There are two types of questions
1. Open ended (free-response)
2. Close ended (restricted choice)
1. Open ended
e.g. In your opinion, what is the biggest barrier to getting patients to your
hospital's ANC unit?
69. Types……
Advantages- it stimulates free thought in the respondent
Helpful for obtaining information on sensitive issues
Disadvantages- there may be a problem of recalling answers
It is not suitable for mailed questionnaires
Answers are difficult to code for statistical analysis
Poor handwriting can be a problem
70. Types……
2. Close ended- provides fixed answers
e.g. Including your present visit, how many times did you visit this
hospital in the past two years?
A. Once B. Twice C. 3 times D. 4 times E. More than 4 times
Advantage- suitable for many forms of statistical analysis
Not difficult to code
Disadvantage- limits a variety of details
71. Types……
Partially open ended question
Advantage- provides alternatives if certain options are overlooked
it identifies missing categories for future use
Disadvantage- respondents may ignore the other options
e.g. If the household lost any of its members due to death in the last 12
months, what was the cause of death?
1. Malaria 3. Car accident
2. Famine/hunger 4. Others (specify)
72. Requirements of questions
Must have face validity
The question that we design should be one that gives an obviously
valid and relevant measurement for the variable.
Must be clear and unambiguous.
Each question should contain only one idea, and all respondents should
understand it in the same way.
Must not be offensive (avoid questions that may offend the
respondent).
73. Cont…
The questions should be fair (should not be loaded).
Sensitive questions - It may not be possible to avoid asking
‘sensitive’ questions that may offend respondents
In such situations the interviewer (questioner) should do it very
carefully and wisely
74. Cont…
Start with an interesting but non-controversial question
(preferably open) that is directly related to the subject of the
study.
Pose more sensitive questions as late as possible in the
interview
Use simple language.
Make the questionnaire as short as possible.
75. What to be considered before designing questioning tool
What exactly do we want to know, according to the objectives
and variables we identified earlier?
Of whom will we ask questions and what techniques will we
use?
Are our informants mainly literate or illiterate?
How large is the sample that will be interviewed?
77. Types of closed format
Choice of categories
Q. What is your marital status?
Single
Married
Divorced
Widowed
Likert-style (or similar) scale
Q. Biostatistics is an interesting subject
Strongly disagree
Disagree
Cannot decide
Agree
Strongly agree
78. Cont…
Checklists
Circle the public health specialties you are particularly interested in
Epidemiology and Biostatistics
Reproductive health
Nutrition
Health informatics
Health service management
General
79. Cont…
Ranking
Please rank your interest in the following specialties
(1=most interesting, 4=least interesting )
Epidemiology and Biostatistics
Reproductive health
Nutrition
Health informatics
80. 2. Methods of data organization and presentation
The most convenient method of organizing data is to construct a frequency
distribution.
A frequency distribution is the organization of raw data in table form, using
classes and frequencies.
Frequency distribution table: lists categories of scores along with their
corresponding frequencies.
For this, different techniques of data organization and presentation, such as
ordered arrays, tables and diagrams, are used.
81. Array (ordered array)
A serial arrangement of numerical data in an ascending or
descending order.
A simple arrangement of individual observations in order of
magnitude.
This will enable us to know the range over which the items are
spread and will also get an idea of their general distribution.
It is an appropriate way of presentation when the data are small in
size (usually less than 20).
82. Frequency Distribution (F.D.)
A frequency distribution is the organization of the values of a
variable arranged in order of magnitude either individually (for a
discrete variable), or into classes (for a continuous variable), or
into categories (in the case of qualitative data), along with their
frequencies.
83. Frequency Distribution (F.D.)…
A frequency distribution has two main parts; namely,
i. The values of the variable (if quantitative) or the
categories (if qualitative), and
ii. The number of observations (frequency)
corresponding to the values or categories.
84. Frequency Distribution (F.D.)…
There are two types of frequency distributions
i. Categorical (or qualitative)
ii. Numerical (or quantitative)
1. Categorical Frequency Distribution
Data are classified according to non-numerical categories.
Categories must be mutually exclusive and exhaustive.
Used to organize nominal and ordinal data.
85. Cont…
a) Nominal data: Here the construction is straightforward: count the
occurrences in each category and find the totals.
Example: The marital status of 60 adults classified as single, married,
divorced and widowed is presented in a FD as below:
Marital status   Single   Married   Divorced   Widowed   Total
Frequency        25       20        8          7         60
86. Cont…
b) Ordinal data: The construction is identical to
the nominal case. However, the categories
should be put in an ordered manner.
Example: Satisfaction on teaching method in a
class of size 60 is presented in a FD as shown
below
87. Numerical F.D
2. Numerical Frequency Distribution
data are classified according to numerical size.
used to organize interval and ratio data.
may be discrete or continuous, depending on whether the
variable is discrete or continuous.
88. Numerical F.D…
a) Discrete (Ungrouped) Frequency Distribution
Count the number of times each possible value is repeated.
Example: In a survey of 30 families, the number of children per
family was recorded, giving the following data:
4 2 4 3 2 8 3 4 4 2 2 8 5 3 4 5 4 5 4 3 5 2 7 3 3 6 7 3 8 4.
The distribution of children in 30 families would be:
No. of children       2   3   4   5   6   7   8   Total
No. of families (f)   5   7   8   4   1   2   3   30
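The counting step for a discrete frequency distribution can be sketched in a few lines of Python. This is only an illustration of the tally above; the function name `discrete_frequency_distribution` is a hypothetical helper, not something defined in these slides.

```python
from collections import Counter

# Number of children per family for the 30 surveyed families (data from the example).
children = [4, 2, 4, 3, 2, 8, 3, 4, 4, 2, 2, 8, 5, 3, 4,
            5, 4, 5, 4, 3, 5, 2, 7, 3, 3, 6, 7, 3, 8, 4]


def discrete_frequency_distribution(values):
    """Count how many times each distinct value occurs, ordered by value."""
    counts = Counter(values)
    return dict(sorted(counts.items()))


freq = discrete_frequency_distribution(children)
for value, f in freq.items():
    print(value, f)
print("total", sum(freq.values()))
```

The printed frequencies should agree with the table in the example, and the total should equal the number of families surveyed.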
89. Continuous grouped F.D
b) Continuous/grouped Frequency Distribution
o Arise from continuous variables/data.
o Unlike for a discrete FD, a class can not be allocated to
each value of a continuous variable.
o Categories in to which the observations are distributed are
called classes or class intervals.
o Classes should be exhaustive and mutually exclusive.
91. Steps in constructing continuous frequency distribution
1. Determine the number of classes (k): the number of class intervals
into which the observations will be grouped.
Decide "k" with the help of Sturges' rule:
k = 1 + 3.322 log(n)
Rounded up or down to the nearest integer.
Where n = number of observations, log = common logarithm
(logarithm to base 10).
92. Cont…
Example: if n = 10, k = 4.32 ≈ 4; if n = 100, k = 7.64 ≈ 8; if n = 1000,
k = 10.97 ≈ 11.
2. Determine the class width (w): the difference between the lower
(or upper) boundaries of two consecutive classes (class limits may
also be used).
We can use $W = \frac{\text{Range}}{k}$
Note that "W" is rounded up or down to the nearest integer.
93. Cont…
3. Determine the Class Limits
Class limits separate one class from another; there is a gap between the upper
limit of one class and the lower limit of the next class.
The lower class limit of the first class should be the smallest value of
the observations.
Add the class width to the lower class limit to obtain the
lower class limit of the next class.
Unit of measure (U): This is the possible difference between
successive values or measures. E.g. 1, 0.1, 0.01, 0.001……
94. Cont…
To find the upper limit of the first class, subtract U from the lower
limit of the second class.
Then continue to add the class width to this upper limit to find the
rest of the upper limits or
Obtain the upper class limits by adding the class width minus the unit of
measure to the corresponding lower class limits, i.e. UCL = LCL + (W - U),
which is UCL = LCL + (W - 1) when U = 1.
95. Cont…
4. Determine the Class boundaries
Class boundaries make the intervals of a continuous variable continuous in both
directions, so that no gap exists between classes.
Let U = LCL of the second class – UCL of the preceding class.
Add half of this difference (U/2) to all upper class limits to get the upper
class boundaries (UCBs), and subtract (U/2) from all lower class limits to
get the lower class boundaries (LCBs).
UCBi = UCLi +U/2
LCBi = LCLi – U/2
96. Cont…
5. Class mark (C.M) or Mid points: it is the average of the lower and upper
class limits or the average of upper and lower class boundary.
6. Determine the frequency of each class: determined simply by counting
the number of observations belonging to each class.
7. Cumulative frequency is the number of observations less than/ more than
or equal to a specific value.
8. Cumulative frequency above (Greater than type): it is the total
frequency of all values greater than or equal to the lower class boundary of a
given class.
97. Cont…
9. Cumulative frequency below (less than type): it is the total frequency of
all values less than or equal to the upper class boundary of a given class.
10. Relative frequency (rf): it is the frequency divided by the total frequency.
11. Relative cumulative frequency (rcf): it is the cumulative frequency
divided by the total frequency.
99. Cont…
Solution:
Step 1: Find the highest and the lowest value H=88, L=42
Step 2: Find the range; R=H-L=88-42=46.
Step 3: Select the number of classes desired using Sturges' formula;
k = 1 + 3.322 log(50) = 6.64 ≈ 7 (rounding up)
Step 4: Find the class width; w = R/k = 46/7 = 6.57 ≈ 7 (rounding up)
100. Cont…
Step 5: Select the starting observation as lowest class limit (this is
usually the lowest observation).
Add the class width to that observation to get the lower limit of the
next class.
Keep adding until there are 7 classes. 42, 49, 56, 63, 70, 77, 84 are
the lower class limits.
Step 6: Find the upper class limits; e.g. the first upper class limit = 49 - U =
49 - 1 = 48. The remaining upper class limits are 55, 62, 69, 76, 83 and 90.
101. Cont…
So combining step 5 and step 6, one can construct the following
classes.
Step 7: Find the class boundaries by subtracting 0.5 from each lower
class limit and adding 0.5 to the UCL.
102. Cont…
Example: For class 1: LCBi =LCLi - U/2 = 42-0.5 = 41.5 and UCBi =
UCLi + U/2 = 48+0.5 = 48.5.
Then continue adding W to both boundaries to obtain the remaining
boundaries.
By doing so one can obtain the following classes.
103. Cont…
Step 8: Find the frequencies
Step 9: Find cumulative frequency.
Step 10: Find relative frequency and /or relative cumulative frequency.
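Steps 1 to 7 of the worked solution can be sketched in Python as follows. This is a minimal illustration that assumes, as in the example, that both k and w are rounded up and that the unit of measure is U = 1; the names `sturges_classes` and `build_classes` are hypothetical helpers introduced only for this sketch.

```python
import math


def sturges_classes(n):
    """Number of classes from Sturges' rule, rounded up as in the worked example."""
    return math.ceil(1 + 3.322 * math.log10(n))


def build_classes(lowest, highest, n, unit=1):
    """Return (k, w, class limits, class boundaries) for a grouped distribution."""
    k = sturges_classes(n)
    w = math.ceil((highest - lowest) / k)            # class width, rounded up
    lower_limits = [lowest + i * w for i in range(k)]
    upper_limits = [lcl + w - unit for lcl in lower_limits]
    limits = list(zip(lower_limits, upper_limits))
    # Boundaries: subtract U/2 from lower limits and add U/2 to upper limits.
    boundaries = [(lcl - unit / 2, ucl + unit / 2) for lcl, ucl in limits]
    return k, w, limits, boundaries


# Values taken from the worked example: H = 88, L = 42, n = 50.
k, w, limits, boundaries = build_classes(lowest=42, highest=88, n=50)
print("k =", k, "w =", w)
for (lcl, ucl), (lcb, ucb) in zip(limits, boundaries):
    print(f"{lcl}-{ucl}   {lcb}-{ucb}")
```

The frequencies in Step 8 would then be obtained by counting the raw observations that fall inside each pair of class limits.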
109. Continuous/grouped F.D …
Cumulative frequency distributions
o Tells us how often the values fall below or above that class. There
are two types of CFD:
The “less than” cumulative F.D.
o Obtained by adding the frequency of all the preceding classes
including the frequency of that class.
The “more than” cumulative F.D.
o Obtained by adding the frequency of the succeeding classes
including the frequency of that class.
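The two cumulative distributions, together with relative and relative cumulative frequencies, can be computed as in the sketch below. The class frequencies are borrowed from the marks example on slide 194 of this chapter; the variable names are assumptions made for illustration.

```python
from itertools import accumulate

# Class frequencies for the classes 46-50, 51-55, ..., 76-80 (marks example).
frequencies = [4, 8, 15, 5, 9, 5, 4]
total = sum(frequencies)

# "Less than" type: add the frequencies of all preceding classes plus the class itself.
less_than_cf = list(accumulate(frequencies))

# "More than" type: add the frequencies of all succeeding classes plus the class itself.
more_than_cf = list(accumulate(reversed(frequencies)))[::-1]

# Relative frequency and relative cumulative frequency (less than type).
relative_freq = [f / total for f in frequencies]
relative_cf = [cf / total for cf in less_than_cf]

print(less_than_cf)
print(more_than_cf)
```

The last "less than" value and the first "more than" value both equal the total number of observations, which is a quick check on the arithmetic.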
111. Following the rules for grouping data
The groups must not overlap: otherwise there is confusion about which group
a measurement belongs to.
There must be continuity from one group to the next: Otherwise some
measurements may not fit in a group.
The groups must range from the lowest measurement to the highest
measurement.
The groups should normally be of an equal width.
112. Methods of data presentation
Commonly, there are two ways of presenting
statistical data:
1. Statistical tables
2. Graphs/Diagrams
113. 1. Tabulation methods of data
presentations
1. Statistical tables
o A statistical table is an orderly and systematic presentation of data
in rows and columns.
Rows : are horizontal arrangements.
Columns: are vertical arrangements.
o Using tables to organize data involves grouping the data
into mutually exclusive categories of the variables and counting
the number of occurrences (frequency) in each category.
114. Cont….
Based on the purpose for which the table is designed and the
complexity of the relationship, a table could be either of
simple frequency table or cross tabulation.
Simple frequency table is used when the individual
observations involve only a single variable.
Cross tabulation is used to obtain the frequency distribution of
one variable by another variable.
115. General principles to construct
tables
1. Tables should be as simple as possible.
2. Tables should be self-explanatory.
Title should be clear and placed above the table. A good title
answers: what? when? where? how classified?
Each row and column should be labeled.
Numerical entries of zero should be explicitly written rather
than indicated by a dash.
Dashes are reserved for missing or unobserved data.
Totals should be shown either in the top row and the first
column or in the last row and last column.
3. If data are not original, their source should be given in a footnote.
116. A) Simple or one-way table
Simple frequency table: most basic table is a simple
frequency distribution with one variable.
Example:
Table. Blood group of voluntary blood donors examined in Red Cross blood bank within a day, May 2006 (n=548)
Blood group   Number of students   Percent
A             240                  43.8
B             146                  26.6
AB            57                   10.4
O             105                  19.2
Total         548                  100
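The percent column of a one-way table is each frequency divided by the total, times 100. A small sketch using the blood group counts from the table above (variable names are illustrative assumptions):

```python
# Blood group counts from the one-way table above.
counts = {"A": 240, "B": 146, "AB": 57, "O": 105}
n = sum(counts.values())

# Percent = 100 * frequency / total, rounded to one decimal place as in the table.
percent = {group: round(100 * count / n, 1) for group, count in counts.items()}

for group in counts:
    print(group, counts[group], percent[group])
print("Total", n)
```

Because each percentage is rounded separately, the rounded values may not always add up to exactly 100 in other data sets.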
117. Two and three variable table
If two variables are cross tabulated, it is a two variable table
If the tabulation is among three variables, it is three variable
table .
In cross tabulated frequency distributions where there are row
and column totals, the decision for the denominator is based
on the variable of interest to be compared over the subset of
the other variable.
119. Common form of a two by two
variable
It is a special form of table that is a favorite among
epidemiologists.
It is used to compare whether there is relationship
between the two variables.
Exposure       Number of subjects      Total
               Cases      Controls
Exposed        23         23           46
Non-exposed    4          139          143
Total          27         162          189
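The marginal totals and the choice of denominator in such a table can be illustrated with a short sketch. Using the column totals (cases, controls) as denominators is an assumption made here for illustration; the slides only say that the denominator depends on which variable is being compared.

```python
# Cell counts from the two-by-two table above.
table = {
    "Exposed": {"Cases": 23, "Controls": 23},
    "Non-exposed": {"Cases": 4, "Controls": 139},
}

row_totals = {row: sum(cells.values()) for row, cells in table.items()}
col_totals = {col: sum(table[row][col] for row in table)
              for col in ("Cases", "Controls")}
grand_total = sum(row_totals.values())

# Percentage exposed within each column (column totals as denominators).
pct_exposed = {col: round(100 * table["Exposed"][col] / col_totals[col], 1)
               for col in col_totals}

print(row_totals, col_totals, grand_total)
print(pct_exposed)
```

Comparing the percentage exposed among cases with that among controls is one simple way to see whether the two variables appear to be related.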
120. Composite/ Higher Order Table
It is a large table combining several separate variables/tables
Age, sex and other demographic variables may be combined
to form a single table
Example: Distribution of Health Professional by Sex
and Residence
121. Diagrammatic and Graphical methods of data presentation
Advantages
To understand the information easily.
To make the data attractive.
To make comparisons of items easily.
To draw attention of the observer.
The purpose of graphs and diagrams is not to provide exact and detailed
information, but simple comparisons.
Any further information shall rather be obtained from the original data.
122. Limitations of Diagrammatic presentation
Diagrams are used only for purposes of comparison. They are
not to be used when comparison is either not possible or not
necessary.
A diagram is not an alternative to tabulation. It only strengthens the textual
exposition of a subject, and cannot serve as a complete substitute
for statistical data.
It can give only an approximate idea and as such where greater
accuracy is needed diagrams will not be suitable.
They fail to bring to light small differences.
123. 2. Diagrammatic Presentation of data
Diagrams are appropriate for presenting discrete as well as
qualitative data.
The three most commonly used diagrammatic presentation of
data are:
Pie charts
Bar charts
Pictograms
125. 1. Pie chart
A pie chart can be used to compare the relation between the
whole and its components.
It is useful for qualitative or quantitative discrete data.
A pie chart is a circular diagram in which the area of each sector of the
circle represents a component.
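Since the area of a sector is proportional to its central angle, the angle for each category can be taken as its share of the total times 360 degrees. The sketch below reuses the blood group counts from the one-way table earlier in this chapter purely as illustrative data.

```python
# Blood group counts reused from the one-way table earlier in this chapter.
counts = {"A": 240, "B": 146, "AB": 57, "O": 105}
total = sum(counts.values())

# Each sector's angle (and hence its area) is proportional to its share of the total.
angles = {group: 360 * count / total for group, count in counts.items()}

for group, angle in angles.items():
    print(f"{group}: {angle:.1f} degrees")
```

The angles always add up to 360 degrees, which is a useful check before drawing the chart.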
130. 2. Bar charts (or graphs)
Categories are listed on the horizontal axis (X-axis).
Frequencies or relative frequencies are represented on the Y-
axis.
The height of each bar is proportional to the frequency or
relative frequency of observations in that category.
There are three types of bar charts.
131. Tips for constructing bar diagrams
1. Whenever possible it is better to construct a bar diagram on a graph
paper
2. All bars drawn in any single study should be of the same width
3. The different bars should be separated by equal distances
4. All the bars should rest on the same line called the base
5. Whenever possible, it is advisable to draw bars in order of
magnitude
135. Cont…
B. Sub-divided bar chart (component)
o is used to represent data in which the total magnitude is divided into
different parts or components
o Example: Plasmodium species distribution for confirmed malaria
cases, Zeway, 2003
136. Cont…
C. Multiple bar chart
is used when two or more sets of interrelated data are represented
(a multiple bar diagram facilitates comparison between more than one
phenomenon).
The following figure shows a multiple bar chart to represent the
import and export of Canada (values in $) for the years 1991 to
1995.
139. 3. Graphical Presentation of data
The histogram, frequency polygon and cumulative frequency graph (ogive) are
most commonly applied graphical representation for continuous data.
Procedures for constructing statistical graphs
• Draw and label the X and Y axes.
• Choose a suitable scale for the frequencies or cumulative frequencies and label
it on the Y axis.
• Represent the class boundaries for the histogram or ogive and the mid points
for the frequency polygon on the X axis.
• Plot the points.
• Draw the bars or lines to connect the points.
140. Graphical Presentation of
data
1. Histogram
A graph which places the class boundaries on the horizontal axis
and the frequencies on a vertical axis
Class marks and class limits are sometimes used as the quantity on the
X axis.
Non-overlapping intervals that cover all of the data values must be
used.
141. Cont…
Bars are drawn over the intervals in such a way that the areas of the
bars are all proportional in the same way to their interval
frequencies.
To avoid crowding, you can use class midpoints.
Example: Distribution of the age of women at the time of marriage
143. Cont…
2. Frequency polygon
Line graph of class marks against class frequencies.
To draw a frequency polygon, we connect the midpoints of the tops of the
histogram bars with straight lines.
144. Cont…
It can be also drawn without erecting rectangles by joining the top
midpoints of the intervals representing the frequency of the classes as
follows:
145. Cont…
3. Ogive Curve (Cumulative Frequency Polygon)
A graph showing the cumulative frequency (less than or more than type)
plotted against upper or lower class boundaries respectively.
Ogive uses class boundaries along the horizontal axis, and cumulative
frequency along vertical axis.
The less than ogive uses the less than cumulative frequency on the y axis.
The more than ogive uses the more than cumulative frequency on the y axis.
The points are joined by a freehand curve.
150. Cont…
4. Line graph
o A variable is taken along X-axis and the frequency of occurrence of
each of its observed values along the Y-axis.
o The points are plotted and joined by line.
o An arithmetic scale line graph shows patterns or trends over some
variable, usually time.
154. learning outcomes
After completing this chapter a student will be able to:
List and calculate measures of central tendency
List and calculate measures of dispersion
Describe types of shape.
155. Numerical Summary
Measures
They are the single numbers which quantify the characteristics
of a distribution of values.
They are of two types:
1. Measures of central tendency or location
2. Measures of dispersion
156. Measures of Central Tendency/ Measures of Location
Measures of central Tendency: the methods of determining the
actual value at which the data tend to concentrate.
The tendency of the statistical data to get concentrated at a certain
value is called “central tendency”
The objective of calculating MCT is to determine a single figure
which may be used to represent the whole data set.
Since a MCT represents the entire data, it facilitates comparison
within one group or between groups of data
157. Characteristics of a good MCT
A MCT is good or satisfactory if it possesses the following characteristics:
o It should be based on all the observations
o It should not be affected by the extreme values
o It should be as close to the maximum number of values as possible
o It should have a definite value
o It should not be subjected to complicated and tedious calculations
o It should be capable of further algebraic treatment
o It should be stable with regard to sampling
159. 1. Arithmetic Mean
1. Ungrouped Data
The arithmetic mean is the "average" of the data set and by far the
most widely used measure of central location.
Is the sum of all the observations divided by the total number of
observations.
160. Arithmetic…..
The heart rates for n=10 patients were as follows (beats per minute):
167, 120, 150, 125, 150, 140, 40, 136, 120, 150.
What is the arithmetic mean for the heart rate of these patients?
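The calculation is the sum of the ten observations divided by 10. A minimal Python sketch (variable names are illustrative):

```python
# Heart rates (beats per minute) of the n = 10 patients in the example.
heart_rates = [167, 120, 150, 125, 150, 140, 40, 136, 120, 150]

total = sum(heart_rates)          # sum of all observations
n = len(heart_rates)              # number of observations
mean = total / n                  # arithmetic mean

print(total, n, mean)
```

Here the sum is 1298, so the mean is 1298/10 = 129.8 beats per minute.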
161. Cont…
When the data are arranged or given in the form of a frequency
distribution, i.e. there are k distinct values such that a value Xi has
frequency fi (i=1,2,…,k), then the arithmetic mean is given by:
$\bar{X} = \frac{\sum_{i=1}^{k} f_i X_i}{\sum_{i=1}^{k} f_i}$
164. Cont…
2. For grouped data
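In calculating the mean from grouped data, we assume that all values falling
into a particular class interval are located at the midpoint of the interval.
Therefore, the mean for grouped data is calculated as:
$\bar{X} = \frac{\sum_{i=1}^{k} f_i m_i}{\sum_{i=1}^{k} f_i}$, where mi is the midpoint and fi the frequency of the ith class.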
165. Arithmetic…..
Example
Compute the mean age of 169 subjects from the grouped data.
Class interval   Mid-point (mi)   Frequency (fi)   mifi
10-19            14.5             4                58.0
20-29            24.5             66               1617.0
30-39            34.5             47               1621.5
40-49            44.5             36               1602.0
50-59            54.5             12               654.0
60-69            64.5             4                258.0
Total            __               169              5810.5
Mean = 5810.5/169 = 34.38 years
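The grouped-data mean can be checked with a short sketch that rebuilds the midpoints and the mifi column from the class limits and frequencies in the table. The tuple layout `(lower, upper, frequency)` is an assumption chosen for this illustration.

```python
# (lower limit, upper limit, frequency) for each age class in the table.
classes = [(10, 19, 4), (20, 29, 66), (30, 39, 47),
           (40, 49, 36), (50, 59, 12), (60, 69, 4)]

midpoints = [(lower + upper) / 2 for lower, upper, _ in classes]
frequencies = [f for _, _, f in classes]

n = sum(frequencies)                                        # total frequency
sum_mf = sum(m * f for m, f in zip(midpoints, frequencies)) # sum of mi * fi
mean = sum_mf / n

print(n, sum_mf, round(mean, 2))
```

Because every observation is treated as if it sat at its class midpoint, the result is an approximation to the mean of the raw ages.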
166. Arithmetic…..
The mean can be thought of as a “balancing point”, “center of gravity”
It is possible in extreme cases for all but one of the sample points to be on
one side of the arithmetic mean & in this case, the mean is a poor measure
of central location or does not reflect the center of the sample.
167. Properties of the Arithmetic Mean
The mean can be used as a summary measure for both discrete and
continuous data, but it is not appropriate for either of nominal or ordinal
data.
For a given set of data there is only one arithmetic mean (uniqueness).
Easy to calculate and understand (simple).
Influenced by each and every value in a data set
Greatly affected by the extreme values.
In case of grouped data if any class interval is open, arithmetic mean can
not be calculated.
168. 2. Median
o It is an alternative measure of central tendency, second in popularity
next to arithmetic mean.
o Suppose there are n observations in a sample
o If these observations are ordered from smallest to largest, then the median
is defined as follows:
o The median, is a value such that at least half of the observations are less
than or equal to median and at least half of the observations are greater
than or equal to median.
The median is the midpoint of the data array.
169. 2. Median….
Ungrouped data
The median is the value which divides the data set into two equal parts.
If the number of values is odd, the median will be the middle value when
all values are arranged in order of magnitude.
When the number of observations is even, there is no single middle value
but two middle observations.
In this case the median is the mean of these two middle observations, when
all observations have been arranged in the order of their magnitude.
170. Cont…
1. For ungrouped data
• If the number of observations is odd, the median is defined as the
[(n+1)/2]th observation.
• If the number of observations is even the median is the average of
the two middle (n/2)th and [(n/2)+1]th values.
• To find the median of a data set:
• Arrange the data in ascending order.
• Find the middle observation of this ordered data.
171. Cont…
Example1: where n is even: 19,20, 20, 21, 22, 24, 27, 27, 27,34
Then, the median = (22 + 24)/2 = 23
Example2: The number of children with asthma during a specific year
in seven local district clinics is shown.
Find the median for this data set.
253, 125, 328, 417, 201, 70, 90
172. Cont…
Solution:
First we must arrange the data in ascending order
70, 90, 125, 201, 253, 328, 417
Therefore, the fourth observation is the median of the data, i.e. the value 201
is the median value.
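The odd and even rules for ungrouped data can be sketched as a small Python function. The function name `median` is a hypothetical helper written for this illustration; it is applied to the two examples above.

```python
def median(values):
    """Median of ungrouped data using the odd/even rules described above."""
    ordered = sorted(values)          # arrange the data in ascending order
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        # Odd n: the [(n+1)/2]th observation (index mid when counting from 0).
        return ordered[mid]
    # Even n: average of the (n/2)th and [(n/2)+1]th observations.
    return (ordered[mid - 1] + ordered[mid]) / 2


example1 = [19, 20, 20, 21, 22, 24, 27, 27, 27, 34]   # n is even
example2 = [253, 125, 328, 417, 201, 70, 90]          # n is odd

print(median(example1))
print(median(example2))
```

The same function can be used to check an answer to the waiting-time exercise once it has been worked by hand.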
173. Exercise
The actual waiting time for the first job for a selected sample of nine people
with different fields of specialization is given below.
Waiting time (in months): 11.6, 11.3, 10.7, 18.0, 3.3, 9.2, 8.3, 3.8, 6.8
Calculate the median of the waiting time.
174. Cont…
2. For grouped data
-If data are given in the form of a continuous frequency distribution, the
median is defined as:
$\text{Median} = L_{med} + \frac{W}{f_{med}}\left(\frac{n}{2} - f_c\right)$
Where: Lmed = lower class boundary of the median class, fmed = the frequency
of the median class, W = the size of the median class, n = total number of
observations, fc = the cumulative frequency (less than type) preceding the
median class.
Note: the median class is the class with the smallest cumulative frequency (less
than type) greater than or equal to n/2.
178. Merit and demerit of median
Merits:
Median is a positional average and hence not influenced by extreme
observations.
Can be calculated in the case of open end intervals.
The median can be used as a summary measure for ordinal, discrete and
continuous data, in general however, it is not appropriate for nominal data.
Demerits:
It is not a good representative of data if the number of items is small.
It is not amenable to further algebraic treatment.
It is vulnerable to sampling fluctuations.
179. 3. Mode
Mode is a value which occurs most frequently in a set of values.
The mode may not exist and even if it does exist, it may not be
unique.
If in a set of observed values, all values occur once or equal
number of times, there is no mode
180. Cont…
Examples:
1. Find the mode of 5, 3, 5, 8, and 9 ; Mode = 5
2. Find the mode of 8, 9, 9, 7, 8, 2, 5; Mode =8 and 9
3. Find the mode of 4, 12, 3, 6, and 7. No mode/ mode doesn’t exist.
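The three cases (one mode, two modes, no mode) can be reproduced with a short sketch. The function `modes` is a hypothetical helper; treating "all values equally frequent" as "no mode" follows the convention stated on the previous slide, and returning a single value when the data contain only one distinct value is an assumption of this sketch.

```python
from collections import Counter


def modes(values):
    """Return the list of modes; an empty list means the mode does not exist."""
    if not values:
        return []
    counts = Counter(values)
    highest = max(counts.values())
    # If several distinct values all occur equally often, there is no mode.
    if len(counts) > 1 and all(c == highest for c in counts.values()):
        return []
    return sorted(v for v, c in counts.items() if c == highest)


print(modes([5, 3, 5, 8, 9]))         # one mode
print(modes([8, 9, 9, 7, 8, 2, 5]))   # two modes
print(modes([4, 12, 3, 6, 7]))        # no mode
```

Returning a list keeps the unimodal, bimodal and no-mode cases in one uniform form.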
181. Cont…
Mode for Grouped data
NB: For grouped data the mode lies in the modal class.
The modal class is the class with the largest frequency.
$\text{Mode} = L + \frac{\Delta_1}{\Delta_1 + \Delta_2} \times w$
Where L = the lower class boundary of the modal class;
w = the size of the modal class
f1 = frequency of the class preceding the modal class.
f2 = frequency of the class succeeding the modal class
fmod = frequency of the modal class.
Δ1 = fmod - f1, Δ2 = fmod - f2
183. Cont…
Solution
By inspection (simply looking at the frequencies), the mode lies in the
fourth class, where L = 29.5, fmod = 57, f1 = 50, f2 = 48, w = 5,
Δ1 = 57 - 50 = 7 and Δ2 = 57 - 48 = 9.
Therefore, the modal age is $x = 29.5 + \frac{7}{7 + 9} \times 5 \approx 29.5 + 2.2 = 31.7$
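The grouped-data mode formula can be written as a small function and checked against this solution. The function name `grouped_mode` and its parameter names are assumptions chosen to mirror the symbols L, w, fmod, f1 and f2.

```python
def grouped_mode(L, w, f_mod, f1, f2):
    """Mode for grouped data: L + d1 / (d1 + d2) * w."""
    d1 = f_mod - f1    # difference from the class preceding the modal class
    d2 = f_mod - f2    # difference from the class succeeding the modal class
    return L + d1 / (d1 + d2) * w


# Values from the solution: L = 29.5, w = 5, fmod = 57, f1 = 50, f2 = 48.
modal_age = grouped_mode(L=29.5, w=5, f_mod=57, f1=50, f2=48)
print(round(modal_age, 1))
```

The unrounded value is 31.6875, which rounds to the 31.7 reported in the solution.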
184. Properties of Mode
The mode can be used as a summary measure for nominal,
ordinal, discrete and continuous data, in general however, it is
more appropriate for nominal and ordinal data.
It is not affected by extreme values
It can be calculated for distributions with open end classes
Sometimes its value is not unique
The main drawback of mode is that it may not exist
185. Merit and Demerit of Mode
Merits:
It is not affected by extreme observations.
Easy to calculate and simple to understand.
It can be calculated for distribution with open end class.
186. Cont…
Demerits:
It is not rigidly defined. i.e. its value is not unique.
It is not based on all observations.
It is not suitable for further mathematical treatment.
It is not stable average, i.e. it is affected by fluctuations of
sampling to some extent.
187. Measure of location
Quartiles
- Quartiles are measures that divide the frequency distribution in to four
equal parts.
- The value of the variables corresponding to these divisions are denoted
Q1, Q2, and Q3 often called the first, the second and the third quartile
respectively.
- Q1 is a value which has 25% items which are less than or equal to it
- Similarly Q2 has 50% items with value less than or equal
to it.
188. Cont…
− Q3 has 75% items whose values are less than or equal to it.
Quartile for ungrouped data.
Arrange data in ascending order.
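If the number of observations is
A. Odd
$Q_i = \left[\frac{i(n+1)}{4}\right]^{\text{th}}$ item
B. Even
$Q_i = \frac{1}{2}\left[\left(\frac{in}{4}\right)^{\text{th}} \text{item} + \left(\frac{in}{4} + 1\right)^{\text{th}} \text{item}\right]$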
191. Cont…
Arrange the numbers in ascending order.
Percentiles for individual series
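A. Odd
$P_i = \left[\frac{i(n+1)}{100}\right]^{\text{th}}$ item
B. Even
$P_i = \frac{1}{2}\left[\left(\frac{in}{100}\right)^{\text{th}} \text{item} + \left(\frac{in}{100} + 1\right)^{\text{th}} \text{item}\right]$
Percentiles for grouped data
$P_i = L + \frac{w}{f_{P_i}}\left(\frac{in}{100} - CF\right), \quad i = 1, 2, \dots, 99$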
193. Cont…
For example: suppose that 50% of a cohort survived at least 4 years.
This means also that 50% survived at most 4 years.
We say that 4 years is the median.
The median is also called the 50th percentile.
We write p50= 4 years.
194. Example
Marks of 50 students out of 85 are given below. Based on the data find
$Q_1$ and $P_7$.
Marks   46-50   51-55   56-60   61-65   66-70   71-75   76-80
fi      4       8       15      5       9       5       4
Solution: first find the CB and CF distribution.
Marks   46-50       51-55       56-60       61-65       66-70       71-75       76-80
CB      45.5-50.5   50.5-55.5   55.5-60.5   60.5-65.5   65.5-70.5   70.5-75.5   75.5-80.5
fi      4           8           15          5           9           5           4
CF      4           12          27          32          41          46          50
Second determine the quartile and percentile classes.
For $Q_1$: the quartile class is the class with the smallest CF ≥ i*N/4 = 1*50/4 = 12.5
195. Cont…
The CFs that are ≥ 12.5 are 27, 32, 41, 46, and 50. The smallest of these is 27, so the
quartile class is the third class (55.5-60.5).
$Q_1 = L + \frac{w}{f_{Q_1}}\left(\frac{n}{4} - CF\right) = 55.5 + \frac{5}{15}(12.5 - 12) \approx 55.67$
For percentiles
P7 is the (7n/100)th value = 3.5th value, which lies in the class 45.5-50.5.
$P_7 = L + \frac{w}{f_{P_7}}\left(\frac{7n}{100} - CF\right) = 45.5 + \frac{5}{4}(3.5 - 0) = 49.875$
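The grouped-data calculation can be checked with a short Python sketch. The function name `grouped_quantile` is hypothetical. It takes the lower class boundaries L, a common class width w, the class frequencies, and the target position (n/4 for Q1, 7n/100 for P7), and it assumes that all classes have the same width, as in this example.

```python
def grouped_quantile(lower_boundaries, width, freqs, position):
    """Apply L + (w / f) * (position - CF) to the class containing `position`.

    CF is the cumulative frequency of the classes before the chosen class.
    """
    cumulative = 0
    for lower, f in zip(lower_boundaries, freqs):
        if cumulative + f >= position:
            # First class whose cumulative frequency reaches the position
            return lower + (width / f) * (position - cumulative)
        cumulative += f
    raise ValueError("position exceeds the total frequency")

lower_boundaries = [45.5, 50.5, 55.5, 60.5, 65.5, 70.5, 75.5]
freqs = [4, 8, 15, 5, 9, 5, 4]
n = sum(freqs)  # 50 students

q1 = grouped_quantile(lower_boundaries, 5, freqs, 1 * n / 4)
p7 = grouped_quantile(lower_boundaries, 5, freqs, 7 * n / 100)
print(round(q1, 2), p7)  # 55.67 49.875
```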
196. Cont…
1. Calculate Q1, Q2, Q3, D4, P40 and P90 for the data given in the
table below.
x   10   11   12   13   14   15   16   17   18
f   2    8    25   48   65   40   20   9    2
2. The following frequency distribution represents the magnitude of
earthquakes.
Compute the median, verify that it is equal to the second quartile,
and find the 72nd percentile.
Magnitude   0-0.9   1-1.9   2-2.9   3-3.9   4-4.9   5-5.9   6-6.9   7-7.9
Frequency   20      50      45      30      10      8       6       1
197. Summary
1. The arithmetic mean is used for interval and ratio data with a
symmetric distribution.
2. The median and quartiles are used for ordinal data, and for interval
and ratio data whose distribution is skewed.
3. For nominal data, the mode is the appropriate measure of central tendency (MCT).
198. Measures of variation/dispersion
The scatter or spread of items of a distribution is known as
dispersion or variation.
In other words, the degree to which numerical data tend to
spread about an average value is called dispersion or variation
of the data.
Measures of dispersion are statistical measures that quantify
the extent to which data are dispersed or spread out.
199. A good measure of variation possesses:
o It should be easy to compute and understand.
o It should be based on all observations.
o It should be uniquely defined.
o It should be capable of further algebraic treatment.
o It should be affected as little as possible by extreme values.
200. Cont…
o Measures of dispersion include:
o Range
o Inter-quartile range
o Variance
o Standard deviation
o Coefficient of variation
o Standard scores (Z-scores)
201. Range
It is the difference between the largest and smallest observations in the
data.
Example: Consider the data on the weight (in Kg) of 10 newborn
children at Debre Tabor Hospital within a month: 2.51, 3.01, 3.25,
2.02, 1.98, 2.33, 2.33, 2.98, 2.88, 2.43
202. Cont…
Solution:
The range for the dataset can be computed by first arranging all
observations in ascending order: 1.98, 2.02, 2.33, 2.33, 2.43, 2.51,
2.88, 2.98, 3.01, 3.25.
Range = Maximum - Minimum = 3.25 - 1.98 = 1.27 Kg
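As a quick check, the same calculation can be sketched in Python. The function name `data_range` is a hypothetical helper; sorting is not actually required, because only the largest and smallest values are used.

```python
# Birth weights (Kg) of the 10 newborns from the example
weights_kg = [2.51, 3.01, 3.25, 2.02, 1.98, 2.33, 2.33, 2.98, 2.88, 2.43]

def data_range(values):
    """Difference between the largest and smallest observation."""
    return max(values) - min(values)

print(round(data_range(weights_kg), 2))  # 1.27
```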
203. Cont…
Limitations of Range
It is based on only the two extreme cases in the entire distribution. The range
may change considerably if either of the extreme cases happens to
drop out, while the removal of any other case would not affect it at all.
It wastes information, because it takes no account of the rest of the data.
204. Inter-quartile range
The inter-quartile range (IQR) is the difference between the third and the first
quartiles.
Example: Suppose the first and third quartiles for the weights of girls 12 months of
age are 8.8 Kg and 10.2 Kg respectively.
The IQR = 10.2 Kg - 8.8 Kg = 1.4 Kg
205. Variance and standard deviation
Variance measures how far, on average, scores deviate or differ
from the mean.
Variance is the average of the squared distances of each value
from the mean.
207. Cont…
For a frequency distribution, the sample variance is expressed as:
$S^2 = \frac{\sum f_i (X_i - \bar{X})^2}{n - 1}$, where $n = \sum f_i$
Why do we use n - 1?
- To obtain an unbiased estimate of the population variance, or
- To describe the spread of the population.
208. Cont…
A problem with the variance is that, because the deviations are squared,
its units are also squared. To return to the original unit of
measurement, we take the square root of the variance, which gives the
standard deviation.
209. Example1
Consider the following three datasets:
Dataset 1: 7, 7, 7, 7, 7, 7, mean=7, sd=0
Dataset 2: 6, 7, 7, 7, 7, 8, mean=7, sd=0.63
Dataset 3: 3, 2, 7, 8, 9, 13, mean=7, sd=4.05
The three datasets have the same mean but different variation.
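The three summaries can be reproduced with Python's standard `statistics` module. This is a minimal sketch: `statistics.stdev` computes the sample standard deviation (divisor n - 1), which is the version that matches the figures quoted here, and the helper name `summarize` is hypothetical.

```python
import statistics

datasets = {
    "Dataset 1": [7, 7, 7, 7, 7, 7],
    "Dataset 2": [6, 7, 7, 7, 7, 8],
    "Dataset 3": [3, 2, 7, 8, 9, 13],
}

def summarize(values):
    """Return (mean, sample standard deviation) of the values."""
    return statistics.mean(values), statistics.stdev(values)

for name, values in datasets.items():
    mean, sd = summarize(values)
    print(f"{name}: mean={mean}, sd={sd:.2f}")
```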
210. Example2
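Find the variance and standard deviation of the data 35, 45,
30, 35, 40, 25.
Solution: First, find the mean: (35 + 45 + 30 + 35 + 40 + 25)/6 = 210/6 = 35.
Next, subtract the mean from each value and square the result: 0, 100, 25, 0, 25, 100.
The squared deviations sum to 250, so the sample variance is 250/(6 - 1) = 50
and the standard deviation is √50 ≈ 7.07.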
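The steps of this solution can be followed in a short Python sketch. The function name `sample_variance` is hypothetical, and the divisor n - 1 is used on the assumption that the six values are a sample, in line with the earlier slide on why n - 1 is used.

```python
import math

def sample_variance(values):
    """Sum of squared deviations from the mean, divided by n - 1."""
    n = len(values)
    mean = sum(values) / n
    squared_deviations = [(x - mean) ** 2 for x in values]
    return sum(squared_deviations) / (n - 1)

data = [35, 45, 30, 35, 40, 25]
variance = sample_variance(data)   # mean is 35, squared deviations sum to 250
sd = math.sqrt(variance)           # back to the original units
print(variance, round(sd, 2))      # 50.0 7.07
```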
212. Exercise
The areas of surfaces sprayable with DDT from a sample of 15 houses
are measured as follows (in m2):
101, 105, 110, 114, 115, 124, 125, 125, 130, 133, 135, 136, 137, 140, 145
Find the variance and standard deviation of the given data set.
217. Cont…
Properties of Variance:
The main demerit of variance is that its unit is the square of the unit
of the original measurement values.
The variance gives more weight to the extreme values as compared
to those which are near to mean value, because the difference is
squared in variance.
The drawbacks of variance are overcome by the standard deviation.
218. Cont…
SD Vs. Standard Error (SE)
SD describes the variability among individual values in a given data set.
SE describes the variability among the means of separate samples,
i.e. how much the sample mean varies from one sample to another.
We interpret the SE of the mean to mean that another similarly conducted
study may give a mean that lies within ± SE of the observed mean.
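The contrast between SD and SE can be illustrated numerically. The sketch below assumes the usual formula SE = S/√n for the standard error of the mean, which the slide does not state explicitly. The function name `standard_error` is hypothetical, and the data are the newborn weights from the range example.

```python
import math
import statistics

def standard_error(values):
    """Standard error of the mean, assuming SE = S / sqrt(n)."""
    return statistics.stdev(values) / math.sqrt(len(values))

weights_kg = [2.51, 3.01, 3.25, 2.02, 1.98, 2.33, 2.33, 2.98, 2.88, 2.43]
sd = statistics.stdev(weights_kg)   # spread of individual weights
se = standard_error(weights_kg)     # spread of the sample mean
print(f"SD = {sd:.3f} Kg, SE = {se:.3f} Kg")
```

Because SE divides SD by √n, it is always smaller than SD for n > 1 and shrinks as the sample size grows.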
219. Cont…
The SD has the advantage of being expressed in the same units of
measurement as the mean.
SD is considered to be the best measure of dispersion and is used
widely because of the properties of the theoretical normal curve.
However, if the units of measurement of the variables in two data sets
are not the same, then their variability cannot be compared by
comparing the values of SD.
220. Coefficient of variation
When two data sets have different units of measurements, or their means
differ sufficiently in size, the CV should be used as a measure of
dispersion.
It is the best measure to compare the variability of two series of sets of
observations.
A series with a smaller coefficient of variation is considered more consistent.
$CV = \frac{S}{\bar{X}} \times 100\%$
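A minimal Python sketch of the CV follows, comparing two data sets from earlier slides that are measured in different units (newborn weights in Kg and sprayable areas in m2). The function name `coefficient_of_variation` is hypothetical, and the sample standard deviation is assumed for S.

```python
import statistics

def coefficient_of_variation(values):
    """CV = (S / mean) * 100, using the sample standard deviation."""
    return statistics.stdev(values) / statistics.mean(values) * 100

weights_kg = [2.51, 3.01, 3.25, 2.02, 1.98, 2.33, 2.33, 2.98, 2.88, 2.43]
areas_m2 = [101, 105, 110, 114, 115, 124, 125, 125,
            130, 133, 135, 136, 137, 140, 145]

cv_weights = coefficient_of_variation(weights_kg)
cv_areas = coefficient_of_variation(areas_m2)
print(f"CV of weights: {cv_weights:.1f}%")
print(f"CV of areas:   {cv_areas:.1f}%")

# The series with the smaller CV is the more consistent one
more_consistent = "weights" if cv_weights < cv_areas else "areas"
print("More consistent series:", more_consistent)
```

Since the CV is unit-free, the two percentages can be compared directly even though the SDs themselves cannot.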
222. Standard score (Z-scores)
It is obtained by subtracting the mean of the data set from
the value and dividing the result by the standard deviation
of the data set.
It tells us how many standard deviations a specific value is
above or below the mean value of the data set.
The z-score is the number of standard deviations the data
value falls above (positive z-score) or below (negative z-score)
the mean for the data set.
223. Cont…
Z-score computed from the population
$Z = \frac{X - \mu}{\sigma}$
Z-score computed from the sample
$Z = \frac{X - \bar{X}}{S}$
Example: Suppose that a student scored 66 in biostatistics and 80 in anatomy.
The summary of the scores for the two courses is given below.
In which course did the student score better compared to his classmates?
Course          Average score   Standard deviation of the score
Biostatistics   51              12
Anatomy         72              16
224. Solution:
Z-score of the student in Biostatistics: $Z = \frac{X - \mu}{\sigma} = \frac{66 - 51}{12} = \frac{15}{12} = 1.25$
Z-score of the student in Anatomy: $Z = \frac{X - \mu}{\sigma} = \frac{80 - 72}{16} = \frac{8}{16} = 0.5$
From these two standard scores, we can conclude that the
student has scored better in Biostatistics course relative to his
classmates than in Anatomy.
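The comparison can be reproduced with a few lines of Python. The function name `z_score` and the dictionary layout are hypothetical choices made for illustration.

```python
def z_score(value, mean, sd):
    """Number of standard deviations that value lies above (+) or below (-) the mean."""
    return (value - mean) / sd

# (student's score, class average, class standard deviation)
courses = {
    "Biostatistics": (66, 51, 12),
    "Anatomy": (80, 72, 16),
}

z_scores = {name: z_score(*stats) for name, stats in courses.items()}
for name, z in z_scores.items():
    print(f"{name}: z = {z}")

# The course with the larger z-score is the better relative performance
best_course = max(z_scores, key=z_scores.get)
print("Better relative performance:", best_course)
```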
225. Moments
The rth moment about the mean (the rth central moment) is defined as
$M_r = \frac{\sum (X_i - \bar{X})^r}{n}, \quad r = 0, 1, 2, \ldots$
For continuous grouped data
$M_r = \frac{\sum f_i (X_i - \bar{X})^r}{n}$
where the $X_i$ are the class marks.
Example: Find the first three central moments of the numbers 2, 3 and 7.
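One way to work through this example is the Python sketch below, which applies the ungrouped definition directly. The function name `central_moment` is hypothetical.

```python
def central_moment(values, r):
    """r-th moment about the mean: sum((x - mean)**r) / n."""
    n = len(values)
    mean = sum(values) / n
    return sum((x - mean) ** r for x in values) / n

numbers = [2, 3, 7]  # mean = 4
m1 = central_moment(numbers, 1)  # always 0 for central moments
m2 = central_moment(numbers, 2)  # (4 + 1 + 9) / 3 = 14/3
m3 = central_moment(numbers, 3)  # (-8 - 1 + 27) / 3 = 6
print(m1, round(m2, 3), m3)
```

Note that the second central moment divides by n, so it equals the population form of the variance rather than the sample variance with n - 1.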
227. 1. Skewness
o Measures of central tendency and variation do not reveal the
shape of a frequency distribution.
o Skewness is the degree of asymmetry or departure from
symmetry of a distribution.
o A skewed frequency distribution is one that is not symmetrical.
o Skewness is concerned with the shape of the curve not size.
228. Concept of skewness
o The skewness of a distribution is defined as the lack of symmetry.
o In a symmetrical distribution, mean, median, and mode are equal to
each other.
229. Skewness…
• For moderately skewed distribution, the following relation holds
among the three commonly used measures of central tendency.
Mean-Mode=3*(Mean-Median)
There are two types of skewness, based on the shape of the distribution.
Positively (right) skewed: Smaller observations are more frequent than larger
observations, i.e. the majority of the observations have a value below the
average and the distribution has a long tail in the positive direction (Mean > Median).
231. Cont…
Negatively (left) skewed: Smaller observations are less frequent
than larger observations, i.e. the majority of the observations have a
value above the average and the distribution has a long tail in the
negative direction (Mean < Median).
[Figure: curve showing the positions of the mean, median, and mode]
232. Measures of Skewness
1. Karl Pearson's Coefficient of Skewness (SK):
$SK = \frac{\text{Mean} - \text{Mode}}{\text{Standard deviation}}$
$SK = \frac{3(\text{Mean} - \text{Median})}{\text{Standard deviation}}$
If SK = 0, then the distribution is symmetrical.
If SK > 0, then the distribution is positively skewed.
If SK < 0, then the distribution is negatively skewed.
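Both forms of Pearson's coefficient can be sketched in Python with the standard `statistics` module. The function names are hypothetical, the sample standard deviation is assumed, and the data are the newborn weights from the range example.

```python
import statistics

def pearson_skewness_mode(values):
    """SK = (mean - mode) / standard deviation."""
    return (statistics.mean(values) - statistics.mode(values)) / statistics.stdev(values)

def pearson_skewness_median(values):
    """SK = 3 * (mean - median) / standard deviation."""
    return 3 * (statistics.mean(values) - statistics.median(values)) / statistics.stdev(values)

def describe_skewness(sk):
    """Classify a skewness coefficient using the rules above."""
    if sk > 0:
        return "positively skewed"
    if sk < 0:
        return "negatively skewed"
    return "symmetrical"

weights_kg = [2.51, 3.01, 3.25, 2.02, 1.98, 2.33, 2.33, 2.98, 2.88, 2.43]
sk = pearson_skewness_median(weights_kg)
print(f"SK = {sk:.2f}: {describe_skewness(sk)}")
```

The two forms usually give different values, because the relation Mean - Mode = 3(Mean - Median) holds only approximately.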
233. Cont…
2. Moment Coefficient of Skewness
The moment coefficient of skewness is based on moments. The formula
for calculating the coefficient of skewness is:
$\alpha_3 = \frac{M_3}{M_2^{3/2}} = \frac{M_3}{\sigma^3}$
where $M_r = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^r}{n}$
α3 > 0, the distribution is positively skewed
α3 = 0, the distribution is symmetric
α3 < 0, the distribution is negatively skewed
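A small Python sketch of the moment coefficient of skewness, computed directly from raw data, is given below. The function names are hypothetical, and the numbers 2, 3 and 7 are reused from the moments example purely for illustration.

```python
def central_moment(values, r):
    """r-th moment about the mean: sum((x - mean)**r) / n."""
    n = len(values)
    mean = sum(values) / n
    return sum((x - mean) ** r for x in values) / n

def moment_skewness(values):
    """alpha_3 = M3 / M2**(3/2)."""
    m2 = central_moment(values, 2)
    m3 = central_moment(values, 3)
    return m3 / m2 ** 1.5

alpha3 = moment_skewness([2, 3, 7])   # M2 = 14/3, M3 = 6
print(round(alpha3, 3))               # positive, so positively skewed
print(moment_skewness([1, 2, 3]))     # symmetric data give 0
```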
234. 2. Kurtosis
o Kurtosis is a measure of peakedness of a distribution, and measured
relative to the peakedness of a normal curve.
o The peakedness of a distribution can be classified into three:
o Leptokurtic: -
- A distribution having relatively high peak.
- A curve is more peaked than the normal curve .
235. Cont…
o Mesokurtic: -
- Normal peak
- The curve is properly peaked
o Platykurtic: -
- Flat-topped
- A large number of observations with low frequency are spread
across the middle interval.
237. Measures of kurtosis
The moment coefficient of kurtosis, β2:
$\beta_2 = \frac{M_4}{M_2^2}$
where $M_2$ and $M_4$ are central moments.
If β2 = 3, then the distribution is Mesokurtic.
If β2 > 3, then the distribution is Leptokurtic.
If β2 < 3, then the distribution is Platykurtic.
238. Example:
Based on the following data:
$M_0 = 1$, $M_1 = -0.6$, $M_2 = 1.6$, $M_3 = -2.4$, $M_4 = 5.8$
a) Find the coefficient of skewness and discuss the distribution type.
b) Find the coefficient of kurtosis and discuss the distribution type.
Solution
a) $\alpha_3 = \frac{M_3}{M_2^{3/2}} = \frac{-2.4}{1.6^{3/2}} \approx -1.19 < 0$, so the distribution is negatively
skewed.
b) $\beta_2 = \frac{M_4}{M_2^2} = \frac{5.8}{1.6^2} \approx 2.27 < 3$, so the curve is Platykurtic.
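The arithmetic in this example can be checked with a short Python sketch that takes the given moments as inputs. The function names are hypothetical.

```python
def skewness_from_moments(m2, m3):
    """Moment coefficient of skewness: M3 / M2**(3/2)."""
    return m3 / m2 ** 1.5

def kurtosis_from_moments(m2, m4):
    """Moment coefficient of kurtosis: M4 / M2**2."""
    return m4 / m2 ** 2

def kurtosis_type(coefficient):
    """Classify the peakedness relative to the normal curve (value 3)."""
    if coefficient > 3:
        return "Leptokurtic"
    if coefficient < 3:
        return "Platykurtic"
    return "Mesokurtic"

m2, m3, m4 = 1.6, -2.4, 5.8  # moments given in the example
skew = skewness_from_moments(m2, m3)
kurt = kurtosis_from_moments(m2, m4)
print(round(skew, 2))                     # -1.19 -> negatively skewed
print(round(kurt, 2), kurtosis_type(kurt))  # 2.27 Platykurtic
```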