1) The document discusses descriptive statistics and methods for summarizing categorical and numerical data through tables, graphs, and numerical measures.
2) Descriptive statistics are used to describe and characterize data through methods like frequency tables, measures of central tendency, and measures of variability.
3) Various graphs like bar charts, pie charts, histograms and frequency polygons are demonstrated to visually depict distributions of categorical and numerical variables.
The document discusses data distribution and presentation. It covers topics like the normal distribution curve, calculating probabilities using the standardized normal distribution table, and presenting data through tables and graphs. Specifically, it provides details on creating frequency distribution tables for qualitative and quantitative variables. It also discusses cross tabulation and different types of graphs like pie charts, simple bar charts, and multiple bar charts for presenting categorical data.
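As a rough illustration of the frequency distribution tables and cross tabulation mentioned above, the sketch below counts categories with Python's standard library. The records (sex and blood group) and the function names are invented for illustration and do not come from the summarized document.

```python
from collections import Counter

# Hypothetical records: (sex, blood group). Invented for illustration only.
records = [
    ("F", "A"), ("M", "O"), ("F", "O"), ("M", "A"),
    ("F", "B"), ("M", "O"), ("F", "A"), ("M", "AB"),
]

def frequency_table(values):
    """Map each category to (count, relative frequency)."""
    values = list(values)
    counts = Counter(values)
    total = len(values)
    return {category: (n, n / total) for category, n in counts.items()}

def cross_tabulate(pairs):
    """Count joint occurrences of (row category, column category)."""
    return Counter(pairs)

blood_groups = frequency_table(group for _, group in records)
table = cross_tabulate(records)

# One-way frequency table for a qualitative variable
for group, (count, share) in sorted(blood_groups.items()):
    print(f"{group:>2}  {count}  {share:.1%}")

# One cell of the two-way (sex by blood group) table
print("Females with group A:", table[("F", "A")])
```

The same counts are what a pie chart or bar chart of a categorical variable would display.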
Biostatistics - the application of statistical methods in the life sciences including medicine, pharmacy, and agriculture.
A working understanding of statistics is needed for practice issues that require sound decisions.
Statistics is a decision science.
Biostatistics therefore deals with data.
Biostatistics is the science of obtaining, analyzing and interpreting data in order to understand and improve human health.
Applications of Biostatistics
Design and analysis of clinical trials
Quality control of pharmaceuticals
Pharmacy practice research
Public health, including epidemiology
Genomics and population genetics
Ecology
Biological sequence analysis
Bioinformatics etc.
1. The document summarizes key concepts from a lecture on statistics for engineers, including the normal distribution, the central limit theorem, and normal approximations to the binomial and Poisson distributions.
2. It provides an example of using the normal approximation to the Poisson distribution to calculate how many pills should be ordered to ensure the probability of running out is less than 0.005.
3. The document cautions that normal approximations may provide inaccurate results if assumptions like independence are violated, as with infectious diseases. Simple approximations are not advisable if failure could have important consequences, as with estimating rare event probabilities.
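The pill-ordering example can be sketched roughly as follows. The mean demand of 100 pills is a hypothetical value chosen here, since the summary does not give the lecture's numbers, and the function names are illustrative. The sketch treats demand as Poisson, approximates it by a normal distribution with equal mean and variance, and then checks the result against the exact Poisson tail.

```python
import math
from statistics import NormalDist

def stock_normal_approx(lam, risk):
    """Smallest integer stock n with P(demand > n) < risk,
    approximating Poisson(lam) demand by N(lam, lam)."""
    z = NormalDist().inv_cdf(1 - risk)
    # Continuity correction: P(X > n) is approximated by
    # P(Z > (n + 0.5 - lam) / sqrt(lam)), so n must exceed this threshold.
    threshold = lam + z * math.sqrt(lam) - 0.5
    return math.floor(threshold) + 1

def poisson_tail(lam, n):
    """Exact P(X > n) for X ~ Poisson(lam), summing the probability mass."""
    term = math.exp(-lam)      # P(X = 0)
    cdf = term
    for k in range(1, n + 1):
        term *= lam / k        # P(X = k) from P(X = k - 1)
        cdf += term
    return 1 - cdf

mean_demand = 100              # hypothetical mean demand, not from the source
n = stock_normal_approx(mean_demand, 0.005)
print("Order at least", n, "pills")
print("Exact Poisson probability of running out:", poisson_tail(mean_demand, n))
```

Comparing the two printed values shows how close the approximation is for this mean. In line with the caution above, the comparison is only meaningful when individual demands are independent.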
This document provides information about medical statistics, including what statistics are, how they are used in medicine, and some key statistical concepts. It describes statistics as the study of collecting, organizing, summarizing, presenting, and analyzing data. Medical statistics applies these methods to areas of medicine and the health sciences such as epidemiology, public health, and clinical research. It also gives an overview of some common statistical concepts, such as descriptive versus inferential statistics, populations and samples, variables and data types, and statistical notation.
This document provides an introduction to biostatistics. It defines key biostatistics terms like data, variables, scales of measurement, and methods of data presentation. Descriptive and inferential statistics are introduced. Common measures of central tendency (mean, median, mode) and dispersion (range, standard deviation, variance) are defined for different data types. Common methods for presenting data visually, like histograms, bar graphs and box plots, are also described. The normal distribution is introduced as an important assumption for many statistical tests. Examples are provided to illustrate concepts like using z-scores to determine what proportion of values fall above or below a given cutoff from the mean.
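A z-score calculation of the kind described above might look like the following minimal sketch. The mean, standard deviation, and cutoff are hypothetical values chosen for illustration, and the variable is assumed to be normally distributed.

```python
from statistics import NormalDist

def proportion_above(cutoff, mean, sd):
    """Return the z-score of the cutoff and the proportion of a
    normal population lying above it."""
    z = (cutoff - mean) / sd
    return z, 1 - NormalDist().cdf(z)

# Hypothetical example: values with mean 120 and standard deviation 10
z, above = proportion_above(cutoff=140, mean=120, sd=10)
print(f"z = {z:.2f}, proportion above cutoff = {above:.4f}")
print(f"proportion below cutoff = {1 - above:.4f}")
```

A cutoff two standard deviations above the mean leaves roughly 2.3% of values above it, which matches the standardized normal table.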
International Journal of Mathematics and Statistics Invention (IJMSI) is an international journal intended for professionals and researchers in all fields of computer science and electronics. IJMSI publishes research articles and reviews across the whole field of Mathematics and Statistics, covering new teaching methods, assessment, validation, and the impact of new technologies, and it will continue to provide information on the latest trends and developments in this ever-expanding subject. Papers are selected for publication through double peer review to ensure originality, relevance, and readability. The articles published in the journal can be accessed online.
Biostatistics is the science of collecting, summarizing, analyzing, and interpreting data in the fields of medicine, biology, and public health. It involves both descriptive and inferential statistics. Descriptive statistics summarize data through measures of central tendency like mean, median, and mode, and measures of dispersion like range and standard deviation. Inferential statistics allow generalization from samples to populations through techniques like hypothesis testing, confidence intervals, and estimation. Sample size determination and random sampling help ensure validity and minimize errors in statistical analyses.
This document provides an introduction to biostatistics. It defines biostatistics as the branch of statistics dealing with biological data. It discusses different types of data, methods of data presentation including tables, charts and graphs. It also covers measures of central tendency and dispersion, sampling methods, tests of significance including chi-square test and t-test, and correlation and regression. The overall purpose is to introduce basic statistical concepts and methods used for analyzing health and medical data.
1. The document discusses key concepts in biostatistics including measures of central tendency, dispersion, correlation, regression, and sampling.
2. Measures of central tendency described are the mean, median, and mode. Measures of dispersion include range, standard deviation, and quartile deviation.
3. The importance of statistical analysis for living organisms in areas like medicine, biology and public health is highlighted. Examples are provided to demonstrate calculation of statistical measures.
Clinical Trials Versus Health Outcomes Research: SAS/STAT Versus SAS Enterpri... cambridgeWD
Clinical trials and health outcomes research differ in important ways that impact statistical modeling approaches. Clinical trials typically use homogeneous samples and focus on a single endpoint, while health outcomes data is heterogeneous with multiple endpoints. Predictive modeling techniques used in health outcomes research, like those in SAS Enterprise Miner, are better suited than traditional methods as they can handle complex real-world data without strong assumptions and more accurately predict rare events. Validation of models on separate test data is also important for generalizing results.
Clinical Trials Versus Health Outcomes Research: SAS/STAT Versus SAS Enterpri... cambridgeWD
This document discusses the differences between clinical trials and health outcomes research. Clinical trials use homogeneous samples, surrogate endpoints, and focus on a single outcome. They are also typically underpowered for rare events. Health outcomes research uses heterogeneous data from the general population to examine multiple real endpoints simultaneously. It has larger samples and data that allow analysis of rare occurrences. Predictive modeling is better suited than traditional statistical methods for analyzing heterogeneous health outcomes data due to relaxed assumptions like normality.
Statistics is useful in medicine for decision making, acquiring new knowledge, and conducting surveys. It is used in areas like diagnosis, treatment, and research studies. Variables measured on the human body include height, weight, and blood pressure. Descriptive statistics describe data through measures of central tendency like the mean, mode, and median, measures of variation like the standard deviation, and graphical displays such as histograms. Inferential statistics are used to draw conclusions from data through hypothesis testing.
1. The document introduces basic concepts in medical statistics including variables, frequency tables, measures of central tendency and variation.
2. It describes using histograms and frequency tables to summarize sample data and calculates measures like the mean, median, and standard deviation.
3. The document also covers relative measures such as rates and ratios, and methods for standardizing crude rates like direct standardization and indirect standardization to allow comparison between populations.
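Direct standardization, mentioned in the last item, can be sketched as below. All counts are hypothetical and the function names are illustrative. The two populations are constructed to have identical age-specific rates but different age structures, so their crude rates differ while their directly standardized rates agree.

```python
def crude_rate(events, population):
    """Total events divided by total population."""
    return sum(events) / sum(population)

def directly_standardized_rate(events, population, standard_population):
    """Apply each age-specific rate to a common standard population."""
    rates = [e / p for e, p in zip(events, population)]
    expected = sum(r * s for r, s in zip(rates, standard_population))
    return expected / sum(standard_population)

# Hypothetical data for two age groups (young, old)
pop_a, events_a = [8000, 2000], [8, 20]    # mostly young population
pop_b, events_b = [2000, 8000], [2, 80]    # mostly old population
standard = [5000, 5000]                    # assumed standard population

for name, events, pop in (("A", events_a, pop_a), ("B", events_b, pop_b)):
    crude = crude_rate(events, pop) * 1000
    adjusted = directly_standardized_rate(events, pop, standard) * 1000
    print(f"Population {name}: crude {crude:.1f}, standardized {adjusted:.1f} per 1000")
```

Indirect standardization works the other way round: it applies the rates of a standard population to the study population's age structure and compares observed with expected events.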
This document discusses various statistical concepts and their applications in clinical laboratories. It defines descriptive statistics, statistical analysis, measures of central tendency (mean, median, mode), measures of variation (variance, standard deviation), probability distributions (binomial, Gaussian, Poisson), and statistical tests (t-test, chi-square, F-test). It provides examples of how these statistical methods are used to monitor laboratory test performance, interpret results, and compare different laboratory instruments and methods.
This document provides an overview of basic statistical concepts for bio science students. It defines measures of central tendency including mean, median, and mode. It also discusses measures of dispersion like range and standard deviation. Common probability distributions such as binomial, Poisson, and normal distributions are explained. Hypothesis testing concepts like p-values and types of statistical tests for different types of data like t-tests for continuous variables and chi-square tests for categorical data are summarized along with examples.
This document provides an overview of sampling and sampling variability. It defines key terms like population, sample, sampling, and sampling unit. It discusses the need for sampling due to limitations of complete enumeration. The main types of sampling designs covered are probability sampling methods like simple random sampling, stratified random sampling, systematic random sampling, cluster sampling, and multistage sampling as well as non-probability methods. Factors affecting sample size calculation and sampling variability are also outlined.
This document provides an overview of descriptive statistics. It defines key terms like population, sample, measures of central tendency, and types of data. It discusses how to calculate and interpret the mean, median, and mode for both raw and grouped data. Examples are provided to demonstrate calculating the mean, median, and mode from raw data sets. It also discusses how to determine the mode from a grouped data set presented in a frequency distribution table, including using graphs to identify the modal class. The document covers important concepts in descriptive statistics for summarizing and describing numerical data.
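For grouped data, the mean and mode are usually estimated from the frequency table rather than from the raw values. The sketch below uses class midpoints for the mean and the common textbook interpolation formula for the mode. Both the formula choice and the frequency table are assumptions made here for illustration, since the summary does not say which method the document uses.

```python
def grouped_mean(classes):
    """classes: list of (lower, upper, frequency); uses class midpoints."""
    total = sum(f for _, _, f in classes)
    return sum((lo + hi) / 2 * f for lo, hi, f in classes) / total

def grouped_mode(classes):
    """Interpolated mode: L + (f1 - f0) / (2*f1 - f0 - f2) * h.
    Assumes equal-width classes and a single modal class."""
    freqs = [f for _, _, f in classes]
    i = freqs.index(max(freqs))                   # position of the modal class
    lower, upper, f1 = classes[i]
    f0 = freqs[i - 1] if i > 0 else 0             # frequency of the class before
    f2 = freqs[i + 1] if i < len(freqs) - 1 else 0  # frequency of the class after
    return lower + (f1 - f0) / (2 * f1 - f0 - f2) * (upper - lower)

# Hypothetical frequency distribution: (lower limit, upper limit, frequency)
table = [(0, 10, 5), (10, 20, 8), (20, 30, 12), (30, 40, 7)]

print("Grouped mean:", grouped_mean(table))
print("Grouped mode:", round(grouped_mode(table), 2))
```

The modal class here is 20 to 30, the tallest bar of the corresponding histogram, and the formula places the mode inside that class according to the heights of its neighbours.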
- Simulations of clinical trial randomization methods showed consistent trade-offs between efficiency and unpredictability over different methods and parameters. No single best method optimized both metrics.
- Two metrics were used to evaluate predictability (potential for selection bias) and efficiency (loss of statistical power): simulations revealed clear trade-offs between higher predictability and lower efficiency.
- As sample size increased, most methods became more efficient while some also became more predictable and others less predictable, depending on the method. Permuted blocks, dynamic allocation, and complete randomization were among the methods evaluated.
A training workshop that assists researchers in dealing with statistics throughout the research.
Statistics is the science of dealing with numbers.
It is used for collection, summarization, presentation & analysis of data.
This document provides an introduction to biostatistics. It defines biostatistics as the collection and analysis of data related to areas of research involving variables, observations, and their relationships. Biostatistics is concerned with vital events and factors affecting life from birth to death. It has applications in public health, clinical trials, genetics, ecology, and biological sequence analysis. The document discusses topics including population and sampling, methods of data collection, organization and presentation through classification, tabulation and graphs, measures of central tendency and dispersion, and examples of statistical calculations.
Here are the responses to the questions:
1. A statistical population is the entire set of individuals or objects of interest. A sample is a subset of the population selected to represent it. Information from the sample is used to infer the characteristics, attributes, and properties of the entire population.
2. Variance measures the average squared deviation from the mean. For a sample, it is calculated as the sum of the squared deviations from the mean divided by the number of values in the data set minus 1. Standard deviation is the square root of the variance. It measures how far data values spread out from the mean, in the original units of the data.
3. No data was provided to create graphs. Additional data on the number of fish in each age group would be needed.
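The variance definition in the second response can be checked with a few lines of code. The data set below is hypothetical. The manual calculation divides by the number of values minus 1, as described above, and is compared with Python's `statistics` module.

```python
import math
import statistics

def sample_variance(values):
    """Sum of squared deviations from the mean, divided by n - 1."""
    mean = sum(values) / len(values)
    squared_deviations = [(v - mean) ** 2 for v in values]
    return sum(squared_deviations) / (len(values) - 1)

data = [2, 4, 4, 4, 5, 5, 7, 9]     # hypothetical observations

variance = sample_variance(data)
std_dev = math.sqrt(variance)

print("Sample variance:", variance)
print("Sample standard deviation:", std_dev)
# The standard library uses the same n - 1 divisor for sample variance
print("statistics.variance:", statistics.variance(data))
```

Dividing by the number of values itself instead gives the population variance, which `statistics.pvariance` computes.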
Biostatistics research type of statics and examples7543e80ceb
This document provides an introduction to statistics, including definitions of key terms. It discusses descriptive statistics, which summarize data from a sample, and inferential statistics, which make inferences about a population from a sample. It also defines populations and samples, variables and values, and distinguishes between data and information. Different types of variables are outlined including quantitative continuous, quantitative discrete, qualitative ordinal, and qualitative nominal variables. Methods for summarizing quantitative data through measures of central tendency and dispersion are also introduced.
The document defines key concepts in sampling and summarizes different sampling methods. It discusses sampling as a procedure to select a subset of a population to make inferences about the whole population. Probability sampling methods like simple random sampling, systematic sampling, stratified sampling and cluster sampling are described. Non-probability sampling techniques such as convenience sampling, quota sampling, purposive sampling, and snowball sampling are also outlined.
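Three of the probability sampling methods named above can be sketched with the standard `random` module. The population of numbered units, the strata, and the function names are hypothetical, and the seed is fixed only so that the example is repeatable.

```python
import random

def simple_random_sample(population, n, rng):
    """Every unit has the same chance of selection."""
    return rng.sample(population, n)

def systematic_sample(population, n, rng):
    """Pick a random start, then every k-th unit (k = N // n)."""
    step = len(population) // n
    start = rng.randrange(step)
    return [population[start + i * step] for i in range(n)]

def stratified_sample(strata, fraction, rng):
    """Draw the same sampling fraction independently within each stratum."""
    return {
        name: rng.sample(units, max(1, round(len(units) * fraction)))
        for name, units in strata.items()
    }

rng = random.Random(42)                       # fixed seed for repeatability
population = list(range(1, 101))              # hypothetical units numbered 1..100
strata = {"urban": population[:60], "rural": population[60:]}

print("Simple random:", simple_random_sample(population, 10, rng))
print("Systematic:   ", systematic_sample(population, 10, rng))
print("Stratified:   ", stratified_sample(strata, 0.1, rng))
```

Non-probability methods such as convenience or snowball sampling have no comparable recipe, because selection does not follow a known chance mechanism.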
This document provides an overview of key concepts in statistics. It discusses how statistics is used to collect, organize, summarize, present, and analyze numerical data to derive valid conclusions. It defines common statistical terminology like data, quantitative vs. qualitative data, measures of central tendency (mean, median, mode), measures of variability (range, standard deviation), the normal distribution curve, and coefficient of variation. The document also explains common statistical tests like the z-test, t-test, ANOVA, chi-square test and concepts like sensitivity and specificity. Overall, the document serves as a high-level introduction to foundational statistical methods and analyses.
Paracoccidioidomycosis is a fungal infection caused by Paracoccidioides species. It primarily involves the lungs and can disseminate to other organs. The disease ranges from asymptomatic to acute or chronic forms. Diagnosis involves microscopic examination of clinical samples to identify the characteristic yeast forms and culture growth at 37°C. Treatment requires long-term antifungal therapy for 6-12 months.
This document introduces permutation methods for statistical testing. It begins with background on permutation principles and explains that most biostatistics texts only cover rank-based permutation methods but this text will cover both rank-based and non-rank-based methods. It then reviews key mathematical concepts of permutations and combinations that are important for understanding permutation methods. It provides examples of calculating permutations and combinations. Finally, it states that several permutation-based tests will be presented, with the first using original observations and the second using ranks to test different statistical concepts like correlation in a distribution-free manner.
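A permutation test on the original observations might look like the sketch below, which tests a correlation by enumerating every rearrangement of one variable. The five data pairs are hypothetical, the function names are illustrative, and this is not necessarily the procedure the text itself presents.

```python
import math
from itertools import permutations

def pearson_r(x, y):
    """Pearson correlation coefficient from the raw observations."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def exact_permutation_test(x, y):
    """Two-sided exact p-value: share of all orderings of y whose
    |r| is at least as large as the observed |r|."""
    observed = pearson_r(x, y)
    extreme = total = 0
    for shuffled in permutations(y):
        total += 1
        # Small tolerance guards against floating-point ties
        if abs(pearson_r(x, shuffled)) >= abs(observed) - 1e-12:
            extreme += 1
    return observed, extreme / total, total

# Counting arrangements: ordered and unordered selections of 2 from 5
print("Permutations:", math.perm(5, 2), "Combinations:", math.comb(5, 2))

x = [1, 2, 3, 4, 5]     # hypothetical paired observations
y = [2, 1, 4, 3, 5]
r, p_value, n_perm = exact_permutation_test(x, y)
print(f"r = {r:.2f}, exact p = {p_value:.4f} from {n_perm} permutations")
```

Replacing the observations with their ranks before calling the same function gives the rank-based version of the test. Full enumeration is only practical for small samples, because the number of permutations grows as n factorial.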
Similar to Lecture-2 (discriptive statistics).ppt (20)
Lecture-8 (Demographic Studies and Health Services Statistics).ppthabtamu biazin
This document provides an overview of key concepts in demography and health services statistics. It discusses the study of demography, including the static and dynamic aspects of populations. It also describes sources of demographic data like censuses, vital registration, and surveys. Other topics covered include demographic transition, population pyramids, vital rates like fertility and mortality rates, and population projections methods.
The chi-square test is a non-parametric method used to analyze categorical data to evaluate hypotheses about populations. It can be used for goodness of fit, independence, and homogeneity. The chi-square test involves calculating expected frequencies, verifying assumptions, selecting a significance level, computing the chi-square statistic and comparing it to a critical value to determine whether to reject or fail to reject the null hypothesis.
The document discusses t-tests and one-way ANOVA statistical tests. It provides details on how to conduct one-sample t-tests, paired t-tests, two independent sample t-tests, and one-way ANOVA. It includes the assumptions, test statistics, and procedures for each test. An example is also provided to demonstrate a one-way ANOVA comparing red blood cell folate levels between three patient groups receiving different nitrous oxide treatments.
The document provides an overview of survival analysis. It defines survival analysis as a branch of statistics that focuses on time-to-event data and their analysis. It discusses censored and truncated data, the life table method, the Kaplan-Meier estimator for estimating survival functions when there is censoring, and the Cox regression model for assessing relationships between covariates and survival times. The key aspects of survival analysis are estimating the probability of surviving past a certain time point and comparing survival distributions between groups while accounting for censored observations.
This document provides an overview of logistic regression. It begins by explaining that linear regression is not appropriate when the dependent variable is dichotomous. Logistic regression uses an S-shaped logistic function to model the probabilities of different outcomes. The logistic function transforms the non-linear probabilities into linear-looking data that can be modeled using linear regression. Examples are provided to demonstrate how logistic regression can be used to predict the probability of coronary heart disease based on age and to analyze the relationship between patient satisfaction and residence.
Linear regression was used to analyze the relationship between daily food intake (independent variable) and weight gain (dependent variable) in a sample of 20 children. The regression equation obtained was: Weight gained = 0.16 + 0.643(food weight). This indicates that for each additional 1kg of daily food intake, a child's weight increases by 0.643kg on average. The coefficient of determination (R2) was 0.81, meaning 81% of the variation in children's weight gain was explained by differences in daily food intake.
Lecture-3 Probability and probability distribution.ppthabtamu biazin
This document provides an overview of key concepts in probability and probability distributions that will be covered in the chapter. The objectives are to understand probability, the difference between probability and probability distributions, conditional probability, and different types of distributions for categorical and continuous variables. Specific distributions discussed include the normal, student t, and chi-square distributions. Examples are provided on probability, conditional probability, counting rules for permutations and combinations, sampling with and without replacement, and the binomial distribution.
Fungi constitute an important group of eukaryotic organisms including yeasts and molds. Anti-fungal drugs target differences between fungal and human cells, such as fungal cell walls and sterol composition. Major classes of anti-fungals include polyenes such as amphotericin B, azoles, and allylamines. Amphotericin B binds to ergosterol in fungal cell membranes, forming pores that disrupt membrane function. It has broad antifungal activity but can cause renal toxicity. Newer lipid formulations reduce this toxicity. Nystatin is a polyene used topically due to toxicity concerns. Griseofulvin and flucytosine inhibit fung
The document discusses opportunistic fungal infections, focusing on Aspergillosis, Candidiasis, Cryptococcosis, and other mycoses. It provides details on:
- The causative fungi and their incidence in opportunistic infections
- Clinical manifestations of various fungal infections in different organ systems like the lungs and central nervous system
- Laboratory methods for diagnosing fungal infections through microscopy, culture, serology and molecular identification
- Specific details on presentations of Aspergillosis, Candidiasis and Cryptococcosis in the lungs, skin and brain
The document discusses immunology and immunopathology of human parasitic infections. It covers:
1) Microparasites multiply within host cells and pose an immediate threat, while macroparasites (helminths) do not multiply within the host and do not present an immediate threat.
2) Infections by protozoa and helminths are long-lasting and can induce immunopathological changes over years that are more dangerous than the initial infection.
3) During any infection, dying or killed parasites can deposit molecules on host cells and elicit autoimmune responses, contributing to pathology.
5,6,7. Protein detection Western_blotting DNA sequencing.ppthabtamu biazin
1. The document describes the process of isolating and detecting proteins from various samples through cell lysis, SDS-PAGE gel electrophoresis, and western blotting. Key steps include lysing cells with detergents and inhibitors, boiling samples with loading buffer, running proteins on a gel, transferring proteins to a membrane, and detecting proteins with antibodies and chemiluminescent reagents.
2. Common components of lysis buffers and SDS loading buffers are described, as well as tips for pouring gels and troubleshooting western blots. The process allows estimation of protein molecular weights and analysis of post-translational modifications.
3. Proper controls and testing antibody specificity are emphasized for accurate analysis of western blot results.
6. aa sequencing site directed application of biotechnology.ppthabtamu biazin
Protein sequencing involves an eight step strategy to determine the amino acid sequence of a protein. The steps include separating polypeptide chains, reducing disulfide bonds, determining amino acid composition, identifying terminal residues, cleaving chains into fragments, sequencing the fragments, reconstructing the sequence from overlapping fragments, and determining disulfide bond positions. Frederick Sanger developed the first method for protein sequencing by determining the structure of insulin in 1953. Advances now allow sequencing entire proteins or genomes using techniques like mass spectrometry and determining gene sequences.
Genetic engineering involves purposefully manipulating genetic material to alter organism characteristics. There are five techniques: genetic fusion, protoplast fusion, gene amplification, recombinant DNA technology, and hybridoma creation. Genetic engineering tools include specialized enzymes, gel electrophoresis, DNA sequencing machines, RNA primers, and gene probes. The Human Genome Project, completed in 2003, mapped the human genome consisting of 20,000 to 25,000 protein-coding genes. 'Omics' fields like genomics, proteomics, and metabolomics emerged from studying entire genomes and cellular components.
The document provides an overview of real-time PCR (polymerase chain reaction). It discusses extracting RNA from tissue, converting the RNA to cDNA using reverse transcriptase, performing real-time PCR, and analyzing the results. Several key steps are described, including the importance of RNA quality, using appropriate reverse transcriptase primers and PCR primers, including necessary controls, and selecting appropriate reference standards for normalization.
2. Prokaryotic and Eukaryotic cell structure.pptxhabtamu biazin
Prokaryotic cells, which include bacteria, lack membrane-bound organelles and have no nucleus. They contain a single, circular chromosome. Eukaryotic cells have a membrane-enclosed nucleus and organelles. Prokaryotes reproduce through binary fission, while eukaryotes use mitosis or meiosis. Both prokaryotic and eukaryotic cells are surrounded by a plasma membrane and contain DNA.
This document outlines the fundamentals of microbiology, including the historical development and significance of studying microbes. It discusses key topics like the structure of prokaryotic and eukaryotic cells, bacterial taxonomy, and bacterial genetics. The objectives are to understand the historical background of microbiology, classify medically significant bacteria, describe bacterial metabolism and growth, and explain methods of disinfection.
Mycobacterium is a genus of bacteria that includes the species that cause tuberculosis (TB) and leprosy. It contains obligate parasites like Mycobacterium tuberculosis and M. leprae, which cause diseases, as well as opportunistic pathogens like non-tuberculous mycobacteria. Mycobacterium species are acid-fast bacilli with a cell wall rich in lipids, making them resistant to disinfectants and host immune responses. They can survive outside of hosts for weeks. M. tuberculosis was discovered in 1882 and is the main cause of TB, appearing as thin rods in tissue.
Staphylococcus aureus is a common cause of skin and soft tissue infections that produces several virulence factors like coagulase and toxins. It is carried in the nasopharynx and skin of healthy individuals. Streptococcus pyogenes causes a variety of infections from minor skin infections to severe invasive diseases like necrotizing fasciitis. It produces extracellular enzymes and toxins that damage tissues. Neisseria gonorrhoeae causes the sexually transmitted infection gonorrhea, while Neisseria meningitidis can cause a severe blood infection and meningitis. Both Neisseria species possess pili and capsules important for virulence.
A workshop hosted by the South African Journal of Science aimed at postgraduate students and early career researchers with little or no experience in writing and publishing journal articles.
বাংলাদেশের অর্থনৈতিক সমীক্ষা ২০২৪ [Bangladesh Economic Review 2024 Bangla.pdf] কম্পিউটার , ট্যাব ও স্মার্ট ফোন ভার্সন সহ সম্পূর্ণ বাংলা ই-বুক বা pdf বই " সুচিপত্র ...বুকমার্ক মেনু 🔖 ও হাইপার লিংক মেনু 📝👆 যুক্ত ..
আমাদের সবার জন্য খুব খুব গুরুত্বপূর্ণ একটি বই ..বিসিএস, ব্যাংক, ইউনিভার্সিটি ভর্তি ও যে কোন প্রতিযোগিতা মূলক পরীক্ষার জন্য এর খুব ইম্পরট্যান্ট একটি বিষয় ...তাছাড়া বাংলাদেশের সাম্প্রতিক যে কোন ডাটা বা তথ্য এই বইতে পাবেন ...
তাই একজন নাগরিক হিসাবে এই তথ্য গুলো আপনার জানা প্রয়োজন ...।
বিসিএস ও ব্যাংক এর লিখিত পরীক্ষা ...+এছাড়া মাধ্যমিক ও উচ্চমাধ্যমিকের স্টুডেন্টদের জন্য অনেক কাজে আসবে ...
Strategies for Effective Upskilling is a presentation by Chinwendu Peace in a Your Skill Boost Masterclass organisation by the Excellence Foundation for South Sudan on 08th and 09th June 2024 from 1 PM to 3 PM on each day.
How to Manage Your Lost Opportunities in Odoo 17 CRMCeline George
Odoo 17 CRM allows us to track why we lose sales opportunities with "Lost Reasons." This helps analyze our sales process and identify areas for improvement. Here's how to configure lost reasons in Odoo 17 CRM
The simplified electron and muon model, Oscillating Spacetime: The Foundation...RitikBhardwaj56
Discover the Simplified Electron and Muon Model: A New Wave-Based Approach to Understanding Particles delves into a groundbreaking theory that presents electrons and muons as rotating soliton waves within oscillating spacetime. Geared towards students, researchers, and science buffs, this book breaks down complex ideas into simple explanations. It covers topics such as electron waves, temporal dynamics, and the implications of this model on particle physics. With clear illustrations and easy-to-follow explanations, readers will gain a new outlook on the universe's fundamental nature.
ISO/IEC 27001, ISO/IEC 42001, and GDPR: Best Practices for Implementation and...PECB
Denis is a dynamic and results-driven Chief Information Officer (CIO) with a distinguished career spanning information systems analysis and technical project management. With a proven track record of spearheading the design and delivery of cutting-edge Information Management solutions, he has consistently elevated business operations, streamlined reporting functions, and maximized process efficiency.
Certified as an ISO/IEC 27001: Information Security Management Systems (ISMS) Lead Implementer, Data Protection Officer, and Cyber Risks Analyst, Denis brings a heightened focus on data security, privacy, and cyber resilience to every endeavor.
His expertise extends across a diverse spectrum of reporting, database, and web development applications, underpinned by an exceptional grasp of data storage and virtualization technologies. His proficiency in application testing, database administration, and data cleansing ensures seamless execution of complex projects.
What sets Denis apart is his comprehensive understanding of Business and Systems Analysis technologies, honed through involvement in all phases of the Software Development Lifecycle (SDLC). From meticulous requirements gathering to precise analysis, innovative design, rigorous development, thorough testing, and successful implementation, he has consistently delivered exceptional results.
Throughout his career, he has taken on multifaceted roles, from leading technical project management teams to owning solutions that drive operational excellence. His conscientious and proactive approach is unwavering, whether he is working independently or collaboratively within a team. His ability to connect with colleagues on a personal level underscores his commitment to fostering a harmonious and productive workplace environment.
Date: May 29, 2024
Tags: Information Security, ISO/IEC 27001, ISO/IEC 42001, Artificial Intelligence, GDPR
-------------------------------------------------------------------------------
Find out more about ISO training and certification services
Training: ISO/IEC 27001 Information Security Management System - EN | PECB
ISO/IEC 42001 Artificial Intelligence Management System - EN | PECB
General Data Protection Regulation (GDPR) - Training Courses - EN | PECB
Webinars: https://pecb.com/webinars
Article: https://pecb.com/article
-------------------------------------------------------------------------------
For more information about PECB:
Website: https://pecb.com/
LinkedIn: https://www.linkedin.com/company/pecb/
Facebook: https://www.facebook.com/PECBInternational/
Slideshare: http://www.slideshare.net/PECBCERTIFICATION
Executive Directors Chat Leveraging AI for Diversity, Equity, and InclusionTechSoup
Let’s explore the intersection of technology and equity in the final session of our DEI series. Discover how AI tools, like ChatGPT, can be used to support and enhance your nonprofit's DEI initiatives. Participants will gain insights into practical AI applications and get tips for leveraging technology to advance their DEI goals.
Executive Directors Chat Leveraging AI for Diversity, Equity, and Inclusion
Lecture-2 (discriptive statistics).ppt
1. NURSING Dream ● Discover ● Deliver
Lemma Derseh (BSc., MPH)
University of Gondar
College of Medicine and Health Science
Department of Epidemiology and Biostatistics
Descriptive statistics
2. NURSING Dream ● Discover ● Deliver
Statistical Methods (branches of statistics)
Biostatistics has two branches:
Descriptive Statistics: collecting, organizing, summarizing and presenting data
Inferential Statistics: making inferences, testing hypotheses, determining relationships and making predictions
Lemma Derseh, Department of Epidemiology and Biostatistics, University of Gondar
3. NURSING Dream ● Discover ● Deliver
Descriptive Statistics
1. Involves
– Collecting Data
– Presenting Data
– Characterizing Data
2. Purpose
– Describe Data, e.g. $\bar{x} = 74.5$, $s^2 = 213$
[Figure: bar chart of class size (vertical axis, 0 to 100) for the 1st to 4th batches of one department]
4. NURSING Dream ● Discover ● Deliver
Descriptive statistics cont…
Types of descriptive statistics
Tables/charts/graphs (pictorial measures)
Measures of central tendency (numerical summary measures)
Measures of variability (numerical summary measures)
5. NURSING Dream ● Discover ● Deliver
Tables/charts/graphs
The methods of describing data differ depending on the type of the data itself (i.e. numerical or categorical).
Tables are used for categorical variables or categorized numerical data.
Tables:
Frequency (for nominal and ordinal data)
Relative frequency (for nominal and ordinal data)
Cumulative frequency (for ordinal data)
6. NURSING Dream ● Discover ● Deliver
Describing categorical variables … cont
Frequency is the number of observations in each category
The relative frequency of a class is the proportion or percentage of the data that falls in that class.
E.g. 1: The blood types of 30 patients were recorded as follows:
A AB B B A O O AB AB B O A A B B A AB A O AB
B AB AB O A AB AB O A O
Construct a frequency table for these data.
Type    Frequency   Relative frequency
A       8           0.267
B       6           0.200
AB      9           0.300
O       7           0.233
Total   30          1.000
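The frequency table for the blood type example can be checked with a few lines of Python. This is a minimal sketch, not part of the original lecture; the variable names (`blood_types`, `counts`, `rel_freq`) are chosen here for illustration.

```python
from collections import Counter

# Blood types of the 30 patients, as listed in the example.
blood_types = (
    "A AB B B A O O AB AB B O A A B B A AB A O AB "
    "B AB AB O A AB AB O A O"
).split()

counts = Counter(blood_types)   # frequency of each category
n = len(blood_types)            # total number of observations

# Relative frequency = frequency / total, rounded to 3 decimals.
rel_freq = {t: round(counts[t] / n, 3) for t in counts}

for t in ("A", "B", "AB", "O"):
    print(f"{t:<6}{counts[t]:>4}{rel_freq[t]:>8.3f}")
print(f"{'Total':<6}{n:>4}")
```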
7. NURSING Dream ● Discover ● Deliver
Distribution of birth weight of newborns between 1976 and 1996 at TAH.
BWT        Freq.   Rel. freq. (%)   Cum. freq.   Cum. rel. freq. (%)
Very low   43      0.4              43           0.4
Low        793     8.0              836          8.4
Normal     8870    88.9             9706         97.3
Big        268     2.7              9974         100
Total      9974    100
Cumulative relative frequency is relevant for ordinal data. Consider, for example, the variable birth weight with levels ‘Very low’, ‘Low’, ‘Normal’ and ‘Big’.
The cumulative frequency of a class is the sum of the frequency for that class and all the previous classes.
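The cumulative columns of the birth weight table can be reproduced as running totals. The sketch below is illustrative only; it assumes the four frequencies given in the table and uses names (`freq`, `cum_freq`, `cum_rel_pct`) that are not in the original slides.

```python
from itertools import accumulate

# Ordered categories and frequencies from the birth weight table.
levels = ["Very low", "Low", "Normal", "Big"]
freq = [43, 793, 8870, 268]

total = sum(freq)
cum_freq = list(accumulate(freq))  # running totals, class by class
rel_freq_pct = [round(100 * f / total, 1) for f in freq]
cum_rel_pct = [round(100 * c / total, 1) for c in cum_freq]

for row in zip(levels, freq, rel_freq_pct, cum_freq, cum_rel_pct):
    print(row)
```

The running total only makes sense because the categories have a natural order, which is why cumulative frequencies are reserved for ordinal (or numerical) data.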
8. NURSING Dream ● Discover ● Deliver
Charts
Charts are used only for categorical variables
Bar charts
The successive bars are separated (not continuous)
Pie charts
Each sector of a circle indicates a category of data
9. NURSING Dream ● Discover ● Deliver
Charts cont…
Bar Chart
Bar charts display the frequency distribution for nominal or ordinal data.
The various categories into which the observations fall are represented along the horizontal axis, and the height of each bar represents the frequency (or relative frequency) of that category.
10. NURSING Dream ● Discover ● Deliver
Fig. 1 Bar chart for blood type of 30 patients
11. NURSING Dream ● Discover ● Deliver
Pie chart
A pie chart displays the frequency of nominal or ordinal variables.
The various categories of the variable are represented by the sectors of the circle.
The area of each sector is proportional to the frequency of the corresponding category of the variable.
12. NURSING Dream ● Discover ● Deliver
Fig. 3. Pie chart showing the frequency distribution of the variable blood group
13. NURSING Dream ● Discover ● Deliver
Categorizing Numeric data
In order to organize and present numeric data using tables or graphs, we need to group the dataset using the following terms:
Number of classes: the number of categories the table will have
Class limits: the range of values for each class
Lower class limit
Upper class limit
Class boundaries: the continuous range of the class, obtained by subtracting 0.5 from the lower class limit and adding 0.5 to the upper class limit (for integer data; for data recorded to one decimal place, use 0.05)
Lower class boundary
Upper class boundary
Class mark: the average of the lower and upper class limits
14. NURSING Dream ● Discover ● Deliver
Sturges' rule
Select a set of continuous, non-overlapping intervals such that each value in the set of observations can be placed in one, and only one, of the intervals.
$K = 1 + 3.322 \log_{10} n$
$W = \frac{L - S}{K}$
– Where K = number of class intervals
– n = number of observations
– W = width of the class interval
– L = the largest value
– S = the smallest value
15. NURSING Dream ● Discover ● Deliver
Sturges' rule cont…
For datasets with integer values, subtract 0.5 from the lower class limits and add 0.5 to the upper class limits to find the class boundaries.
The answer obtained by applying Sturges' rule should not be regarded as final, but should be considered as a guide only.
The number of class intervals specified by the rule should be increased or decreased for convenience and clear presentation.
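As a rough illustration of Sturges' rule, the sketch below computes K and W for a dataset of 88 observations ranging from 20 to 39 (the blood lead level example). The function name `sturges` is hypothetical, and rounding the width up to the next whole number is an assumption made here so that the classes cover the whole range; the rule itself is only a guide.

```python
import math

def sturges(n, largest, smallest):
    """Return (raw K, rounded K, raw W, rounded-up W) from Sturges' rule."""
    k_raw = 1 + 3.322 * math.log10(n)   # K = 1 + 3.322 log10(n)
    k = round(k_raw)                    # use a whole number of classes
    w_raw = (largest - smallest) / k    # W = (L - S) / K
    w = math.ceil(w_raw)                # assumption: round the width up
    return k_raw, k, w_raw, w

k_raw, k, w_raw, w = sturges(n=88, largest=39, smallest=20)
print(f"K = {k_raw:.2f} ~ {k}, W = {w_raw:.1f} ~ {w}")
```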
16. NURSING Dream ● Discover ● Deliver
Example 1
The blood lead levels, measured in μg/dl, of a sample of 88 individuals living in a region are given below (in the original slide, values shown in blue are for females and values in black are for males):
20,21,22,22,23,23,23,24,24,24,24,25,25,25,25,25,26,26,26,26,26,27,
27,27,27,27,27,28,28,28,28,28,28,28,28,29,29,29,29,29,30,30,30,30,
30,30,30,30,30,31,31,31,31,31,31,31,32,32,32,32,32,33,33,33,33,33,
33,33,34,34,34,34,35,35,35,35,36,36,36,36,36,37,37,37,37,38,38,39
Construct a frequency distribution for the data.
Solution:
$K = 1 + 3.322 \log_{10} n = 1 + 3.322 \log_{10}(88) = 7.46 \approx 7$
$W = \frac{L - S}{K} = \frac{39 - 20}{7} = \frac{19}{7} = 2.7 \approx 3$
17. NURSING Dream ● Discover ● Deliver
Solution
Blood lead level
Class limit   Class boundaries   Mi   Frequency   RF      CF   RCF
20-22         19.5-22.5          21   4           4/88    4    4/88
23-25         22.5-25.5          24   12          12/88   16   16/88
26-28         25.5-28.5          27   19          19/88   35   35/88
29-31         28.5-31.5          30   21          21/88   56   56/88
32-34         31.5-34.5          33   16          16/88   72   72/88
35-37         34.5-37.5          36   13          13/88   85   85/88
38-40         37.5-40.5          39   3           3/88    88   88/88
Where:
Mi = class mark
RF = relative frequency
CF = cumulative frequency
RCF = relative cumulative frequency
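The grouped frequency distribution can be rebuilt from the raw values as a check. The sketch below is illustrative: `value_counts` is a tally of the 88 listed blood lead levels made for this example, and the starting limit (20), width (3) and number of classes (7) are taken from the solution above.

```python
# Value -> number of occurrences among the 88 listed blood lead levels.
value_counts = {
    20: 1, 21: 1, 22: 2, 23: 3, 24: 4, 25: 5, 26: 5, 27: 6, 28: 8, 29: 5,
    30: 9, 31: 7, 32: 5, 33: 7, 34: 4, 35: 4, 36: 5, 37: 4, 38: 2, 39: 1,
}

start, width, k = 20, 3, 7   # first lower limit, class width, number of classes

rows = []
cum = 0
for i in range(k):
    lower = start + i * width        # lower class limit
    upper = lower + width - 1        # upper class limit (integer data)
    f = sum(c for v, c in value_counts.items() if lower <= v <= upper)
    cum += f                         # cumulative frequency
    mark = (lower + upper) / 2       # class mark
    rows.append((lower, upper, lower - 0.5, upper + 0.5, mark, f, cum))

for r in rows:
    print(r)
```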
18. NURSING Dream ● Discover ● Deliver
Graphs
Some examples are:
Histogram,
Frequency polygon,
Cumulative Relative Frequency Curve etc
19. NURSING Dream ● Discover ● Deliver
Histograms
Histograms are frequency distributions with continuous class intervals that have been turned into graphs.
The area of each column is proportional to the number of observations in that interval.
20. NURSING Dream ● Discover ● Deliver
Example
The distribution of the blood lead level of 88 individuals
Blood LL     No. of individuals
19.5-22.5    4
22.5-25.5    12
25.5-28.5    19
28.5-31.5    21
31.5-34.5    16
34.5-37.5    13
37.5-40.5    3
[Figure: histogram of the distribution; horizontal axis: blood lead level, with class boundaries from 19.5 to 40.5]
21. NURSING Dream ● Discover ● Deliver
Frequency polygons
Instead of drawing bars for each class interval, sometimes a single point is drawn at the midpoint of each class interval and consecutive points are joined by straight lines.
Graphs drawn in this way are called frequency polygons (line graphs).
22. NURSING Dream ● Discover ● Deliver
Frequency polygons cont…
Frequency polygon for the blood lead level of study participants
23. NURSING Dream ● Discover ● Deliver
Frequency polygon of blood lead level for males and females
Frequency polygons are superior to histograms for comparing two or more sets of data.
24. NURSING Dream ● Discover ● Deliver
Cumulative frequency curve (ogive)
The horizontal axis displays the different categories/intervals.
The vertical axis displays the cumulative (relative) frequency.
A point is placed at the true upper limit of each interval; its height represents the cumulative relative frequency associated with that interval. The points are then connected by straight lines.
Like frequency polygons, cumulative frequency curves may be used to compare sets of data.
A cumulative frequency curve can also be used to obtain percentiles of a set of data.
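Reading a percentile off an ogive amounts to linear interpolation between the plotted points. The sketch below is one way to do that for the blood lead level classes; the function name `percentile_from_ogive` and the representation of the curve as a list of (boundary, cumulative proportion) pairs are assumptions made for this illustration.

```python
# Class boundaries and cumulative frequencies for the blood lead level data.
boundaries = [19.5, 22.5, 25.5, 28.5, 31.5, 34.5, 37.5, 40.5]
cum_freq = [0, 4, 16, 35, 56, 72, 85, 88]
n = cum_freq[-1]

# Ogive points: (class boundary, cumulative relative frequency).
ogive = [(b, c / n) for b, c in zip(boundaries, cum_freq)]

def percentile_from_ogive(p):
    """Value below which a proportion p (0 <= p <= 1) of the data falls."""
    for (x0, y0), (x1, y1) in zip(ogive, ogive[1:]):
        if y0 <= p <= y1 and y1 > y0:
            # Linear interpolation along the straight segment of the curve.
            return x0 + (p - y0) / (y1 - y0) * (x1 - x0)
    raise ValueError("p must be between 0 and 1")

print(round(percentile_from_ogive(0.5), 2))  # 50th percentile (median)
```

The curve starts at the lower boundary of the first class (proportion 0) and ends at the upper boundary of the last class (proportion 1).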
25. NURSING Dream ● Discover ● Deliver
Cumulative frequency curve cont…
Cumulative relative frequency curve for the blood lead level of study participants
[Figure: cumulative relative frequency curve; vertical axis: cumulative frequency (proportion of individuals)]
The graph begins at the lower boundary of the first class and ends at the upper boundary of the last class.
26. NURSING Dream ● Discover ● Deliver
Box plots
A visual display called a box (box-and-whisker) plot can be used to convey a fair amount of information about the distribution of a set of data.
It is used as an exploratory data analysis tool.
The box shows the distance between the first and the third quartiles.
The median is marked as a line within the box.
The end lines show the minimum and maximum values, respectively.
27. NURSING Dream ● Discover ● Deliver
Box plots cont…
A box plot displays the five-number summary:
The minimum entry
Q1
Q2 (median)
Q3
The maximum entry
The quartiles are values which divide the ordered distribution into four parts such that there are an equal number of observations in each part.
Q1 = the [(n+1)/4]th ordered observation
Q2 = the [2(n+1)/4]th ordered observation
Q3 = the [3(n+1)/4]th ordered observation
28. NURSING Dream ● Discover ● Deliver
Box plots cont…
Example: Use the following age data of 15 patients to draw a box-and-whisker plot.
35 35 36 37 37 38 42 43 43 44 45 48 48 51 55
[Figure: box-and-whisker plot with Min, Q1, Q2, Q3 and Max labelled]
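The five-number summary for the 15 ages can be worked out with the positional formulas Q1 = [(n+1)/4]th, Q2 = [2(n+1)/4]th and Q3 = [3(n+1)/4]th observation. The sketch below is illustrative; the helper `quartile` is hypothetical, and the interpolation it applies when a position is fractional is an assumption (it is not needed for n = 15, where the positions are whole numbers).

```python
ages = sorted([35, 35, 36, 37, 37, 38, 42, 43, 43, 44, 45, 48, 48, 51, 55])
n = len(ages)

def quartile(j):
    """j-th quartile taken as the [j(n+1)/4]-th ordered observation."""
    pos = j * (n + 1) / 4      # 1-based position in the ordered data
    lo = int(pos)
    frac = pos - lo
    if frac == 0 or lo >= n:
        return ages[lo - 1]
    # Assumption: interpolate linearly between neighbouring observations.
    return ages[lo - 1] + frac * (ages[lo] - ages[lo - 1])

five_number = (ages[0], quartile(1), quartile(2), quartile(3), ages[-1])
print(five_number)
```

For these data the positions are the 4th, 8th and 12th ordered observations.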
29. NURSING Dream ● Discover ● Deliver
Illustration of a box plot using the age of 15 patients
Notice the distribution of data in each quarter (the distance between quartiles).
30. NURSING Dream ● Discover ● Deliver
A box plot indicating the distribution of blood lead level of individuals by sex
31. NURSING Dream ● Discover ● Deliver
Measures of central tendency
It is often useful to summarize, in a single number or statistic, the general location of the data or the point at which the data tend to cluster.
Such statistics are called measures of location or measures of central tendency.
The ones described here are the mean, median and mode.
Arithmetic mean
The arithmetic mean, usually abbreviated to ‘mean’, is the sum of the observations divided by the number of observations.
32. NURSING Dream ● Discover ● Deliver
Arithmetic Mean
a) Ungrouped mean
If $x_1, x_2, \ldots, x_n$ are $n$ observed values of a sample, then
$\bar{x} = \frac{\sum_{i=1}^{n} x_i}{n}$
Population mean: $\mu = \frac{\sum x}{N}$, if the x's are population observations
Example: Blood lead level for 88 sample individuals
$\bar{x} = \frac{\sum_{i=1}^{88} x_i}{88} = \frac{20 + 21 + 22 + \ldots + 39}{88} = 29.92$
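The ungrouped mean of the blood lead levels can be verified directly. In this sketch the 88 values are rebuilt from a tally (`value_counts`) made for illustration from the data listed in Example 1; the names are not part of the original lecture.

```python
# Value -> number of occurrences among the 88 listed blood lead levels.
value_counts = {
    20: 1, 21: 1, 22: 2, 23: 3, 24: 4, 25: 5, 26: 5, 27: 6, 28: 8, 29: 5,
    30: 9, 31: 7, 32: 5, 33: 7, 34: 4, 35: 4, 36: 5, 37: 4, 38: 2, 39: 1,
}

# Expand the tally back into the individual observations.
data = [v for v, c in value_counts.items() for _ in range(c)]

n = len(data)
mean = sum(data) / n   # sum of observations divided by their number
print(n, sum(data), round(mean, 2))
```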
33. NURSING Dream ● Discover ● Deliver
Arithmetic Mean cont…
b) Grouped data
In calculating the mean from grouped data, we assume that all values falling into a particular class interval are located at the midpoint of the interval. It is calculated as follows:
$\bar{x} = \frac{\sum_{i=1}^{k} m_i f_i}{\sum_{i=1}^{k} f_i}$
where,
k = the number of class intervals
mi = the midpoint of the ith class interval
fi = the frequency of the ith class interval
34. NURSING Dream ● Discover ● Deliver
Arithmetic Mean cont…
Example: Arithmetic mean for grouped data of blood lead level
Blood lead level (CB)   Class mark (Mi)   Frequency
19.5-22.5               21                4
22.5-25.5               24                12
25.5-28.5               27                19
28.5-31.5               30                21
31.5-34.5               33                16
34.5-37.5               36                13
37.5-40.5               39                3
$\bar{x} = \frac{\sum_{i=1}^{7} m_i f_i}{\sum_{i=1}^{7} f_i} = \frac{(21 \times 4) + (24 \times 12) + \ldots + (39 \times 3)}{4 + 12 + \ldots + 3} = \frac{2628}{88} = 29.86$
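The grouped mean can be checked by weighting each class mark by its frequency. This is a minimal sketch using the class marks and frequencies from the table; the variable names are chosen here for illustration.

```python
# Class marks (midpoints) and frequencies for the blood lead level classes.
marks = [21, 24, 27, 30, 33, 36, 39]
freq = [4, 12, 19, 21, 16, 13, 3]

weighted_sum = sum(m * f for m, f in zip(marks, freq))  # sum of m_i * f_i
n = sum(freq)                                           # sum of f_i
grouped_mean = weighted_sum / n
print(weighted_sum, n, round(grouped_mean, 2))
```

The result differs slightly from the ungrouped mean of 29.92 because every value in a class is treated as if it sat at the class midpoint.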
35. NURSING Dream ● Discover ● Deliver
Properties of the arithmetic mean
The mean can be used as a summary measure for both discrete and continuous data; in general, however, it is not appropriate for either nominal or ordinal data.
For a given set of data there is one and only one arithmetic mean.
The algebraic sum of the deviations of the given values from their arithmetic mean is always zero.
The arithmetic mean is greatly affected by extreme values.
In grouped data, if any class interval is open-ended, the arithmetic mean cannot be calculated.
36. NURSING Dream ● Discover ● Deliver
Median
With the observations arranged in increasing or decreasing order, the median is defined as the middle observation.
Ungrouped data
If the number of observations is odd, the median is the [(n+1)/2]th observation.
If the number of observations is even, the median is the average of the two middle values, the (n/2)th and [(n/2)+1]th observations.
Example, where n is even: 19, 20, 20, 21, 22, 24, 27, 27, 27, 34
Then, the median = (22 + 24)/2 = 23
The ungrouped median for the blood lead level data is the average of the 44th and 45th observations, which is (30 + 30)/2 = 30
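The odd and even cases of the ungrouped median can be sketched as a small function. The name `median` and the second test dataset (the 15 patient ages from the box plot example) are chosen here for illustration.

```python
def median(values):
    """Middle observation of the ordered data (average of the two middle ones if n is even)."""
    ordered = sorted(values)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]                      # the [(n+1)/2]th observation
    return (ordered[mid - 1] + ordered[mid]) / 2  # (n/2)th and [(n/2)+1]th

even_example = [19, 20, 20, 21, 22, 24, 27, 27, 27, 34]
odd_example = [35, 35, 36, 37, 37, 38, 42, 43, 43, 44, 45, 48, 48, 51, 55]
print(median(even_example), median(odd_example))
```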
37. NURSING Dream ● Discover ● Deliver
Median Cont…
Grouped data
In calculating the median from grouped data, we assume that the values within a class interval are evenly distributed through the interval.
– The first step is to locate the class interval that contains the median.
– Find n/2 and identify the class interval with the smallest cumulative frequency that is at least n/2; this is the median class.
(Note: every later class interval also has a cumulative frequency ≥ n/2, so only the first such interval is taken as the median class.)
38. NURSING Dream ● Discover ● Deliver
Median for Grouped data …cont
To find a unique median value, use the following interpolation formula:
\[ \tilde{x} = L_m + \left( \frac{\frac{n}{2} - F_c}{f_m} \right) W \]
where,
L_m = lower true class boundary of the interval containing the median
F_c = cumulative frequency of the interval just below the median class
interval
f_m = frequency of the interval containing the median
W = class interval width
n = total number of observations
39. NURSING Dream ● Discover ● Deliver
Median for grouped data cont…
Example
Using the data on the blood lead level of 88 individuals, the
grouped median is:
\[ \tilde{x} = L_m + \left( \frac{\frac{n}{2} - F_c}{f_m} \right) W = 28.5 + \left( \frac{44 - 35}{21} \right) \times 3 = 29.79 \]
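The interpolation formula for the grouped median can be sketched in Python as follows. The function and parameter names are illustrative assumptions. The numbers passed in are the ones used in the blood lead level example (lower boundary 28.5, cumulative frequency 35 below the median class, median-class frequency 21, width 3, n = 88).

```python
def grouped_median(lower_boundary, cum_freq_below, class_freq, width, n):
    """Interpolated median for grouped data.

    lower_boundary : lower true class boundary of the median class (L_m)
    cum_freq_below : cumulative frequency just below the median class (F_c)
    class_freq     : frequency of the median class (f_m)
    width          : class interval width (W)
    n              : total number of observations
    """
    if class_freq <= 0:
        raise ValueError("the median class must have a positive frequency")
    return lower_boundary + ((n / 2 - cum_freq_below) / class_freq) * width


blood_lead_median = grouped_median(28.5, 35, 21, 3, 88)
print(round(blood_lead_median, 2))  # 29.79
```

The sketch assumes the median class has already been located. It only performs the interpolation step within that class.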
40. NURSING Dream ● Discover ● Deliver
Properties of median
The median can be used as a summary measure for
ordinal, discrete and continuous data. In general,
however, it is not appropriate for nominal data.
There is only one median for a given set of data
Median is a positional average and hence it is not
drastically affected by extreme values (It is robust or
resistant to extreme values)
Median can be calculated even in the case of open end
intervals
It is not a good representative of data if the number of
items is small
41. NURSING Dream ● Discover ● Deliver
Mode
Any observation of a variable at which the distribution reaches a
peak is called a mode.
Most distributions encountered in practice have one peak and
are described as unimodal.
E.g. Consider the example of ten numbers
19 21 20 20 34 22 24 27 27 27
In the above data set, the mode is 27
The mode of grouped data usually refers to the modal class
(the class interval with the highest frequency).
If a single value for the mode of grouped data must be
specified, it is taken as the midpoint of the modal class interval.
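For ungrouped data, the mode can be found by counting how often each value occurs. The sketch below uses `multimode` from Python's standard `statistics` module, which returns every most-frequent value. The second data set is a made-up illustration of a case where the mode is not unique.

```python
from statistics import multimode

# The ten numbers from the example above
values = [19, 21, 20, 20, 34, 22, 24, 27, 27, 27]
modes = multimode(values)
print(modes)  # [27]

# Hypothetical data with two equally frequent values (bimodal)
bimodal = multimode([1, 1, 2, 2, 3])
print(bimodal)  # [1, 2]
```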
42. NURSING Dream ● Discover ● Deliver
Properties of mode
The mode can be used as a summary measure for
nominal, ordinal, discrete and continuous data. In general,
however, it is more appropriate for nominal and ordinal
data.
It is not affected by extreme values
It can be calculated for distributions with open end classes
Sometimes its value is not unique
The main drawback of mode is that it may not exist
43. NURSING Dream ● Discover ● Deliver
Measures of variability (Dispersion)
In order to fully understand the nature of the distribution of a data set,
both measures of location and measures of dispersion are important.
Some measures of variability are: range, inter-quartile range,
variance, standard deviation and the coefficient of variation.
Range:
The range is the difference between the largest and the smallest
observations in the data set.
Being determined by only the two extreme observations, use of the
range is limited because it tells us nothing about how the data
between the extremes are spread.
Example 1: We use the data set of 10 numbers:
19, 21, 20, 20, 34, 22, 24, 27, 27, 27
The range = 34 – 19 = 15
44. NURSING Dream ● Discover ● Deliver
Quartiles and Inter-quartile Range, Percentiles
• The inter-quartile range (IQR) is the difference between the
third and the first quartiles.
Q3 – Q1
• Example: Consider the age data of 15 patients to find the IQR:
35 35 36 37 37 38 42 43 43 44 45 48 48 51 55
Q1 = 37, Q2 = 43, Q3 = 48
• IQR = Q3 – Q1 = 48 – 37 = 11
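The quartiles of the age data can be checked with a short Python sketch. It uses `quantiles` from the standard `statistics` module with the "exclusive" method, which places the quartiles at the (n+1)/4, 2(n+1)/4 and 3(n+1)/4 positions. That this matches the convention used on the slide is an assumption. Other quartile conventions can give slightly different values.

```python
from statistics import quantiles

ages = [35, 35, 36, 37, 37, 38, 42, 43, 43, 44, 45, 48, 48, 51, 55]

# n=4 splits the ordered data into four parts, returning Q1, Q2 and Q3
q1, q2, q3 = quantiles(ages, n=4, method="exclusive")
iqr = q3 - q1

print(q1, q2, q3)  # 37.0 43.0 48.0
print(iqr)         # 11.0
```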
45. NURSING Dream ● Discover ● Deliver
Quartiles and Inter-quartile Range, Percentiles
Percentiles divide the ordered data into 100 parts, with an equal
number of observations in each part.
It follows that the 25th percentile is the first quartile, the 50th
percentile is the median and the 75th percentile is the third
quartile.
46. NURSING Dream ● Discover ● Deliver
Variance
A good measure of dispersion should make use of all the data.
Intuitively, a good measure could be derived by combining, in
some way, the deviations of each observation from the mean.
The variance achieves this by averaging the squared
deviations from the mean.
47. NURSING Dream ● Discover ● Deliver
Variance cont…
The population variance of a population data set of N entries is
\[ \sigma^2 = \frac{\sum_{i=1}^{N} (x_i - \mu)^2}{N} \]
The sample variance of the set x1, x2, ..., xn of n
observations with mean x̄ is
\[ S^2 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1} \]
Note: The sum of the deviations from the mean is always zero, so the
deviations are squared before they are added and averaged to obtain
the variance.
48. NURSING Dream ● Discover ● Deliver
Standard Deviation
Because it is based on squared deviations, the variance is limited as
a descriptive statistic: it is not in the same units as
the observations.
By taking the square root of the variance, we obtain a
measure of dispersion in the original units.
It is usually denoted by s.d. or simply S, and the formula is
given by:
\[ S = \sqrt{\frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1}} \]
49. NURSING Dream ● Discover ● Deliver
Examples
Example 1: Let us use the age data of 15 individuals:
35 35 36 37 37 38 42 43 43 44 45 48 48 51 55
\[ S^2 = \frac{\sum_{i=1}^{15} (x_i - \bar{x})^2}{15 - 1} = 38.12, \quad \text{where } \bar{x} = 42.47 \]
Example 2: Consider the blood lead level data of 88
individuals given before. Find its variance.
Solution
\[ \bar{x} = \frac{\sum_{i=1}^{n} x_i}{n} = \frac{20 + 21 + 22 + \dots + 39}{88} = 29.86 \]
\[ S^2 = \frac{\sum_{i=1}^{88} (x_i - \bar{x})^2}{88 - 1} = 20.46 \]
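The age-data calculation in Example 1 can be reproduced step by step. The following Python sketch applies the sample variance and standard deviation formulas directly. The variable names are illustrative.

```python
import math

ages = [35, 35, 36, 37, 37, 38, 42, 43, 43, 44, 45, 48, 48, 51, 55]

n = len(ages)
mean = sum(ages) / n

# Square each deviation from the mean, then divide the total by n - 1
squared_deviations = [(x - mean) ** 2 for x in ages]
sample_variance = sum(squared_deviations) / (n - 1)

# The standard deviation is the square root of the variance
sample_sd = math.sqrt(sample_variance)

print(round(mean, 2))             # 42.47
print(round(sample_variance, 2))  # 38.12
print(round(sample_sd, 2))        # 6.17
```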
50. NURSING Dream ● Discover ● Deliver
Coefficient of variation
When we want to compare the variability in two sets of data, the
standard deviation, which measures absolute variation, may
mislead us, especially if the two data sets:
have different units of measurement, or
have widely different means
The coefficient of variation (CV) gives relative variation and is the
best measure for comparing the variability in two sets of data.
CV is the ratio of the standard deviation to the mean, often
presented multiplied by 100%:
\[ CV = \frac{S}{\bar{x}} \times 100\% \]
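As a rough illustration, the coefficient of variation of the age data of 15 patients used earlier can be computed as the sample standard deviation divided by the mean, times 100. The helper name `coefficient_of_variation` is an assumption made for this sketch.

```python
from statistics import mean, stdev


def coefficient_of_variation(values):
    """Sample standard deviation as a percentage of the mean."""
    return stdev(values) / mean(values) * 100


ages = [35, 35, 36, 37, 37, 38, 42, 43, 43, 44, 45, 48, 48, 51, 55]
cv_ages = coefficient_of_variation(ages)
print(round(cv_ages, 2))  # 14.54
```

Because the CV is unit-free, the same function could be applied to a second variable measured in different units and the two percentages compared directly.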
51. NURSING Dream ● Discover ● Deliver
Mean, standard deviation and the
normal distribution
For unimodal, moderately symmetrical sets of data
(i.e. approximately normally distributed data), approximately:
68% of observations lie within 1 standard deviation of
the mean.
95% of observations lie within 2 standard deviations of
the mean.
52. NURSING Dream ● Discover ● Deliver
The Empirical Rule
[Figure: normal curve with the horizontal axis marked at x̄ – 3s, x̄ – 2s, x̄ – s, x̄, x̄ + s, x̄ + 2s and x̄ + 3s. Area labels on each side of the mean: 34% between x̄ and 1s, 13.5% between 1s and 2s, 2.4% between 2s and 3s, 0.1% beyond 3s.]
68% of data are within 1 standard deviation of the mean
95% of data are within 2 standard deviations of the mean
99.7% of data are within 3 standard deviations of the mean
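The percentages in the empirical rule can be checked against the standard normal distribution. The sketch below uses `NormalDist` from Python's standard `statistics` module. The helper name `proportion_within` is illustrative.

```python
from statistics import NormalDist

standard_normal = NormalDist(mu=0, sigma=1)


def proportion_within(k):
    """Proportion of a normal distribution within k standard deviations of the mean."""
    return standard_normal.cdf(k) - standard_normal.cdf(-k)


for k in (1, 2, 3):
    print(k, round(proportion_within(k), 4))
# 1 0.6827
# 2 0.9545
# 3 0.9973
```

The rounded figures of 68%, 95% and 99.7% are approximations to these values, and they apply only to data that are roughly normally distributed.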
56. NURSING Dream ● Discover ● Deliver
Choosing Appropriate measures
If data are symmetric, with no serious outliers, use mean
and standard deviation.
If data are skewed, and/or have serious outliers, use IQR
and median.
If comparing variation across two variables, use the coefficient
of variation if the variables are in different units and/or
scales, or if their means differ substantially.
If the scales/units and means are roughly the same, direct
comparison of the standard deviations is fine.
57. NURSING Dream ● Discover ● Deliver
Fig. 2(a). Symmetric distribution: Mean = Median = Mode
Fig. 2(b). Distribution skewed to the right: Mean > Median > Mode
Fig. 2(c). Distribution skewed to the left: Mean < Median < Mode
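The ordering for a right-skewed distribution can be illustrated with a small made-up data set. The values below are hypothetical, chosen only so that a few large observations pull the mean above the median. They are not taken from the slides.

```python
from statistics import mean, median, multimode

# Hypothetical right-skewed data: most values are small, a few are large
skewed_right = [1, 2, 2, 2, 3, 3, 4, 5, 9, 15]

data_mean = mean(skewed_right)
data_median = median(skewed_right)
data_mode = multimode(skewed_right)[0]

print(data_mean, data_median, data_mode)  # 4.6 3.0 2
```

Here the mean (4.6) exceeds the median (3.0), which exceeds the mode (2), as in Fig. 2(b). This is also why the median is preferred for skewed data.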
Editor's Notes
page 79 of text
Some students have difficulty understanding the idea of ‘within one standard deviation of the mean’. Emphasize that this means the interval from one standard deviation below the mean to one standard deviation above the mean.