Screening is the application of a test or procedure to a large number of people who have no symptoms of a particular disease, in order to determine their likelihood of having the disease.
In this slide you will learn about:
what is screening
types and uses of screening
difference between screening and diagnostic tests
criteria of screening
evaluation of screening tests
What is Cohort?
Indication and Elements of Cohort Study.
What is Relative risk and Attributable risk, and its interpretation?
Advantages & disadvantages of Cohort study.
Difference between Case control & Cohort study.
Medical research uses many epidemiological study designs.
The study design is determined by the nature of the research question.
Descriptive study, analytical study, exploratory study
Clinical trial, cross-sectional study, case-control study (diseased and non-diseased groups), cohort study (exposed and non-exposed groups)
A study design is a specific plan or protocol for conducting the study, which allows the investigator to translate the conceptual hypothesis into an operational one
Screening for disease or Early detection of disease is detecting a disease at an earlier stage than would usually occur in standard clinical practice.
This denotes detecting disease at a pre-symptomatic stage, at which point the patient has no clinical complaint (no symptoms or signs) and therefore no reason to seek medical care for the condition.
The underlying assumption is that early detection of disease is beneficial and that intervention at an earlier stage of the disease process is more effective or easier to implement than a later intervention.
Screening for diseases from community medicine. It explains the definition of screening, lead time, uses of screening, differences between screening and diagnostic test, criteria for a disease to be screened and criteria for a screening test, cut-off points, etc
Frequency Measures Used in Epidemiology
Introduction
In epidemiological studies, many qualitative variables have only two possible categories, such as
Alive or dead
Case or control
Exposed and unexposed
The frequency measures for dichotomous variable include:
Ratio
Proportion
Rate
(All three of the above measures are based on the same formula: (x / y) × 10^n)
Ratios, Proportion, and Rates Compared
In a ratio, the values of x and y may be completely independent from each other or x is a part of y
For example , the gender of the children attending a specific program could be compared in one of the following ways:
Proportion is a ratio in which X is included in Y
For example , the gender of the children attending a specific program
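Rate is a proportion that measures the occurrence of an event in a population over time
Rate = (number of events occurring during a given time period / population at risk during the same time period) × 10^n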
Ratios, Proportion, and Rates Compared
Example 1: The following table was part of an article published by Dr. Mshana and his colleagues. The title of this study is “Outbreak of a novel Enterobacter sp. carrying blaCTX-M-15 in a neonatal unit of a tertiary care hospital in Tanzania. ". Please use this table to answer the following questions.
Source: Mshana SE, Gerwing L, Minde M, Hain T, Domann E, Lyamuya E, et al. Outbreak of a novel Enterobacter sp. carrying blaCTX-M-15 in a neonatal unit of a tertiary care hospital in Tanzania. International journal of antimicrobial agents. 2011;38(3):265-9.
Example 1
What is the ratio of males to females? 7 : 10
What proportion of babies were premature? 12/17 = 0.706
What proportion of patients were discharged? 11/17 = 0.647
What is the ratio of prematurity to birth asphyxia? 12 : 5
Example 2:
In 1989, 733,151 new cases of gonorrhea were reported among the United States civilian population. The 1989 mid-year U.S. civilian population was estimated to be 246,552,000. What is the 1989 gonorrhea incidence rate for the U.S. civilian population? (For these data we will use a value of 10^5 for 10^n.)
Answer:
Incidence rate = (x / y) × 10^n
Incidence rate = (733,151 / 246,552,000) × 10^5 = 297.4 per 100,000
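The arithmetic in Example 2 can be checked with a short Python sketch. The function name `rate_per` and the variable names are illustrative assumptions, not part of the original slides; the figures are the ones given in the example.

```python
def rate_per(events, population, n=5):
    """Return (events / population) * 10**n, the general frequency-measure formula."""
    return events / population * 10 ** n


# Figures from Example 2: 1989 gonorrhea cases and mid-year U.S. civilian population.
new_cases = 733_151
midyear_population = 246_552_000

incidence = rate_per(new_cases, midyear_population, n=5)
print(f"{incidence:.1f} per 100,000")  # prints "297.4 per 100,000"
```

Changing `n` only changes the population unit the rate is expressed in (for example, `n=3` gives cases per 1,000).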
Measures of association:
They are used to quantify the relationship between exposure and disease among two groups
They are used to compare the disease occurrence in one group with the disease occurrence in another group
They include the following measures based on the study design:
Risk Ratio (RR):
It is also called relative risk
It is used to compare the risk of health-related events in two groups
The following formula is used to calculate the RR: RR = (risk in the group of primary interest) / (risk in the comparison group)
A risk ratio of 1.0 indicates identical risk in the two groups
A risk ratio greater than 1.0 indicates an increased risk for the numerator group
A risk ratio greater than 1.0 ...
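As a minimal sketch of how a risk ratio is computed and read, the Python snippet below divides the risk in one group by the risk in a comparison group. The counts are hypothetical and chosen only for illustration; they do not come from the slides, and the function names are assumptions.

```python
def risk(cases, group_size):
    """Risk = number of new cases divided by the number of people in the group."""
    return cases / group_size


def risk_ratio(cases_a, size_a, cases_b, size_b):
    """Risk in the group of primary interest (a) divided by risk in the comparison group (b)."""
    return risk(cases_a, size_a) / risk(cases_b, size_b)


# Hypothetical counts: 20 of 100 exposed people and 10 of 100 unexposed people become ill.
rr = risk_ratio(20, 100, 10, 100)
print(f"Risk ratio = {rr:.2f}")

if abs(rr - 1.0) < 1e-9:
    print("Identical risk in the two groups")
elif rr > 1.0:
    print("Increased risk for the numerator group")
else:
    print("Decreased risk for the numerator group")
```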
Statistics is a scientific study of numerical data based on natural phenomena.
It is also the science of collecting, organizing, interpreting and reporting data.
PHASE 1/ Option 2 SCENARIO NCLEX MEMORIAL HOSPITAL
Name: Rodney Wheeler
Institution: Rasmussen College
Course: STA3215 Section 01 Inferential Statistics and Analytics
Date: 02/17/17
Introduction
The scenario I will be working with is that I am employed at NCLEX Memorial Hospital in the infectious disease unit. As a healthcare professional, I need to work to improve the health of individuals, families and communities in various settings. The situation that has raised concern at the hospital is that, in the past few days, there has been an increase in patients admitted with a particular infectious disease. Basic statistical analysis shows that the disease does not affect minors, so the ages of the infected patients play a critical role in choosing the treatment needed to improve the health and well-being of the clients being served, whether they are infected with the disease or associated with those infected. After speaking to the manager, we decided to work together, using the available statistical analysis to look more closely at the ages of the infected patients. To do that, I put together a spreadsheet containing the data needed to carry out the analysis.
Data Analysis
From the data collected and entered in an Excel sheet, there are sixty patients with the infectious disease. Among the patients whose data have already been collected and entered in the Excel sheet, the ages range from thirty-five to seventy-six years. There is only one patient in their thirties, aged thirty-five. There are five patients in their forties: one at forty-five, one at forty-six, two at forty-eight and two at forty-nine. There are fifteen patients in their fifties: two at fifty, one at fifty-two, one at fifty-three, one at fifty-four, four at fifty-five, one at fifty-six, one at fifty-eight and four at fifty-nine. There are twenty-three patients in their sixties: five at sixty, one at sixty-two, one at sixty-three, two at sixty-four, one at sixty-five, three at sixty-eight and seven at sixty-nine. Finally, there are fifteen infected patients in their seventies: six at seventy, three at seventy-one, three at seventy-two, one at seventy-three, one at seventy-four and one at seventy-six. In the graph in Figure 1 below, the horizontal axis shows the age group of patients infected with the disease and the vertical axis shows the number of infected patients in each age group.
Figure 1
Data Classification
The qualitative variables in our data analysis would be the names of the patients infected with the disease while the quantitative data would be their ages, number of patients in each age category or age bracket that are infected with the disease and the number of patients in each specific age that are affect ...
ARTIFICIAL INTELLIGENCE IN HEALTHCARE.pdfAnujkumaranit
Artificial intelligence (AI) refers to the simulation of human intelligence processes by machines, especially computer systems. It encompasses tasks such as learning, reasoning, problem-solving, perception, and language understanding. AI technologies are revolutionizing various fields, from healthcare to finance, by enabling machines to perform tasks that typically require human intelligence.
micro teaching on communication m.sc nursing.pdfAnurag Sharma
Microteaching is a unique model of practice teaching. It is a viable instrument for the desired change in the teaching behavior or the behavior potential which, in specified types of real classroom situations, tends to facilitate the achievement of specified types of objectives.
Lung Cancer: Artificial Intelligence, Synergetics, Complex System Analysis, S...Oleg Kshivets
RESULTS: Overall life span (LS) was 2252.1±1742.5 days and cumulative 5-year survival (5YS) reached 73.2%, 10 years – 64.8%, 20 years – 42.5%. 513 LCP lived more than 5 years (LS=3124.6±1525.6 days), 148 LCP – more than 10 years (LS=5054.4±1504.1 days).199 LCP died because of LC (LS=562.7±374.5 days). 5YS of LCP after bi/lobectomies was significantly superior in comparison with LCP after pneumonectomies (78.1% vs.63.7%, P=0.00001 by log-rank test). AT significantly improved 5YS (66.3% vs. 34.8%) (P=0.00000 by log-rank test) only for LCP with N1-2. Cox modeling displayed that 5YS of LCP significantly depended on: phase transition (PT) early-invasive LC in terms of synergetics, PT N0—N12, cell ratio factors (ratio between cancer cells- CC and blood cells subpopulations), G1-3, histology, glucose, AT, blood cell circuit, prothrombin index, heparin tolerance, recalcification time (P=0.000-0.038). Neural networks, genetic algorithm selection and bootstrap simulation revealed relationships between 5YS and PT early-invasive LC (rank=1), PT N0—N12 (rank=2), thrombocytes/CC (3), erythrocytes/CC (4), eosinophils/CC (5), healthy cells/CC (6), lymphocytes/CC (7), segmented neutrophils/CC (8), stick neutrophils/CC (9), monocytes/CC (10); leucocytes/CC (11). Correct prediction of 5YS was 100% by neural networks computing (area under ROC curve=1.0; error=0.0).
CONCLUSIONS: 5YS of LCP after radical procedures significantly depended on: 1) PT early-invasive cancer; 2) PT N0--N12; 3) cell ratio factors; 4) blood cell circuit; 5) biochemical factors; 6) hemostasis system; 7) AT; 8) LC characteristics; 9) LC cell dynamics; 10) surgery type: lobectomy/pneumonectomy; 11) anthropometric data. Optimal diagnosis and treatment strategies for LC are: 1) screening and early detection of LC; 2) availability of experienced thoracic surgeons because of complexity of radical procedures; 3) aggressive en block surgery and adequate lymph node dissection for completeness; 4) precise prediction; 5) adjuvant chemoimmunoradiotherapy for LCP with unfavorable prognosis.
Ethanol (CH3CH2OH), or beverage alcohol, is a two-carbon alcohol
that is rapidly distributed in the body and brain. Ethanol alters many
neurochemical systems and has rewarding and addictive properties. It
is the oldest recreational drug and likely contributes to more morbidity,
mortality, and public health costs than all illicit drugs combined. The
5th edition of the Diagnostic and Statistical Manual of Mental Disorders
(DSM-5) integrates alcohol abuse and alcohol dependence into a single
disorder called alcohol use disorder (AUD), with mild, moderate,
and severe subclassifications (American Psychiatric Association, 2013).
In the DSM-5, all types of substance abuse and dependence have been
combined into a single substance use disorder (SUD) on a continuum
from mild to severe. A diagnosis of AUD requires that at least two of
the 11 DSM-5 behaviors be present within a 12-month period (mild
AUD: 2–3 criteria; moderate AUD: 4–5 criteria; severe AUD: 6–11 criteria).
The four main behavioral effects of AUD are impaired control over
drinking, negative social consequences, risky use, and altered physiological
effects (tolerance, withdrawal). This chapter presents an overview
of the prevalence and harmful consequences of AUD in the U.S.,
the systemic nature of the disease, neurocircuitry and stages of AUD,
comorbidities, fetal alcohol spectrum disorders, genetic risk factors, and
pharmacotherapies for AUD.
The prostate is an exocrine gland of the male mammalian reproductive system
It is a walnut-sized gland that forms part of the male reproductive system and is located in front of the rectum and just below the urinary bladder
Its function is to store and secrete a clear, slightly alkaline fluid that constitutes 10-30% of the volume of the seminal fluid, which, along with the spermatozoa, constitutes semen
A healthy human prostate measures about 4 cm (vertical) by 3 cm (horizontal) by 2 cm (antero-posterior).
It surrounds the urethra just below the urinary bladder. It has anterior, median, posterior and two lateral lobes
Its function is regulated by androgens, which are responsible for male sex characteristics
Generalised disease of the prostate due to hormonal derangement, which leads to non-malignant enlargement of the gland (increase in the number of epithelial cells and stromal tissue) and compression of the urethra, causing lower urinary tract symptoms (LUTS)
Knee anatomy and clinical tests 2024.pdfvimalpl1234
This includes all relevant anatomy and clinical tests compiled from standard textbooks (Campbell, Netter, etc.). It is comprehensive and best suited for orthopaedicians and orthopaedic residents.
Explore natural remedies for syphilis treatment in Singapore. Discover alternative therapies, herbal remedies, and lifestyle changes that may complement conventional treatments. Learn about holistic approaches to managing syphilis symptoms and supporting overall health.
Acute scrotum is a general term referring to an emergency condition affecting the contents or the wall of the scrotum.
There are a number of conditions that present acutely, predominantly with pain and/or swelling
A careful and detailed history and examination, and in some cases, investigations allow differentiation between these diagnoses. A prompt diagnosis is essential as the patient may require urgent surgical intervention
Testicular torsion refers to twisting of the spermatic cord, causing ischaemia of the testicle.
Testicular torsion results from inadequate fixation of the testis to the tunica vaginalis producing ischemia from reduced arterial inflow and venous outflow obstruction.
The prevalence of testicular torsion in adult patients hospitalized with acute scrotal pain is approximately 25 to 50 percent
Ozempic: Preoperative Management of Patients on GLP-1 Receptor Agonists Saeid Safari
Preoperative Management of Patients on GLP-1 Receptor Agonists like Ozempic and Semaglutide
ASA GUIDELINE
NYSORA Guideline
2 Case Reports of Gastric Ultrasound
3. CONTENTS
INTRODUCTION
BASIC CONSIDERATIONS
MEASURES OF FREQUENCY
MEASURES OF CENTRAL TENDENCY
MEASURES OF DISPERSION
MEASURES OF LOCATION
MEASURE OF SHAPE
BOX-AND-WHISKER PLOTS
SUMMARY
CONCLUSION
PUBLIC HEALTH SIGNIFICANCE
REFERENCE
4. INTRODUCTION
Nearly every day, statistics are used to support assertions about health
and what people can do to improve their health, such as the roles of
diet, exercise, the environment, etc.
Because the effects are often small and vary greatly from person to
person, an understanding of statistics and how it allows researchers to
draw conclusions from data is essential for every person
interested in public health.
Statistics play a crucial role in research, planning and decision making
in the health sciences.
Perez S, Ruizb M. Descriptive statistics .Allergol Immunopathol.2009;37(6):314–320
Kaur P, Stoltzfus J, Yellapu V. Descriptive statistics.Int J Acad Med 2018;4:60-63
5. INTRODUCTION
Statistics is the set of procedures and techniques used to collect, organise
and analyse data, which are the basis for making decisions in situations of
uncertainty.
Data generally consist of an extensive number of measurements or
observations that are too numerous or complicated to be understood
through simple observation.
There are ways to condense and organize information into a set of
descriptive measures and visual devices that enhance the understanding
of complex data.
Perez S, Ruizb M. Descriptive statistics .Allergol Immunopathol.2009;37(6):314–320
Kaur P, Stoltzfus J, Yellapu V. Descriptive statistics.Int J Acad Med 2018;4:60-63
6. BASIC CONSIDERATIONS
Statistics are divided into descriptive and inferential.
Descriptive statistics refers to the collection, presentation, description,
analysis and interpretation of data collected.
Its purpose is to summarize the findings from a set of values
and to help form conclusions about those values themselves.
It can be used to summarise or describe any data set, either a population
or a sample.
Perez S, Ruizb M. Descriptive statistics .Allergol Immunopathol.2009;37(6):314–320
7. BASIC CONSIDERATIONS
Inferential statistics refers to the set of techniques used to draw conclusions
about the population by analysing the sample data.
It is the process of making generalisations about the population from a
representative sample of data.
Before analyzing any dataset, one should be familiar with different types of
variables.
Variable: any quantity that varies; any attribute, phenomenon, or event that
can have different values.
Perez S, Ruizb M. Descriptive statistics .Allergol Immunopathol.2009;37(6):314–320
Porta M, Last JM, Greenland S. A Dictionary of Epidemiology. 5th ed. Oxford: Oxford University Press (OUP); 2008.
8. BASIC CONSIDERATIONS
Types of variable:
• QUANTITATIVE VARIABLE
• QUALITATIVE VARIABLE
Quantitative Variables
Quantitative variables are numerical scales that measure the amount of
something.
Examples: the height and weight of preschool children, and the age of patients
seen in a dental clinic.
Perez S, Ruizb M. Descriptive statistics .Allergol Immunopathol.2009;37(6):314–320
9. BASIC CONSIDERATIONS
There are two types of numerical scales:
continuous and discrete.
Continuous numerical variables can take any value between two
points (e.g., weight, length), so they can take decimal values.
Discrete numerical variables take values from discrete scales, i.e. integer
values (e.g., number of patients).
Perez S, Ruizb M. Descriptive statistics .Allergol Immunopathol.2009;37(6):314–320
10. BASIC CONSIDERATIONS
Qualitative variables/Categorical variables
Some characteristics cannot be measured but can only be
categorized.
For example, when an ill person is given a medical diagnosis, a
person is designated as belonging to an ethnic group, or a person,
place, or object is said to possess or not to possess some
characteristic of interest.
In such cases measuring consists of categorizing.
Perez S, Ruizb M. Descriptive statistics .Allergol Immunopathol.2009;37(6):314–320
11. Measurement Scales
Scale: a device or system for measuring equal portions.
The Nominal Scale
The lowest measurement scale is the nominal scale.
As the name implies, it consists of “naming” observations or classifying
them into various mutually exclusive and collectively exhaustive
categories.
Examples include such dichotomies as male–female, adults–children,
those with periodontal disease–those with dental caries, etc.
BASIC CONSIDERATIONS
Daniel WW, Cross CL. Biostatistics: basic concepts and methodology for the Health Sciences. 10th ed . New Delhi : Wiley; 2014.
Porta M, Last JM, Greenland S. A Dictionary of Epidemiology. 5th ed. Oxford: Oxford University Press (OUP); 2008.
12. The Ordinal Scale
Whenever observations are not only different from category to category
but can be ranked according to some criterion, they are said to be
measured on an ordinal scale.
Individuals may be classified according to socioeconomic status as
low, medium, or high.
The intelligence of children may be above average, average, or below
average.
Oral hygiene or prognosis may be rated as poor, fair, good, etc.
BASIC CONSIDERATIONS
Daniel WW, Cross CL. Biostatistics: basic concepts and methodology for the Health Sciences. 10th ed . New Delhi : Wiley; 2014.
13. The Interval Scale
The interval scale is more sophisticated than the nominal or ordinal scale in that
not only is it possible to order measurements, but the
distance between any two measurements is also known.
The difference between a measurement of 20 and a measurement of 30 is
equal to the difference between measurements of 30 and 40.
The ability to do this implies the use of a unit distance and a zero point,
both of which are arbitrary.
BASIC CONSIDERATIONS
Daniel WW, Cross CL. Biostatistics: basic concepts and methodology for the Health Sciences. 10th ed . New Delhi : Wiley; 2014.
14. The selected zero point is not necessarily a true zero.
The best example of an interval scale is provided by the way in which
temperature is usually measured (degrees Fahrenheit or Celsius).
The Ratio Scale
The highest level of measurement is the ratio scale. This scale is
characterized by the fact that equality of ratios as well as equality of
intervals may be determined.
Fundamental to the ratio scale is a true zero point.
The measurement of such familiar traits as height, weight, and length
makes use of the ratio scale.
BASIC CONSIDERATIONS
Daniel WW, Cross CL. Biostatistics: basic concepts and methodology for the Health Sciences. 10th ed . New Delhi : Wiley; 2014.
15. MEASURES OF FREQUENCY
Absolute Frequency
Is the number of times a particular value occurs in the data.
Relative Frequency
Is the number of times a particular value occurs in the data (absolute
frequency) relative to the total number of values for that variable.
The relative frequency may be expressed in proportions, and
percentages.
Kaur P, Stoltzfus J, Yellapu V. Descriptive statistics.Int J Acad Med 2018;4:60-63
16. CLASS INTERVAL    ABSOLUTE FREQUENCY    RELATIVE FREQUENCY
30-39                 11                    .0582
40-49                 46                    .2434
50-59                 70                    .3704
60-69                 45                    .2381
70-79                 16                    .0847
80-89                 1                     .0053
TOTAL                 189                   1.0001
Frequency Distributions of the Ages of Subjects
Measures of Frequency
Kaur P, Stoltzfus J, Yellapu V. Descriptive statistics.Int J Acad Med 2018;4:60-63
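The relative frequencies in the age table can be reproduced from the absolute frequencies with a short Python sketch. The variable names are illustrative assumptions; the counts are those shown in the table.

```python
# Absolute frequencies of subject ages, by class interval (from the table above).
age_counts = {
    "30-39": 11,
    "40-49": 46,
    "50-59": 70,
    "60-69": 45,
    "70-79": 16,
    "80-89": 1,
}

total = sum(age_counts.values())

# Relative frequency = absolute frequency / total number of values.
relative = {interval: count / total for interval, count in age_counts.items()}

for interval, count in age_counts.items():
    print(f"{interval}  {count:>3}  {relative[interval]:.4f}")
print(f"TOTAL  {total}  {sum(relative.values()):.4f}")
```

The unrounded relative frequencies sum to exactly 1; the total of 1.0001 in the table arises because each entry was rounded to four decimals before summing.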
17. Rate
A rate measures the occurrence of some particular event
(development of disease or the occurrence of death) in a population during a
given time period.
It is a statement of the risk of developing a condition.
Ratio
Another measure of disease frequency is a ratio.
It expresses a relation in size between two random quantities.
The numerator is not a component of the denominator.
The ratio of white blood cells relative to red cells is 1:600 or 1/600,
meaning that for each white cell, there are 600 red cells
Measures of Frequency
Park K. Park's textbook of Preventive and social medicine. 25th ed. India: Bhanot Publishers; 2017.
18. Proportion
A proportion is a ratio which indicates the relation in magnitude
of a part to the whole. The numerator is always included in the
denominator.
A percentage is another way of expressing a proportion, as a fraction of 100.
The percentages across all categories of a dataset should always add up to 100%.
For example, in a total of thirty participants, where 2 experience adverse
drug effects, 2/30 ≈ 0.067, so 0.067 × 100 = 6.7% of participants experience adverse
effects.
Measures of Frequency
Park K. Park's textbook of Preventive and social medicine. 25th ed. India: Bhanot Publishers; 2017.
19. Measures of frequency are often expressed visually in the form of
tables, histograms (for quantitative variables), or bar graphs (for
qualitative variables) to make the information more easily interpretable.
[Figures: histogram; bar diagram]
Measures of Frequency
Perez S, Ruizb M. Descriptive statistics .Allergol Immunopathol.2009;37(6):314–320
Park K. Park's textbook of Preventive and social medicine. 25th ed. India: Bhanot Publishers; 2017.
20. MEASURES OF CENTRAL TENDENCY
When given a set of raw data, one of the most useful ways of
summarising that data is to find an average of that set of data.
An average is a measure of the centre of the data set.
There are three common ways of describing the centre of a set of
numbers:
-The mean
-The median
-The mode
Pagano M,Gauvreau K. Principles Of Biostatistics. 2nd ed. New Delhi:Ceneage Learning India Pvt Ltd; 2000.
Park K. Park's textbook of Preventive and social medicine. 25th ed. India: Bhanot Publishers; 2017.
21. Arithmetic Mean
The most familiar measure of central tendency is the arithmetic mean.
It is the descriptive measure most people have in mind when they
speak of the “average.”
To obtain the mean, the individual observations are first added
together, and then divided by the number of observations
Pagano M,Gauvreau K. Principles Of Biostatistics. 2nd ed. New Delhi:Ceneage Learning India Pvt Ltd; 2000.
Park K. Park's textbook of Preventive and social medicine. 25th ed. India: Bhanot Publishers; 2017.
Measures Of Central Tendency
22. Properties of the Mean
Uniqueness. For a given set of data there is one and only one
arithmetic mean.
Simplicity. The arithmetic mean is easily understood and easy to
compute.
Since each and every value in a set of data enters into the
computation of the mean, it is affected by each value. Extreme values can
distort it so much that it becomes undesirable as a measure of central tendency.
Pagano M,Gauvreau K. Principles Of Biostatistics. 2nd ed. New Delhi:Ceneage Learning India Pvt Ltd; 2000.
Park K. Park's textbook of Preventive and social medicine. 25th ed. India: Bhanot Publishers; 2017.
Measures Of Central Tendency
23. The Sample Mean
We use x̄ to designate the sample mean and n to indicate the number of
values in the sample:
x̄ = (x1 + x2 + … + xn) / n
The Population Mean
In a finite population of values, represented by x1, x2, …, xN, N is the number of
values in the population, and we use the Greek letter µ to stand for the
population mean:
µ = (x1 + x2 + … + xN) / N
Pagano M,Gauvreau K. Principles Of Biostatistics. 2nd ed. New Delhi:Ceneage Learning India Pvt Ltd; 2000.
Measures Of Central Tendency
24. SUBJECT    FEV1 (liters)
1              2.30
2              2.15
3              3.50
4              2.60
5              2.75
6              2.82
7              4.05
8              2.25
9              2.68
10             3.00
11             4.02
12             2.85
13             3.38
Forced expiratory volumes
Pagano M,Gauvreau K. Principles Of Biostatistics. 2nd ed. New Delhi:Ceneage Learning India Pvt Ltd; 2000.
Measures Of Central Tendency
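As an illustration of the arithmetic mean, the following Python sketch averages the thirteen FEV1 values listed in the table, using the standard library's `statistics.mean`. The variable names are assumptions made for this example.

```python
from statistics import mean

# FEV1 (liters) for the 13 subjects in the table above.
fev1_liters = [2.30, 2.15, 3.50, 2.60, 2.75, 2.82, 4.05,
               2.25, 2.68, 3.00, 4.02, 2.85, 3.38]

# Add the observations together and divide by their number.
manual_mean = sum(fev1_liters) / len(fev1_liters)

# The standard library gives the same result.
mean_fev1 = mean(fev1_liters)

print(f"n = {len(fev1_liters)}, mean FEV1 = {mean_fev1:.2f} liters")
```

Replacing the largest value (4.05) with a much larger one shifts the result noticeably, which illustrates the mean's sensitivity to extreme values.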
25. The mean can be used as a summary measure for both discrete and
continuous measurements.
In general, however, it is not appropriate for either nominal or ordinal data.
One exception to this rule applies when we have dichotomous data and the
two possible outcomes are represented by the values 0 and 1.
In this situation, the mean of the observations is equal to the proportion of 1s
in the data set.
E.g., sex of the 13 asthmatic subjects (1 = male, 0 = female): [0 + 1 + 1 + 0 + 0 + 1 + 1 + 1 + 0 + 1 + 1 + 1 + 0]/13
= 0.615 (therefore, 61.5% of the study subjects are males)
Pagano M,Gauvreau K. Principles Of Biostatistics. 2nd ed. New Delhi:Ceneage Learning India Pvt Ltd; 2000.
Measures Of Central Tendency
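A small Python sketch shows why the mean of 0/1 data equals the proportion of 1s. The coding (1 = male, 0 = female) is assumed from the slide's conclusion about the share of males; the variable names are illustrative.

```python
# Dichotomous observations from the example: 1 = male, 0 = female (assumed coding).
is_male = [0, 1, 1, 0, 0, 1, 1, 1, 0, 1, 1, 1, 0]

# The sum of a 0/1 variable counts the 1s, so the mean is the proportion of 1s.
proportion_male = sum(is_male) / len(is_male)

print(f"{sum(is_male)} of {len(is_male)} subjects are male: {proportion_male:.3f}")
```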
26. Grouped Data
Subject Duration
(years)
1 12
2 11
3 12
4 6
5 11
6 11
7 8
8 5
9 5
10 5
26
Duration of transfusion in the sickle cell disease
Pagano M,Gauvreau K. Principles Of Biostatistics. 2nd ed. New Delhi:Ceneage
Learning India Pvt Ltd; 2000.
Measures Of Central Tendency
27. To compute the mean of a set of data arranged as a frequency
distribution, we begin by assuming that all the values that fall into a
particular interval are equal to the midpoint of that interval (grouped mean).
To find the mean of the grouped data, we first sum the measurements by
multiplying the midpoint of each interval by the corresponding frequency
and adding these products; we then divide by the total number of values:
x̄ = (m1 f1 + m2 f2 + … + mk fk) / (f1 + f2 + … + fk)
k is the number of intervals in the table.
mi is the midpoint of the ith interval.
fi is the frequency associated with the ith interval.
Pagano M,Gauvreau K. Principles Of Biostatistics. 2nd ed. New Delhi:Ceneage Learning India Pvt Ltd; 2000.
Measures Of Central Tendency
28. Absolute frequencies of serum cholesterol levels
Cholesterol level (mg/100 ml)    Number of men
80-119                           13
120-159                          150
160-199                          442
200-239                          299
240-279                          115
280-319                          34
320-359                          9
360-399                          5
Total                            1067
Pagano M,Gauvreau K. Principles Of Biostatistics. 2nd ed. New Delhi:Ceneage
Learning India Pvt Ltd; 2000.
Measures Of Central Tendency
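The grouped mean of the cholesterol table can be sketched in Python as follows. One assumption is made here: each interval's midpoint is taken as the average of its printed lower and upper limits (for example, 99.5 for 80-119). The slides do not state how midpoints are defined, and the variable names are illustrative.

```python
# (lower limit, upper limit, number of men) for each cholesterol interval in the table.
cholesterol_table = [
    (80, 119, 13),
    (120, 159, 150),
    (160, 199, 442),
    (200, 239, 299),
    (240, 279, 115),
    (280, 319, 34),
    (320, 359, 9),
    (360, 399, 5),
]

# Assumption: midpoint = average of the printed limits of the interval.
weighted_sum = sum((low + high) / 2 * freq for low, high, freq in cholesterol_table)
total_men = sum(freq for _, _, freq in cholesterol_table)

# Grouped mean = sum of (midpoint * frequency) / sum of frequencies.
grouped_mean = weighted_sum / total_men

print(f"n = {total_men}, grouped mean = {grouped_mean:.1f} mg/100 ml")
```

Because every value in an interval is treated as if it sat at the midpoint, the result is an approximation to the mean of the raw data.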
29. Geometric mean
The geometric mean is a type of average, usually used for
growth rates, like population growth or interest rates.
While the arithmetic mean adds items, the geometric
mean multiplies items.
Also, you can only get the geometric mean for positive numbers.
Pagano M,Gauvreau K. Principles Of Biostatistics. 2nd ed. New Delhi:Ceneage Learning India Pvt Ltd; 2000.
Measures Of Central Tendency
30. Suppose a rectangle has dimensions of 9 m × 4 m.
What is the side length of a square with equivalent area?
Square side length = (9 × 4)^(1/2) = 6 m
[Figure: a 9 m × 4 m rectangle and a 6 m × 6 m square of equal area]
Pagano M,Gauvreau K. Principles Of Biostatistics. 2nd ed. New Delhi:Ceneage Learning India Pvt Ltd; 2000.
31. Let us say a stipend increases in value by 10% at the end of the first year, 20%
the second year, and 30% the third year. What is the average rate of increase per
year?
Year 0: 1 lakh (× 1.1)
Year 1: 1.1 lakh (× 1.2)
Year 2: 1.32 lakh (× 1.3)
Year 3: 1.716 lakh
(1.1) × (1.2) × (1.3) = 1.716
G.M. = $(1.716)^{1/3}$ = 1.1972, i.e. an average increase of about 19.72% per year.
Pagano M,Gauvreau K. Principles Of Biostatistics. 2nd ed. New Delhi:Ceneage Learning India Pvt Ltd; 2000.
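Both geometric-mean examples (the square with the same area as the rectangle, and the stipend growth) can be reproduced with a minimal sketch. The helper name `geometric_mean` is ours, and it rejects non-positive inputs because the geometric mean is defined here only for positive numbers.

```python
import math


def geometric_mean(values):
    """Return the n-th root of the product of n positive values."""
    if any(v <= 0 for v in values):
        raise ValueError("geometric mean needs positive values")
    return math.prod(values) ** (1 / len(values))


# Stipend example: yearly growth factors of 10%, 20% and 30%.
gm = geometric_mean([1.1, 1.2, 1.3])
# Rectangle example: side of the square with the same area as 9 m x 4 m.
side = geometric_mean([9, 4])

print(round(gm, 4), side)
```

Subtracting 1 from the mean growth factor gives the average rate of increase per year.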
32. Harmonic mean
The Harmonic Mean (HM) is defined as the reciprocal of the arithmetic
mean of the reciprocals of the given data values.
It is based on all the observations, and it is rigidly defined.
The harmonic mean gives less weight to large values and more
weight to small values.
It is applied in the case of times and average rates.
Pagano M,Gauvreau K. Principles Of Biostatistics. 2nd ed. New Delhi:Ceneage Learning India Pvt Ltd; 2000.
33. Since the harmonic mean is the reciprocal of the arithmetic mean of the reciprocals, the
formula for the harmonic mean "HM" is as follows:
If x1, x2, x3, …, xn are the n individual items, then
Harmonic Mean, HM = n / [(1/x1)+(1/x2)+(1/x3)+…+(1/xn)]
E.g.:
We travel 10 km at 60 km/hr, then another 10 km at 20 km/hr. What is our
average speed?
H.M. = 2/[(1/60)+(1/20)] = 30 km/hr
Pagano M,Gauvreau K. Principles Of Biostatistics. 2nd ed. New Delhi:Ceneage Learning India Pvt Ltd; 2000.
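The speed example can be verified in two ways: with the standard-library `statistics.harmonic_mean`, and directly from the formula n / Σ(1/x). This is only a sketch; the variable names are illustrative.

```python
from statistics import harmonic_mean

# Two legs of equal distance travelled at 60 km/hr and 20 km/hr.
speeds = [60, 20]

# Library routine.
hm = harmonic_mean(speeds)

# Same quantity from the formula n / sum of reciprocals.
manual = len(speeds) / sum(1 / s for s in speeds)

print(hm, manual)  # both are 30 km/hr up to rounding
```

The harmonic mean is the right average here because the two legs cover equal distances, not equal times.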
34. Median
The median is an average of a different kind, which does not depend
upon the total and number of items.
To obtain the median, the data is first arranged in an ascending or
descending order of magnitude, and then the value of the middle
observation is located, which is called the median.
Positional average of a data set.
The median of a finite set of values is that value which divides the set
into two equal parts
If the number of values is odd, the median will be the middle value
when all values have been arranged in order of magnitude.
Pagano M,Gauvreau K. Principles Of Biostatistics. 2nd ed. New Delhi:Ceneage Learning India Pvt Ltd; 2000.
Park K. Park's textbook of Preventive and social medicine. 25th ed. India: Bhanot Publishers; 2017.
35. When the number of values is even, there is no single middle value.
Instead there are two middle values.
In this case the median is taken to be the mean of these two middle
values, when all values have been arranged in the order of their
magnitudes.
Therefore, if a set of data contains a total of ‘n’ observations where ‘n’ is
odd, the median is the middle value, or the
[(n + 1)/2]th largest measurement;
If n is even, the median is usually taken to be the average of the two
middlemost values, the (n/2)th and [(n/2) + 1]th observations
Pagano M,Gauvreau K. Principles Of Biostatistics. 2nd ed. New Delhi:Ceneage Learning India Pvt Ltd; 2000.
36. Properties of the Median
Uniqueness. As is true with the mean, there is only one median for a
given set of data.
Simplicity. The median is easy to calculate.
It is not as drastically affected by extreme values as is the mean,which
makes it the best measure of central tendency when the data is
skewed.
Pagano M,Gauvreau K. Principles Of Biostatistics. 2nd ed. New Delhi:Ceneage Learning India Pvt Ltd; 2000.
37. For example, consider the following set of FEV1 measurements (in liters):
2.15, 2.25, 2.30, 2.60, 2.68, 2.75, 2.82, 2.85, 3.00, 3.38, 3.50, 4.02, 4.05.
Since there is an odd number of observations in the list, the median is the
(13 + 1)/2 = 7th observation, or 2.82 liters. Six of the measurements are less than or
equal to 2.82 liters, and six are greater than or equal to 2.82 liters.
Now suppose the value for subject 12 had been recorded as 40.2 rather than 4.02:
2.15, 2.25, 2.30, 2.60, 2.68, 2.75, 2.82, 2.85, 3.00, 3.38, 3.50, 4.05, 40.2.
The median FEV1 would remain 2.82 liters; unlike the mean, the median is much less
sensitive to unusual data points.
Pagano M,Gauvreau K. Principles Of Biostatistics. 2nd ed. New Delhi:Ceneage Learning India Pvt Ltd; 2000.
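The robustness of the median in the FEV1 example can be seen by computing both averages before and after the recording error. A minimal sketch using the standard-library `statistics` module; the list names are illustrative.

```python
from statistics import mean, median

# FEV1 values (liters) as listed in the example.
fev1 = [2.15, 2.25, 2.30, 2.60, 2.68, 2.75, 2.82,
        2.85, 3.00, 3.38, 3.50, 4.02, 4.05]

# Same data with 4.02 mis-recorded as 40.2.
fev1_error = [40.2 if x == 4.02 else x for x in fev1]

print(median(fev1), median(fev1_error))                  # median is unchanged
print(round(mean(fev1), 2), round(mean(fev1_error), 2))  # mean is pulled upward
```

The single wrong value nearly doubles the mean while leaving the middle observation where it was.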
38. Mode
The mode is the most commonly occurring value in a distribution of data.
It is the most frequent item or the most "fashionable" value in a series of
observations.
The mode is located from the frequency distribution table, taking the value
of the variable with the maximum frequency.
When the mode is ill defined, it can be estimated using the empirical relation:
Mode = 3 Median − 2 Mean
If all the values are different there is no mode;
on the other hand, a set of values may have more than one mode.
Pagano M,Gauvreau K. Principles Of Biostatistics. 2nd ed. New Delhi:Ceneage Learning India Pvt Ltd; 2000.
Park K. Park's textbook of Preventive and social medicine. 25th ed. India: Bhanot Publishers; 2017.
39. Example;
Let us consider a laboratory with 10 employees whose ages are
20, 21, 20, 20, 34, 22, 24, 27, 27, 27
We could say that these data have two modes, 20 and 27.
The sample consisting of the values 10, 21, 33, 53, 54 has no mode
When the value with the greatest influence in a series (the one occurring most often) is required,
the mode may be computed.
Pagano M,Gauvreau K. Principles Of Biostatistics. 2nd ed. New Delhi:Ceneage Learning India Pvt Ltd; 2000.
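The two data sets in this example can be checked with the standard-library `statistics.multimode`, which returns every value tied for the highest frequency. Note one difference in convention: when all values are distinct, `multimode` returns all of them, whereas the slide describes that case as having no mode.

```python
from statistics import multimode

ages = [20, 21, 20, 20, 34, 22, 24, 27, 27, 27]
modes = multimode(ages)  # values tied for the highest count, in first-seen order

distinct = [10, 21, 33, 53, 54]
tied = multimode(distinct)  # every value occurs once, so all five are returned

print(modes)
print(len(tied) == len(distinct))  # True signals "no mode" in the slide's sense
```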
40. The mode may be used also for describing qualitative data.
For example, suppose the patients seen in a mental health clinic during a
given year received one of the following diagnoses: mental retardation,
organic brain syndrome, psychosis, neurosis, and personality disorder.
The diagnosis occurring most frequently in the group of patients would be
called the modal diagnosis.
Pagano M,Gauvreau K. Principles Of Biostatistics. 2nd ed. New Delhi:Ceneage Learning India Pvt Ltd; 2000.
41. The advantages of mode are that it is easy to understand, and is not
affected by the extreme items.
The disadvantages are that the exact location is often uncertain and is
often not clearly defined .
Therefore, mode is not often used in biological or medical statistics.
Pagano M,Gauvreau K. Principles Of Biostatistics. 2nd ed. New Delhi:Ceneage Learning India Pvt Ltd; 2000.
42. Rationale for using Mean, Median and Mode
The best measure of central tendency for a given set of data often
depends on the way in which the values are distributed.
If they are symmetric and unimodal then the mean, the median and the
mode should all be roughly the same.
Pagano M,Gauvreau K. Principles Of Biostatistics. 2nd ed. New Delhi:Ceneage Learning India Pvt Ltd; 2000.
43. If the distribution of values is symmetric but bimodal, so that the
corresponding frequency polygon would have two peaks, then the mean
and the median should again be approximately the same.
This common value could lie between the two peaks, and hence be a
measurement that is extremely unlikely to occur.
Here it might be better to report two modes rather than the mean or the
median
Pagano M,Gauvreau K. Principles Of Biostatistics. 2nd ed. New Delhi:Ceneage Learning India Pvt Ltd; 2000.
44. When the data are not symmetric, the median is often the best measure of
central tendency.
Because the mean is sensitive to extreme observations, it is pulled in the
direction of the outlying data values.
Pagano M,Gauvreau K. Principles Of Biostatistics. 2nd ed. New Delhi:Ceneage Learning India Pvt Ltd; 2000.
45. SKEWNESS
Skewness is a measure of the asymmetry of a variable's distribution
about its mean.
If a distribution is symmetric, the left half of its graph (histogram or
frequency polygon) will be a mirror image of its right half.
Daniel WW, Cross CL. Biostatistics: basic concepts and methodology for the Health Sciences. 10th ed . New Delhi : Wiley; 2014.
46. When the left half and right half of the graph of a distribution are not
mirror images of each other, the distribution is asymmetric.
If the graph (histogram or frequency polygon) of a distribution is
asymmetric, the distribution is said to be skewed .
Daniel WW, Cross CL. Biostatistics: basic concepts and methodology for the Health Sciences. 10th ed . New Delhi : Wiley; 2014.
47. If a distribution is not symmetric because its graph extends further to
the right than to the left, that is, if it has a long tail to the right,
it is said to be positively skewed. Its mean is greater than its mode.
Daniel WW, Cross CL. Biostatistics: basic concepts and methodology for the Health Sciences. 10th ed . New Delhi : Wiley; 2014.
48. If a distribution is not symmetric because its graph extends further to
the left than to the right, that is, if it has a long tail to the left,
it is said to be negatively skewed. Its mean is less than its mode.
Daniel WW, Cross CL. Biostatistics: basic concepts and methodology for the Health Sciences. 10th ed . New Delhi : Wiley; 2014.
49. MEASURES OF DISPERSION
A measure of dispersion conveys information regarding the amount of
variability present in a set of data.
If all the values are the same, there is no dispersion; if they are not all
the same, dispersion is present in the data.
The amount of dispersion may be small when the values, though
different, are close together.
Other terms used synonymously with dispersion include variation,
spread, and scatter.
Nicholas J. Introduction to descriptive statistics. Sydney: Mathematics Learning Centre, University of Sydney; 1990.
Park K. Park's textbook of Preventive and social medicine. 25th ed. India: Bhanot Publishers; 2017.
50. Measures of dispersion include:
Range
Mean Deviation
Variance
Standard deviation
Coefficient of variation.
Measures Of Dispersion
Nicholas J. Introduction to descriptive statistics. Sydney: Mathematics Learning Centre, University of Sydney; 1990.
Park K. Park's textbook of Preventive and social medicine. 25th ed. India: Bhanot Publishers; 2017.
51. Range
The range is by far the simplest measure of dispersion
The range is the difference between the largest and smallest value in a
set of observations. If we denote the range by R, the largest value by
XL, and the smallest value by XS,
R=XL-XS
If we have grouped data, the range is taken as the difference between
the mid-points of the extreme categories.
Nicholas J. Introduction to descriptive statistics. Sydney: Mathematics Learning Centre, University of Sydney; 1990.
Park K. Park's textbook of Preventive and social medicine. 25th ed. India: Bhanot Publishers; 2017.
52. Ordinarily in medical practice, the normal range covers the
observations falling in 95% confidence limits.
The main advantage in using the range is the simplicity of its
computation.
As a measure of dispersion the range is severely limited. Since it
depends only on two observations, the lowest and the highest, we will
get misleading idea of dispersion if these values are outliers.
Nicholas J. Introduction to descriptive statistics. Sydney: Mathematics Learning Centre, University of Sydney; 1990.
Park K. Park's textbook of Preventive and social medicine. 25th ed. India: Bhanot Publishers; 2017.
53. E.g., consider the marks of a group of thirty students on two tests.
On test A, the range is 70 − 45 = 25.
On test B, the range is 72−40 = 32, but apart from the outliers, the
distribution of marks on test B is clearly less spread out than that of A.
Nicholas J. Introduction to descriptive statistics. Sydney: Mathematics Learning Centre, University of Sydney; 1990.
54. Mean Deviation
If mean blood pressure of a large representative series is taken, some
observations are found above the mean or plus and others are below
the mean or minus.
On summing the differences or deviations from the mean in any
distribution, the plus and minus differences will be equal in total and
the net balance will be zero.
Nicholas J. Introduction to descriptive statistics. Sydney: Mathematics Learning Centre, University of Sydney; 1990.
55. The mean deviation is therefore computed from the absolute deviations: $\text{M.D.} = \dfrac{\sum |x - \bar{x}|}{n}$
The vertical parallel lines (modulus) indicate that deviations from the mean
are taken as positive, ignoring any negative sign.
Though simple and easy, mean deviation is not used in statistical
analyses, being of less mathematical value, particularly in drawing
inferences.
Nicholas J. Introduction to descriptive statistics. Sydney: Mathematics Learning Centre, University of Sydney; 1990.
56. Example: The diastolic blood pressure of
10 individuals was as follows:
83, 75, 81, 79, 71, 95, 75, 77, 84, 90.
What is the mean deviation?
B.P.   Mean   Deviation
83     81     2
75     81     -6
81     81     0
79     81     -2
71     81     -10
95     81     14
75     81     -6
77     81     -4
84     81     3
90     81     9
Mean = 810/10 = 81
M.D. = 56/10 = 5.6 (sum of the absolute deviations divided by n)
Nicholas J. Introduction to descriptive statistics. Sydney: Mathematics Learning Centre, University of Sydney; 1990.
Park K. Park's textbook of Preventive and social medicine. 25th ed. India: Bhanot Publishers; 2017.
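The blood pressure example can be reproduced in a few lines. The sketch below also shows why the absolute value is needed: the signed deviations cancel out exactly. Variable names are illustrative.

```python
# Diastolic blood pressure of the 10 individuals in the example.
bp = [83, 75, 81, 79, 71, 95, 75, 77, 84, 90]

mean_bp = sum(bp) / len(bp)

# Signed deviations from the mean sum to zero.
deviations = [x - mean_bp for x in bp]

# Mean deviation: average of the absolute deviations.
mean_deviation = sum(abs(d) for d in deviations) / len(bp)

print(mean_bp, sum(deviations), mean_deviation)
```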
57. Variance
The variance quantifies the amount of variability, or spread, around the
mean of the measurements.
To accomplish this, we might simply attempt to calculate the average
distance of the individual observations from the mean, $\bar{x}$.
However, the deviations from the mean always sum to zero, so each deviation is squared before averaging.
The variance can therefore be described as the
average squared deviation of individual values from the mean of that set.
Nicholas J. Introduction to descriptive statistics. Sydney: Mathematics Learning Centre, University of Sydney; 1990.
Park K. Park's textbook of Preventive and social medicine. 25th ed. India: Bhanot Publishers; 2017.
58. In computing the variance of a sample of values the square of each
difference is used to ensure a positive numerator and hence a much
more valuable measure of dispersion.
When the values of a set of observations lie close to their
mean, the dispersion is less than when they are scattered over a wide
range.
Nicholas J. Introduction to descriptive statistics. Sydney: Mathematics Learning Centre, University of Sydney; 1990.
Park K. Park's textbook of Preventive and social medicine. 25th ed. India: Bhanot Publishers; 2017.
59. Standard Deviation (R.M.S deviation)
The variance represents squared units and, therefore, is not an
appropriate measure of dispersion when we wish to express this
concept in terms of the original units.
To obtain a measure of dispersion in original units, we merely take the
square root of the variance. The result is called the standard deviation.
It is the most frequently used measure of dispersion.
Nicholas J. Introduction to descriptive statistics. Sydney: Mathematics Learning Centre, University of Sydney; 1990.
Park K. Park's textbook of Preventive and social medicine. 25th ed. India: Bhanot Publishers; 2017.
60. Uses of standard deviation
It summarizes the deviations of a large distribution from mean in one
figure used as a unit of variation.
Indicates whether the variation of difference of an individual from the
mean is by chance, or due to some special reasons.
It also helps in finding the suitable size of sample.
Nicholas J. Introduction to descriptive statistics. Sydney: Mathematics Learning Centre, University of Sydney; 1990.
61. B.P.   Deviation from mean   Squared deviation
83     2      4
75     -6     36
81     0      0
79     -2     4
71     -10    100
95     14     196
75     -6     36
77     -4     16
84     3      9
90     9      81
Total         482
n = 10
Mean = 81
Variance ($s^2$) = 482/9 = 53.56
S.D. = $(\text{variance})^{1/2}$ = $(482/9)^{1/2}$ = 7.32
Nicholas J. Introduction to descriptive statistics. Sydney: Mathematics Learning Centre, University of Sydney; 1990.
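The worked variance and standard deviation can be checked with the standard-library `statistics` module, whose `variance` and `stdev` functions use the n − 1 divisor, as in the table. A minimal sketch:

```python
from statistics import stdev, variance

# Same diastolic blood pressure readings as in the table.
bp = [83, 75, 81, 79, 71, 95, 75, 77, 84, 90]

s2 = variance(bp)  # sample variance: sum of squared deviations / (n - 1)
sd = stdev(bp)     # square root of the sample variance

print(round(s2, 2), round(sd, 2))
```

The standard deviation is in the same units as the original readings, whereas the variance is in squared units.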
62. Coefficient of Variation
When one desires to compare the dispersion in two sets of data,
comparing the two standard deviations may lead to fallacious results. It
may be that the two variables involved are measured in different units.
Needed in situations like these is a measure of relative variation rather
than absolute variation.
Coefficient of variation (CV) is used to compare the variability of one
character in two different groups having different magnitude of values
or two characters in the same group by expressing in percentage.
Nicholas J. Introduction to descriptive statistics. Sydney: Mathematics Learning Centre, University of Sydney; 1990.
63. The coefficient of variation is calculated from standard deviation and
mean of the characteristic. The ratio of SD and mean is found in
percentage. Thus SD expressed as percentage of mean is the
coefficient of variation.
Example:
In two series, adults aged 21 years and children 3 months old, the
following values were obtained for height. Which series shows
greater variation?
Nicholas J. Introduction to descriptive statistics. Sydney: Mathematics Learning Centre, University of Sydney; 1990.
64. Persons    Mean height (cm)   Standard deviation
Adults     160                10
Children   60                 5
CV (adults) = (10/160) × 100 = 6.25%
CV (children) = (5/60) × 100 = 8.33%
Thus, we find that heights in children show greater variation
than in adults, the ratio being 8.33%/6.25%, or about 1.3:1.0.
Nicholas J. Introduction to descriptive statistics. Sydney: Mathematics Learning Centre, University of Sydney; 1990.
Park K. Park's textbook of Preventive and social medicine. 25th ed. India: Bhanot Publishers; 2017.
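The comparison of adults and children reduces to one line of arithmetic per group. The helper below is a sketch; its name and signature are illustrative.

```python
def coefficient_of_variation(sd, mean):
    """Standard deviation expressed as a percentage of the mean."""
    return sd / mean * 100


cv_adults = coefficient_of_variation(10, 160)
cv_children = coefficient_of_variation(5, 60)

print(round(cv_adults, 2), round(cv_children, 2))
print(round(cv_children / cv_adults, 1))  # relative variation, children vs adults
```

Although the children's standard deviation is smaller in absolute terms, it is larger relative to their mean height.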
65. Degrees of Freedom
The phrase “degrees of freedom” was introduced by Sir Ronald Fisher in
1922 without mentioning its purpose.
Everett (2002) explains degrees of freedom as ;
‘Essentially the term means the number of independent units of information in
a sample relevant to the estimation of a parameter or the calculation of a
statistic.’
Eisenhauer JG. Degrees of Freedom. Teaching Statistics. 2008;30(3):75–8.
66. A set of independent results x1, x2, ..., xn has n degrees of freedom, but n − 1
if the mean $\bar{x}$ is known, since any one of the xi is then determined by the
sum of the others.
E.g., take a sample of n = 5 for which we wish to calculate the sample variance, and suppose we find $\bar{x}$ = 10.
Since n$\bar{x}$ = 50, the sum of all five observations equals 50; thus four
observations could be freely altered, but once any four of the observations
are fixed, the final observation is determined by default.
Eisenhauer JG. Degrees of Freedom. Teaching Statistics. 2008;30(3):75–8.
67. Consequently there are only n − 1 degrees of freedom (df) for use in
calculating the sample variance; the effective sample size has been
reduced to df = n − 1.
Note that a sample of size n retains n degrees of freedom if the population
mean μ is known, since this does not determine xi for
i = 1, ..., n even if the other (n − 1) values are known.
The concept is of importance in statistical inference since it defines the
effective size of a sample.
Eisenhauer JG. Degrees of Freedom. Teaching Statistics. 2008;30(3):75–8.
68. MEASURES OF LOCATION
PERCENTILES AND QUARTILES
The mean and median are special cases of a family of parameters known
as location parameters.
These descriptive measures are called location parameters because they
can be used to designate certain positions on the horizontal axis when the
distribution of a variable is graphed.
In that sense the so-called location parameters “locate” the distribution on
the horizontal axis.
Nicholas J. Introduction to descriptive statistics. Sydney: Mathematics Learning Centre, University of Sydney; 1990.
69. Centiles or percentiles are values in a series of observations
arranged in ascending order of magnitude which divide the distribution
into 100 equal parts.
Given a set of n observations; x1; x2; ... xn
The pth percentile P is the value of X such that p percent or less of the
observations are less than P and (100 –p) percent or less of the
observations are greater than P.
Nicholas J. Introduction to descriptive statistics. Sydney: Mathematics Learning Centre, University of Sydney; 1990.
Measures Of Location
70. Thus, the median is 50th centile. The 50th percentile will have 50%
observations on either side.
Accordingly, 10th percentile should have 10% observations to the left
and 90% to the right.
E.g., if an age of 3½ years forms the 10th percentile, it means 10% of the
entire population is below 3½ years of age and 90% is above that age.
Nicholas J. Introduction to descriptive statistics. Sydney: Mathematics Learning Centre, University of Sydney; 1990.
71. Quartiles: They are 3 different points located on the entire range of a
variable such as height—Q1, Q2 and Q3.
Q1 or lower quartile will have 25% observations of heights falling on its
left and 75% on its right;
Nicholas J. Introduction to descriptive statistics. Sydney: Mathematics Learning Centre, University of Sydney; 1990.
72. Q2 or median will have 50% of observations on either side.
Q3 will have 75% observations on its left and 25% on its right.
Nicholas J. Introduction to descriptive statistics. Sydney: Mathematics Learning Centre, University of Sydney; 1990.
73. E.g., suppose we have a small data set of twelve observations:
15 18 19 20 20 20 21 23 23 24 24 25
Nicholas J. Introduction to descriptive statistics. Sydney: Mathematics Learning Centre, University of Sydney; 1990.
74. 15 18 19 20 20 20 21 23 23 24 24 25
Median (Q2) = 20.5
To find the first quartile, we consider the observations less than the
median:
15 18 19 20 20 20
First quartile (Q1) = 19.5
To find the third quartile, we consider the observations greater than the median:
21 23 23 24 24 25
Third quartile (Q3) = 23.5
Nicholas J. Introduction to descriptive statistics. Sydney: Mathematics Learning Centre, University of Sydney; 1990.
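The median-of-halves procedure used in this example can be sketched as below. The function name is illustrative, and the handling of an odd number of observations (dropping the median from both halves) is an assumption chosen to match the worked examples in this deck; other quartile conventions exist and can give slightly different values.

```python
from statistics import median


def quartiles_by_halves(values):
    """Return (Q1, Q2, Q3) using the median of the lower and upper halves."""
    data = sorted(values)
    n = len(data)
    half = n // 2
    lower = data[:half]
    # Assumption: for odd n the middle observation belongs to neither half.
    upper = data[half:] if n % 2 == 0 else data[half + 1:]
    return median(lower), median(data), median(upper)


obs = [15, 18, 19, 20, 20, 20, 21, 23, 23, 24, 24, 25]
q1, q2, q3 = quartiles_by_halves(obs)
print(q1, q2, q3)
```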
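75. To find the observed value corresponding to a given percentile p, compute
k = (n × p)/100.
If k is not an integer, round it up to the next integer; the percentile is then the kth observation in ascending order.
If k is an integer, the percentile is the midpoint between the kth and (k + 1)th observations.
E.g., n = 12 (previous example):
For the 75th percentile, k = (12 × 75)/100 = 9.
Since k = 9 is an integer, the midpoint between the 9th and 10th observations, (23 + 24)/2 = 23.5, corresponds to the
75th percentile.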
Nicholas J. Introduction to descriptive statistics. Sydney: Mathematics Learning Centre, University of Sydney; 1990.
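The rank rule above can be written as a small function. This is a sketch under two assumptions: p is a whole-number percentage strictly between 0 and 100, and an integer k means averaging the kth and (k + 1)th ordered observations, as in the worked example. The function name is illustrative.

```python
import math


def percentile_by_rank(values, p):
    """Percentile via k = n * p / 100 on the ordered observations."""
    if not 0 < p < 100:
        raise ValueError("p must be strictly between 0 and 100")
    data = sorted(values)
    n = len(data)
    if (n * p) % 100 == 0:
        # k is an integer: midpoint of the k-th and (k+1)-th observations.
        k = n * p // 100
        return (data[k - 1] + data[k]) / 2
    # k is not an integer: round up and take that observation.
    k = math.ceil(n * p / 100)
    return data[k - 1]


obs = [15, 18, 19, 20, 20, 20, 21, 23, 23, 24, 24, 25]
p75 = percentile_by_rank(obs, 75)
p50 = percentile_by_rank(obs, 50)
p20 = percentile_by_rank(obs, 20)  # k = 2.4 rounds up to the 3rd observation
print(p75, p50, p20)
```

With this rule the 50th and 75th percentiles agree with the median and third quartile found earlier for the same data.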
76. Interquartile Range
The interquartile range (IQR) is the difference between the third and
first quartiles.
The range provides a crude measure of the variability present in a set of
data. A similar measure that reflects the variability among the middle 50
percent of the observations in a data set is the interquartile range.
Nicholas J. Introduction to descriptive statistics. Sydney: Mathematics Learning Centre, University of Sydney; 1990.
77. A large IQR indicates a large amount of variability among the middle 50
percent of the relevant observations; a
small IQR indicates a small amount of variability among the relevant
observations.
On the graph of the distribution, 50% of the area lies between the first and third
quartiles. This means that 50% of the observations lie between the first
and third quartiles.
Nicholas J. Introduction to descriptive statistics. Sydney: Mathematics Learning Centre, University of Sydney; 1990.
78. MEASURE OF SHAPE
KURTOSIS
Kurtosis is a measure of the degree to which a distribution is “peaked”
or flat in comparison to a normal distribution whose graph is
characterized by a bell-shaped appearance.
Daniel WW, Cross CL. Biostatistics: basic concepts and methodology for the Health Sciences. 10th ed . New Delhi : Wiley; 2014.
79. mesokurtic
A normal, or bell-shaped distribution, is said to be
mesokurtic.
Measure Of Shape
Daniel WW, Cross CL. Biostatistics: basic concepts and methodology for the Health Sciences. 10th ed . New Delhi : Wiley; 2014.
80. Platykurtic
A distribution, in comparison to a normal distribution, may possess an
excessive proportion of observations in its tails, so that its graph exhibits a
flattened appearance. Such a distribution is said to be platykurtic.
Daniel WW, Cross CL. Biostatistics: basic concepts and methodology for the Health Sciences. 10th ed . New Delhi : Wiley; 2014.
81. Leptokurtic
Conversely, a distribution, in comparison to a
normal distribution, may possess a smaller proportion of observations in
its tails, so that its graph exhibits a more peaked appearance. Such a
distribution is said to be leptokurtic.
Daniel WW, Cross CL. Biostatistics: basic concepts and methodology for the Health Sciences. 10th ed . New Delhi : Wiley; 2014.
82. A perfectly mesokurtic distribution has a kurtosis measure of 3 based
on the equation.
Daniel WW, Cross CL. Biostatistics: basic concepts and methodology for the Health Sciences. 10th ed . New Delhi : Wiley; 2014.
83. Most computer algorithms reduce the measure by 3,
so that the kurtosis measure of a:
Mesokurtic distribution is zero
Leptokurtic distribution is positive
Platykurtic distribution is negative
Daniel WW, Cross CL. Biostatistics: basic concepts and methodology for the Health Sciences. 10th ed . New Delhi : Wiley; 2014.
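The slide's kurtosis equation did not survive extraction, so the sketch below uses one common form as an assumption: the fourth central moment divided by the squared second central moment, minus 3. Textbook formulas may add sample-size corrections and give slightly different numbers; the function name is illustrative.

```python
def excess_kurtosis(values):
    """Population-moment excess kurtosis: m4 / m2**2 - 3 (assumed form)."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((x - mean) ** 2 for x in values) / n  # second central moment
    m4 = sum((x - mean) ** 4 for x in values) / n  # fourth central moment
    return m4 / m2 ** 2 - 3


# Evenly spread values give a negative measure under this formula.
flat = [1, 2, 3, 4, 5]
print(round(excess_kurtosis(flat), 2))
```

A value near zero corresponds to the mesokurtic (normal) case, a positive value to the leptokurtic case, and a negative value to the platykurtic case.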
84. BOX-AND-WHISKER PLOTS
The box-plot is another way of representing a data set graphically.
It is constructed using the quartiles, and gives a good indication of the
spread of the data set and its symmetry (or lack of symmetry).
It is a very useful method for comparing two or more data sets.
The box-plot consists of a scale, a box drawn between the first and third
quartile, the median placed within the box, whiskers on both sides of the
box and outliers (if any).
Nicholas J. Introduction to descriptive statistics. Sydney: Mathematics Learning Centre, University of Sydney; 1990.
85.
Nicholas J. Introduction to descriptive statistics. Sydney: Mathematics Learning Centre, University of Sydney; 1990.
Box-and-whisker Plots
86. Steps:
Represent the variable of interest on the horizontal axis.
Draw a box in the space above the horizontal axis in such a way that
the left end of the box aligns with the first quartile Q1 and the right end
of the box aligns with the third quartile Q3.
Divide the box into two parts by a vertical line that aligns with the
median Q2.
Nicholas J. Introduction to descriptive statistics. Sydney: Mathematics Learning Centre, University of Sydney; 1990.
87.
Draw a horizontal line called a whisker from the left end of the box to a
point that aligns with the smallest measurement in the data set.
Draw another horizontal line, or whisker, from the right end of the box
to a point that aligns with the largest measurement in the data set.
Nicholas J. Introduction to descriptive statistics. Sydney: Mathematics Learning Centre, University of Sydney; 1990.
88. Outliers
Outliers are data values uncommonly distant from the rest of the sample.
One method for identifying outliers is by calculating percentiles. Outliers
are considered to be those values which fall outside the range
P25 − 1.5 × IQR to P75 + 1.5 × IQR.
A graphical way to verify the existence of outliers is through the box-and-whisker
diagram or boxplot.
The boxplot consists of a box whose edges are the 25th and 75th percentiles,
with the median of the data marked inside, and lines or whiskers that extend to P25 − 1.5 × IQR
and P75 + 1.5 × IQR, identifying as outliers the points that fall
outside this range.
Nicholas J. Introduction to descriptive statistics. Sydney: Mathematics Learning Centre, University of Sydney; 1990.
89. E.g.:
Step 1: Marks of 21 students out of 100
30 44 45 46 47 48 49
49 50 51 52 53 53 54
54 59 59 61 70 81 81
Median (Q2) = 52
First quartile (Q1) = 47.5
Third quartile (Q3) = 59
IQR = 11.5
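The quartiles above, together with the 1.5 × IQR rule from the outliers slide, can be combined in a short sketch. It assumes the median-of-halves convention, with the middle observation excluded from both halves because n is odd; variable names are illustrative.

```python
from statistics import median

marks = [30, 44, 45, 46, 47, 48, 49,
         49, 50, 51, 52, 53, 53, 54,
         54, 59, 59, 61, 70, 81, 81]

data = sorted(marks)
half = len(data) // 2

# n = 21 is odd, so the median (11th value) is left out of both halves.
q1 = median(data[:half])
q2 = median(data)
q3 = median(data[half + 1:])
iqr = q3 - q1

# Fences from the 1.5 x IQR rule; points beyond them are flagged as outliers.
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr
outliers = [x for x in data if x < lower_fence or x > upper_fence]

print(q1, q2, q3, iqr)
print(lower_fence, upper_fence, outliers)
```

Under these assumptions the lowest mark and the two highest marks fall outside the fences.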
91. SUMMARY
Descriptive statistics are used to summarize data in an organized manner
by describing the characteristics of variables in a sample or population.
Calculating descriptive statistics should always occur before making
inferential statistical comparisons.
Descriptive statistics include types of variables (nominal, ordinal, interval,
and ratio) as well as measures of frequency, central tendency,
dispersion/variation, location, and shape.
Descriptive statistics condense data into a simpler summary, which makes
further analysis much easier.
Kaur P, Stoltzfus J, Yellapu V. Descriptive statistics.Int J Acad Med 2018;4:60-63
Perez S, Ruizb M. Descriptive statistics .Allergol Immunopathol.2009;37(6):314–320
92. CONCLUSION
Descriptive statistics form a critical part of initial data analysis.
However, descriptive statistics do not allow us to draw conclusions
beyond the data we have analysed or to reach conclusions regarding any
hypotheses we might have made.
They do, however, provide the foundation for comparing variables with inferential
statistical tests.
Therefore, as part of good research practice, it is essential that one report
the most appropriate descriptive statistics using a systematic approach to
reduce the likelihood of presenting misleading results.
Kaur P, Stoltzfus J, Yellapu V. Descriptive statistics.Int J Acad Med 2018;4:60-63
Perez S, Ruizb M. Descriptive statistics .Allergol Immunopathol.2009;37(6):314–320
93. PUBLIC HEALTH SIGNIFICANCE
Accurate, comprehensive, high-quality data and statistics form a core
element of evidence-based public health policy.
By raising health awareness among the general public, they can also help
achieve better social and health outcomes and reduce health inequalities.
Since the results of statistical analysis are fundamental in influencing the
future of public health and health sciences, the appropriate use of
descriptive statistics allows health-care administrators and providers to
weigh the impact of health policies and programs more effectively.
Perez S, Ruizb M. Descriptive statistics .Allergol Immunopathol.2009;37(6):314–320
Kaur P, Stoltzfus J, Yellapu V. Descriptive statistics.Int J Acad Med 2018;4:60-63
94. REFERENCES
1. Park K. Park's textbook of Preventive and social medicine. 25th ed. India: Bhanot Publishers; 2017.
2. Kothari CR. Research Methodology: Methods and Techniques. 2nd revised ed. New Delhi: New Age International Publishers; 2004.
3. Peter S. Essentials Of Public Health Dentistry. 5th ed. New Delhi: Arya Medi Publishing House; 2013.
4. Mahajan BK. Methods in Biostatistics For Medical Students And Research Workers. 6th ed. New Delhi: Jaypee Publishers; 1997.
5. Daniel WW, Cross CL. Biostatistics: basic concepts and methodology for the Health Sciences. 10th ed. New Delhi: Wiley; 2014.
6. Pagano M, Gauvreau K. Principles Of Biostatistics. 2nd ed. New Delhi: Cengage Learning India Pvt Ltd; 2000.
7. Kim JS, Dailey RJ. Biostatistics for oral healthcare. Ames: Blackwell Munksgaard; 2008.
8. Nicholas J. Introduction to descriptive statistics. Sydney: Mathematics Learning Centre, University of Sydney; 1990.
9. Porta M, Last JM, Greenland S. A Dictionary of Epidemiology. 5th ed. Oxford: Oxford University Press; 2008.
10. Eisenhauer JG. Degrees of Freedom. Teaching Statistics. 2008;30(3):75–80.
11. Kaur P, Stoltzfus J, Yellapu V. Descriptive statistics. Int J Acad Med 2018;4:60-63.
12. Perez S, Ruizb M. Descriptive statistics. Allergol Immunopathol. 2009;37(6):314–320.
population’ refers to the total of items about which information is desired,sample-a smaller (but hopefully representative) collection of units from a population used to determine truths about that population” (Field,
in that it does not have to indicate a total absence of the quantity being measured
10 km at 60 km/hr takes 10 minutes; 10 km at 20 km/hr takes 30 minutes; the total is 20 km in 40 minutes, so the average speed is 30 km/hr.
The number of values equal to or greater than the median is equal to the number of values equal to or less than the median.
Mean=5.73
A bimodal distribution often indicates that the population from which the values are taken actually consists of two distinct subgroups that differ in the characteristic being measured; in this situation,15-34,>55
It will be in the same units as the original
measurements.
For “n” less than 30
Then it is no longer true that all five observations are free to be replaced by random draws from the larger population.
The sum of the deviations of the values from their mean is equal to zero, as can be
shown. If, then, we know the values of n − 1 of the deviations from the mean, we know the
nth one, since it is automatically determined because of the necessity for all n deviations to add
to zero.
Try examples with 12345
K=2.5 round to 3,
For a given value of x,to find corresponding percentile=[(no of obsrvns<x)+0.5]/100
In finance, kurtosis is used as a measure of financial risk. A large kurtosis is associated with a high level of risk for an investment because it indicates that there are high probabilities of extremely large and extremely small returns. On the other hand, a small kurtosis signals a moderate level of risk because the probabilities of extreme returns are relatively low.
Kurtosis is the fourth moment in statistics.