This document provides an overview of key concepts in statistics and biostatistics, including variables, scales of measurement, types of data, and descriptive and inferential analysis. It defines statistics as the science of collecting, organizing, summarizing, and analyzing numerical data. Biostatistics specifically applies these statistical methods to medical data. Different types of data - nominal, ordinal, discrete, continuous - require different statistical analyses. Descriptive statistics summarize data through measures like mean, median, and standard deviation, while inferential statistics make predictions about larger datasets based on samples. The document outlines appropriate statistical tests and graphs to use for different types of medical data, such as chi-square for categorical variables and t-tests or ANOVA for continuous variables.
3. VARIABLE, SCALE, DATA
• A variable is a characteristic which varies.
• A scale is a device on which observations are
taken.
• Data is a set of observations/measurements of a
specific variable, taken from an experiment, survey
or external source using some appropriate
measurement scale.
4. What is Statistics?...
A science of:
• Collecting numerical information (data)
• Evaluating the numerical information (classify, summarize,
organize, analyze)
• Drawing conclusions based on the evaluation
5. Statistics and Bio-statistics
Statistics is generally understood as the subject dealing with
numbers and data. More broadly, it involves activities such as
collection of data from a survey or experiment,
summarization or management of data, presentation of
results in a convincing format, and analysis of data or drawing
valid inferences from findings.
Bio-statistics is the science which helps us manage
medical data by applying statistical
methods, techniques and tools; it is a collection of statistical
procedures particularly well suited to the analysis of
healthcare-related data.
6. What is medical data?
Data related to patient care, or numerical
information regarding patients’ clinical characteristics,
mortality rate, survival rate, disease distribution,
prevalence of disease, efficacy of treatment and
other such information, is called medical data.
7. NATURE OF DATA
• Data is the value you get from observing
(measuring, counting, assessing etc.) in an
experiment or survey.
• Data is either categorical or metric.
• Categorical data is further divided into
nominal and ordinal data,
• whereas metric (quantitative) data is divided into
discrete and continuous data.
10. Types of Data…
Quantitative Data:
There is a natural numeric scale
(can be subdivided into interval and ratio data)
Example:- age, height, weight
Qualitative Data:
Measuring a characteristic for which there is no
natural numeric scale (can be subdivided into
nominal and ordinal data)
Example:- Gender, Eye color
11. Quantitative data...
Discrete Data :
When data is taken from some counting process,
Values are distinct and separate.
Values are invariably whole numbers.
Example: Number of children in a family, number of patients in
different wards, number of nurses, number of hospitals in different cities.
Continuous Data :
When data is taken from some measuring process
Values span an uninterrupted range.
Can assume either integral or fractional values.
Example : Height, Weight, Age
12. Qualitative Data…
Nominal data :
To classify characteristics of people, objects or events
into categories.
No meaningful order of classes.
Example: Gender (Male / Female).
Ordinal data (Ranking scale) :
Characteristics can be put into ordered categories.
Example: Socio-economic status (Low/ Medium/ High).
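The difference between nominal and ordinal data can be sketched in a few lines of Python. The values below are made up for illustration, and the category order for socio-economic status is an assumption stated explicitly in the code.

```python
from collections import Counter

# Nominal data: categories have no order, so the only sensible operation is counting.
gender = ["Male", "Female", "Female", "Male", "Female"]  # hypothetical observations
gender_counts = Counter(gender)

# Ordinal data: categories have an order, which we must state explicitly.
SES_ORDER = ["Low", "Medium", "High"]                    # assumed ordering
ses = ["High", "Low", "Medium", "Low"]                   # hypothetical observations
ses_sorted = sorted(ses, key=SES_ORDER.index)            # sort by rank, not alphabetically

print(gender_counts)
print(ses_sorted)
```

Sorting by `SES_ORDER.index` rather than alphabetically is the point of the example: the order lives in the scale, not in the labels.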
13. Primary Scales of Measurement
| Scale | Basic Characteristics | Common Examples | Other Examples | Permissible Statistics: Descriptive | Permissible Statistics: Inferential |
|---|---|---|---|---|---|
| Nominal | Numbers identify & classify objects | Social Security nos., numbering of football players | Brand nos., store types | Percentages, mode | Chi-square, binomial test |
| Ordinal | Nos. indicate the relative positions of objects but not the magnitude of differences between them | Quality rankings, rankings of teams in a tournament | Preference rankings, market position, social class | Percentile, median | Rank-order correlation, Friedman ANOVA |
| Interval | Differences between objects can be compared | Temperature (Fahrenheit) | Attitudes, opinions, index | Range, mean, standard deviation | Product-moment correlation |
| Ratio | Zero point is fixed, ratios of scale values can be compared | Length, weight | Age, sales, income, costs | Geometric mean, harmonic mean | Coefficient of variation |
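The descriptive column of the table above can be read cumulatively: each scale permits its own statistics plus those of the scales below it. A minimal sketch of that lookup follows; the names `SCALES`, `NEW_AT_LEVEL` and `permissible_descriptive` are illustrative, not part of any library.

```python
# Order of the scales from weakest to strongest.
SCALES = ["nominal", "ordinal", "interval", "ratio"]

# Descriptive statistics that first become permissible at each scale (from the table).
NEW_AT_LEVEL = {
    "nominal": ["percentages", "mode"],
    "ordinal": ["percentile", "median"],
    "interval": ["range", "mean", "standard deviation"],
    "ratio": ["geometric mean", "harmonic mean"],
}

def permissible_descriptive(scale):
    """Return all descriptive statistics allowed at this scale and every weaker one."""
    level = SCALES.index(scale)
    allowed = []
    for name in SCALES[: level + 1]:
        allowed.extend(NEW_AT_LEVEL[name])
    return allowed

print(permissible_descriptive("ordinal"))
```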
14. Nominal Scale
The numbers serve only as labels or tags for identifying and
classifying objects.
When used for identification, there is a strict one-to-one
correspondence between the numbers and the objects.
The numbers do not reflect the amount of the characteristic
possessed by the objects.
The only permissible operation on the numbers in a nominal
scale is counting.
Examples: social security numbers, hockey players’ numbers, brands,
attributes, stores and other objects.
15. ORDINAL SCALE
• A ranking scale in which numbers are assigned to objects to indicate
the relative extent to which the objects possess some characteristic.
• Can determine whether an object has more or less of a characteristic
than some other object, but not how much more or less.
• Any series of numbers can be assigned that preserves the ordered
relationships between the objects.
• So the numbers show the relative position of objects, not the magnitude
of the difference between them.
• In addition to the counting operation allowable for nominal scale
data, ordinal scales permit the use of statistics based on percentiles,
quartiles and the median. They possess description and order, but not
distance or origin.
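Why the median is permissible for ordinal data but the mean is not can be checked numerically. The sketch below uses made-up rankings and a hypothetical order-preserving recoding: the median picks out the same category under both codings, while the mean does not.

```python
from statistics import mean, median

# Hypothetical ordinal responses coded 1..5 (e.g. a quality ranking).
responses = [1, 2, 2, 3, 5]

# Any other coding that preserves the order is equally valid on an ordinal scale.
recode = {1: 10, 2: 20, 3: 25, 4: 70, 5: 100}
recoded = [recode[r] for r in responses]

# The median survives the recoding: it still points at the same category.
median_is_stable = median(recoded) == recode[median(responses)]

# The mean does not: it depends on the arbitrary spacing between the codes.
mean_original = mean(responses)   # 2.6, which lies between categories 2 and 3
mean_recoded = mean(recoded)      # 35, which lies between categories 3 and 4

print(median_is_stable, mean_original, mean_recoded)
```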
16. INTERVAL SCALE
• Numerically equal distances on the scale represent equal
values in the characteristic being measured.
• It permits comparison of the differences between objects.
• The difference between 1 & 2 is the same as between 2 & 3.
• The location of the zero point is not fixed.
• Both the zero point and the units of measurement are
arbitrary.
• Examples: everyday temperature scales, attitudinal data obtained on
rating scales. Interval scales do not possess the origin characteristic
(a true zero and exact measurement).
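The temperature example can be worked through directly. In this sketch the three Celsius readings are arbitrary illustrative values; the conversion formula is the standard one. Equal differences stay equal after changing units, but ratios change, which is why ratios are not meaningful on an interval scale.

```python
def celsius_to_fahrenheit(c):
    """Standard conversion: a change of both unit and zero point."""
    return c * 9 / 5 + 32

# Three hypothetical readings, equally spaced in Celsius.
a_c, b_c, c_c = 10.0, 20.0, 30.0
a_f, b_f, c_f = (celsius_to_fahrenheit(t) for t in (a_c, b_c, c_c))

# Differences: equal in Celsius, and still equal in Fahrenheit.
equal_steps_c = (b_c - a_c) == (c_c - b_c)
equal_steps_f = (b_f - a_f) == (c_f - b_f)

# Ratios: "20 is twice 10" holds in Celsius only, so it says nothing about heat.
ratio_c = b_c / a_c   # 2.0
ratio_f = b_f / a_f   # 68 / 50 = 1.36

print(equal_steps_c, equal_steps_f, ratio_c, ratio_f)
```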
17. RATIO SCALE
• The highest scale: it allows us to identify objects, rank-order
objects, and compare intervals or differences. It is also
meaningful to compute ratios of scale values.
• Possesses all the properties of the nominal, ordinal, and interval
scales.
• It has an absolute zero point.
• Examples: height, weight, age, money. Sales, costs, market share and
number of customers are also variables measured on a ratio scale.
• All statistical techniques can be applied to ratio data.
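The statistics that need a true zero can be computed with Python's standard `statistics` module (Python 3.8 or later for `geometric_mean`). The weights below are made-up values; the sketch also shows that the coefficient of variation does not depend on the unit of measurement.

```python
from statistics import geometric_mean, harmonic_mean, mean, stdev

weights_kg = [50.0, 60.0, 70.0, 80.0]           # hypothetical ratio-scale data
weights_g = [w * 1000 for w in weights_kg]      # same data in another unit

gm = geometric_mean(weights_kg)
hm = harmonic_mean(weights_kg)
am = mean(weights_kg)

# Coefficient of variation: standard deviation relative to the mean.
cv_kg = stdev(weights_kg) / mean(weights_kg)
cv_g = stdev(weights_g) / mean(weights_g)       # unchanged by the change of unit

print(hm, gm, am, cv_kg, cv_g)
```

For positive data the harmonic mean is never larger than the geometric mean, which is never larger than the arithmetic mean.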
18. Data Analysis
• After accurate and reliable data have been collected
from the source by an appropriate method,
the next step is to extract the
pertinent and useful information buried in the data
for further manipulation and interpretation.
• The process of performing certain calculations and
evaluation in order to extract relevant information
from data is called data analysis.
19. Data Analysis (cont.)
• Data analysis may take several steps to reach
certain conclusions. Simple data can be organized
very easily, while complex data requires proper
processing.
• The word “processing” means recasting and
handling the data to make it ready for analysis.
21. QUESTIONNAIRE CHECKING
A questionnaire returned from the field may be
unacceptable for several reasons:
• Parts of the questionnaire may be incomplete.
• The pattern of responses may indicate that the respondent did not
understand or follow the instructions.
• The responses show little variance.
• One or more pages are missing.
• The questionnaire is received after the pre-established cutoff date.
• The questionnaire is answered by someone who does not qualify for
participation.
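Some of these checks can be automated. The sketch below is one possible way to do it, assuming each returned questionnaire is stored as a dictionary with the hypothetical keys `received`, `eligible` and `answers`. It covers incompleteness, low variance, late return and eligibility; missing pages and misunderstood instructions still need a human reviewer.

```python
from datetime import date

def check_questionnaire(q, cutoff, required_items, min_distinct=2):
    """Return a list of reasons why a questionnaire may be unacceptable."""
    problems = []
    answers = q["answers"]
    # Incomplete: at least one required item has no answer.
    if any(answers.get(item) is None for item in required_items):
        problems.append("incomplete")
    # Little variance: the respondent gave (almost) the same answer everywhere.
    given = [v for v in answers.values() if v is not None]
    if given and len(set(given)) < min_distinct:
        problems.append("little variance")
    # Late: received after the pre-established cutoff date.
    if q["received"] > cutoff:
        problems.append("late")
    # Answered by someone who does not qualify for participation.
    if not q["eligible"]:
        problems.append("ineligible respondent")
    return problems

cutoff = date(2024, 3, 31)                       # hypothetical cutoff date
items = ["q1", "q2", "q3"]                       # hypothetical item names
good = {"received": date(2024, 3, 1), "eligible": True,
        "answers": {"q1": 4, "q2": 2, "q3": 5}}
bad = {"received": date(2024, 4, 2), "eligible": False,
       "answers": {"q1": 3, "q2": 3, "q3": None}}

print(check_questionnaire(good, cutoff, items))
print(check_questionnaire(bad, cutoff, items))
```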
22. DATA PREPARATION
Preparation of data file
• It is important to convert raw data into usable data for
analysis (coding where needed), i.e. to transfer
information from the questionnaire to a computer database.
• The analysis and results depend on the quality
of the data.
• Errors are possible in handling instruments,
raw data, transcribing, data entry, and assigning codes, values and
value labels.
• Data need to be cleaned to fulfill the conditions of the analysis.
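Coding can be illustrated with a small codebook that maps questionnaire answers to numeric codes. This is only a sketch: the `CODEBOOK` contents and the `encode` helper are hypothetical, apart from the coding 1=male, 0=female, which is the one used in the data cleaning example in these slides.

```python
# Hypothetical codebook: field name -> {answer text: numeric code}.
CODEBOOK = {
    "sex": {"male": 1, "female": 0},
    "ses": {"low": 1, "medium": 2, "high": 3},
}

def encode(raw_row):
    """Return a coded copy of one questionnaire row."""
    coded = {}
    for field, answer in raw_row.items():
        mapping = CODEBOOK.get(field)
        if mapping is None:
            coded[field] = answer                # no codebook entry: keep as entered
        else:
            key = answer.strip().lower() if isinstance(answer, str) else answer
            coded[field] = mapping.get(key)      # None marks an answer to be checked
    return coded

row = {"sex": " Male", "ses": "High", "age": 34}  # made-up raw entry
print(encode(row))
```

Returning `None` for an unrecognised answer, instead of guessing a code, keeps transcription errors visible for the cleaning step.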
24. Data cleaning
• One of the first steps in analyzing data is to
“clean” it of any obvious data entry errors:
• Outliers? (really high or low numbers)
Example: Age = 110 (really 10 or 11?)
• Value entered that doesn’t exist for the variable?
Example: 2 entered where 1=male, 0=female
• Missing values?
Did the person not give an answer? Was the answer
accidentally not entered into the database?
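The three checks above (outliers, undefined codes, missing values) can be sketched as a function that flags a record for review. The age limits of 0 to 100 are an assumption chosen for illustration; a flagged value should be checked against the original questionnaire, not corrected automatically.

```python
def flag_record(record, age_limits=(0, 100), sex_codes=(0, 1)):
    """Return a list of data-entry problems found in one record."""
    flags = []
    # Missing values: no answer was given or it was never entered.
    for field in ("age", "sex"):
        if record.get(field) is None:
            flags.append(f"{field}: missing")
    age, sex = record.get("age"), record.get("sex")
    # Outlier: value outside the assumed plausible range.
    if age is not None and not (age_limits[0] <= age <= age_limits[1]):
        flags.append("age: outside expected range")
    # Value that does not exist for the variable (1=male, 0=female).
    if sex is not None and sex not in sex_codes:
        flags.append("sex: code not defined")
    return flags

print(flag_record({"age": 110, "sex": 2}))
print(flag_record({"age": 35, "sex": 1}))
```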
25. Data cleaning (cont.)
• It may be possible to set defined limits when entering data.
This prevents entering a 2 when only 1, 0, or missing are acceptable
values.
• Univariate data analysis is a useful way to check the
quality of the data.
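A univariate check looks at one variable at a time across the whole data set. A minimal sketch with made-up columns: a frequency table exposes codes that should not exist, and the minimum and maximum expose implausible values.

```python
from collections import Counter

# Hypothetical columns from a data file (1=male, 0=female, None=missing).
sex_column = [1, 0, 0, 1, 2, None, 1, 0]
age_column = [34, 27, 110, 45, 52, 61, 19, 38]

# Frequency table for a categorical variable.
sex_freq = Counter(sex_column)
unexpected_codes = sorted(code for code in sex_freq if code not in (0, 1, None))

# Range for a continuous variable.
age_min, age_max = min(age_column), max(age_column)

print(sex_freq)
print(unexpected_codes, age_min, age_max)
```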
27. Statistical Applications...
Descriptive Statistics
Summarize or describe the data set at
hand. They evaluate the data set for patterns and
reduce information to a convenient form.
Inferential Statistics
Use sample data to study associations, compare
differences, or make predictions about a
larger set of data.
28. Descriptive Statistics…
Measures of central tendency are
statistics that summarize a distribution
of scores by reporting the most typical
or representative value of the
distribution.
Measures of dispersion are statistics that
indicate the amount of variety or
heterogeneity in a distribution of scores.
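Both kinds of measure are available in Python's standard `statistics` module. The scores below are made-up values used only to show the calls.

```python
from statistics import mean, median, mode, stdev

scores = [2, 3, 3, 4, 5, 7, 11]          # hypothetical distribution of scores

# Measures of central tendency: the most typical value.
score_mean = mean(scores)                # arithmetic average
score_median = median(scores)            # middle value of the sorted scores
score_mode = mode(scores)                # most frequent value

# Measures of dispersion: how much the scores vary.
score_range = max(scores) - min(scores)
score_sd = stdev(scores)                 # sample standard deviation (n - 1)

print(score_mean, score_median, score_mode, score_range, round(score_sd, 2))
```

The single large score pulls the mean above the median, which is one reason to report both.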
30. Data analysis may be descriptive or inferential
Descriptive analysis includes the mean, median, mode, standard
deviation, frequency, percentage, range and percentile.
Inferential analysis, on the other hand, includes confidence intervals,
hypothesis testing, p-values, ANOVA etc.
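As one example of an inferential quantity, a confidence interval for a mean can be sketched as follows. This uses the large-sample normal approximation, which is an assumption made here for simplicity; for small samples a t-based interval would normally be used instead. The sample values are made up.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev

def mean_confidence_interval(sample, level=0.95):
    """Approximate confidence interval for the population mean (normal approximation)."""
    n = len(sample)
    centre = mean(sample)
    standard_error = stdev(sample) / sqrt(n)
    z = NormalDist().inv_cdf(0.5 + level / 2)    # about 1.96 for a 95% interval
    return centre - z * standard_error, centre + z * standard_error

sample = list(range(100, 140))                   # 40 hypothetical measurements
lower, upper = mean_confidence_interval(sample)
print(lower, upper)
```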
31. UNI-VARIATE DESCRIPTIVE ANALYSIS
Graphical method
• For nominal & ordinal data we use a bar or pie chart
• For continuous data we use a histogram
Numerical method
• For nominal & ordinal data we use frequencies/proportions
• For continuous data we use the mean and standard deviation
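The numbers behind these displays are simple to compute. In this sketch (made-up data, and an assumed bin width of 10 cm), proportions are what a bar or pie chart would show for a nominal variable, and the bin counts are what a histogram would show for a continuous one.

```python
from collections import Counter

# Nominal variable: proportions per category (the heights of the bars).
blood_group = ["A", "O", "B", "O", "AB", "O", "A", "O"]   # hypothetical data
counts = Counter(blood_group)
proportions = {group: n / len(blood_group) for group, n in counts.items()}

# Continuous variable: counts per bin (the heights of the histogram bars).
heights_cm = [152, 158, 161, 165, 167, 170, 171, 178, 183]  # hypothetical data
bin_width = 10                                               # assumed bin width
histogram = Counter((h // bin_width) * bin_width for h in heights_cm)

print(proportions)
for start in sorted(histogram):
    print(f"{start}-{start + bin_width - 1}: {'#' * histogram[start]}")
```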
32. Summary Guide
| | Scale (continuous) | Nominal | Ordinal |
|---|---|---|---|
| Displaying data | Histogram, Box-plot | Bar chart, Pie chart | Bar chart, Pie chart |
| Summarizing data | Mean, Median, SD | Frequency table, Percentages, Proportion | Frequency table, Percentages, Proportion |
33. Summary guide for appropriate analysis of two variables
| Type of variables | Graphical display | Relationship |
|---|---|---|
| Categorical-categorical | Multiple bar chart | Contingency table |
| Categorical-scale | Box-plot | Descriptive statistics for each group |
| Scale-scale | Scatter plot | Correlation |
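The "Relationship" column can be sketched with the standard library alone. All data below are made up, and `pearson_r` is a hand-written helper for the product-moment correlation rather than a library function.

```python
from collections import Counter, defaultdict
from math import sqrt
from statistics import mean

# Hypothetical data for six patients.
sex = ["M", "F", "F", "M", "F", "M"]
smoker = ["yes", "no", "no", "no", "yes", "yes"]
height = [180, 165, 168, 175, 160, 183]
weight = [80, 60, 62, 75, 58, 85]

# Categorical-categorical: contingency table as counts of (row, column) pairs.
contingency = Counter(zip(sex, smoker))

# Categorical-scale: descriptive statistics for each group.
by_group = defaultdict(list)
for s, w in zip(sex, weight):
    by_group[s].append(w)
group_means = {group: mean(values) for group, values in by_group.items()}

# Scale-scale: Pearson product-moment correlation.
def pearson_r(x, y):
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

r = pearson_r(height, weight)
print(contingency, group_means, round(r, 3))
```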