3. Dedicated to
Professor S. G. Deshmukh
Slides mostly from Prof. Deshmukh’s Statistics course at IIT Delhi
4. What is Statistics?
• Science of gathering, analyzing, interpreting, and presenting data,
and drawing conclusions.
• Scientific method that enables us to make decisions as responsibly
as possible.
• The word “statistics” is both plural and singular.
• Plays an important role in every area of decision making
• Often incorrectly thought of as just a collection of data, graphs and
diagrams
AU@IITR
5. Statistics in Business
• Accounting — auditing and cost estimation
• Economics — regional, national, international performance
• Finance — investments and portfolio management
• Management — HR, compensation, and Quality management
• MIS - performance of systems which gather, summarize, and
disseminate information to various managerial levels
• Marketing — market analysis and consumer research
6. Answers Questions from Everyday Life
• Education: Which B-school offers the highest return on investment (RoI)?
• Business: Will a new marketing strategy be profitable?
• Industry: Will a product’s life exceed the warranty period?
• Medicine: Will a new vaccine reduce the chance of contracting COVID?
• Government: Will a change in interest rates affect inflation?
7. Areas of concern: Some examples
• ToI: whether an increase in the subscription price will adversely
affect the number of subscribers
• Pepsi: whether a celebrity’s advertisements have led to increased
sales
• Ministry of Home Affairs: impact of streamlined procedures for
passport applications
• Supreme Court: whether the use of CNG vehicles or the Odd-Even rule
has reduced the level of pollution in Delhi
8. Statistics is all-pervading!
• In Cricket (Ex: Records of centuries, wickets etc.)
• In Movies (Ex: imdb.com )
• In Media (Ex: TV serial ratings)
• In Stock market (Ex: Share prices)
• In National Economy (Ex: WPI, Inflation, Growth, etc.)
9. Statistics Day: 29th June
Birth anniversary of the great statistician Prof. P. C.
Mahalanobis
– Founder of the Indian Statistical Institute (1931)
– Started the journal Sankhya
– Central Statistical Organization (CSO) for systematization
and collection of administrative data
– National Sample Survey Organization (NSSO) for
conducting large scale surveys to support policy planning
10. Decision-making process
1. Collect pertinent information that is as reliable as possible.
2. Select the parts of the available information that are most helpful
to make rational decisions.
3. Draw conclusions as sensibly as possible based on the available
evidence.
4. Evaluate the risk and value (performance measures) of alternative
actions.
5. Make the decision
12. Statistics: Science of variability?
• Practically everything varies
• Variation occurs among individuals, processes
• Variation also occurs over time
13. Population Versus Sample
• Population — the whole
– a collection of persons, objects, or items under study
– Census — gathering data from the entire population
• Sample — a portion of the whole
– a subset of the population
– a part of the population from which we collect information, used
to draw conclusions about the whole (statistical inference)
• Why not collect information for the whole population?
14. Statistics: Two broad categories
• Descriptive Statistics — using data gathered on a group to
describe or reach conclusions about that same group only.
• Inferential Statistics — using sample data to reach conclusions
about the population from which the sample was taken.
15. Descriptive statistics
• Encompasses the following:
– Graphical or pictorial display of patterns
– Condensation of large masses of data into a form such as tables
– Preparation of summary measures to give a concise
description of complex information (e.g. an average figure)
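The three activities listed above can be sketched in a few lines of Python. The exam scores are hypothetical values invented for illustration, and the use of the standard-library `statistics` and `collections` modules is an assumption of this sketch, not something the slides prescribe.

```python
from collections import Counter
from statistics import mean, median, stdev

# Hypothetical exam scores for one class (illustrative data only).
scores = [62, 71, 71, 55, 90, 84, 71, 62, 77, 68]

# Condensation: a frequency table of how often each score occurs.
frequency = Counter(scores)

# Summary measures: a concise description of the whole data set.
summary = {
    "n": len(scores),
    "mean": mean(scores),
    "median": median(scores),
    "std_dev": stdev(scores),  # sample standard deviation
}

# Pictorial display: a crude text bar chart, one '#' per occurrence.
for value in sorted(frequency):
    print(f"{value:3d} | {'#' * frequency[value]}")

print(summary)
```

Every number produced here describes these ten scores only; nothing is claimed about any larger group.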
16. Inferential Statistics
• Encompasses the following:
– Determining whether characteristics of a situation are usual or
unusual (happened by chance), e.g. statistical quality control (SQC)
– Estimating values of numerical quantities and determining the
reliability of those estimates, e.g. confidence intervals
– Using past occurrences to attempt to predict the future
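As one possible illustration of estimating a quantity and its reliability, the sketch below computes an approximate confidence interval for a population mean. It assumes the normal (z) approximation, which is a simplification for small samples; the function name `mean_confidence_interval` and the lifetime data are hypothetical.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev

def mean_confidence_interval(sample, confidence=0.95):
    """Approximate confidence interval for the population mean.

    Assumption: uses the normal (z) approximation; for small samples a
    t-based interval would be somewhat wider.
    """
    n = len(sample)
    x_bar = mean(sample)
    std_error = stdev(sample) / sqrt(n)  # standard error of the mean
    z = NormalDist().inv_cdf(0.5 + confidence / 2)  # two-sided critical value
    return x_bar - z * std_error, x_bar + z * std_error

# Hypothetical product lifetimes in months (illustrative data only).
lifetimes = [26.1, 24.8, 27.3, 25.5, 26.9, 25.0, 26.4, 27.8, 24.6, 25.9]
low, high = mean_confidence_interval(lifetimes)
print(f"95% CI for the mean lifetime: ({low:.2f}, {high:.2f}) months")
```

A narrower interval indicates a more reliable estimate; raising `confidence` widens the interval.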
17. Types of Studies
• Observational Studies
– Observe individuals and measure variables of interest but do not
attempt to influence the responses.
– Purpose is to describe some group or situation.
– No outside interference; subjects select themselves into groups, so
nothing can be said about cause and effect.
• Designed Experiments
– Impose some treatment(s) on individuals or groups of individuals in
order to observe their responses.
– Purpose of an experiment is to study whether the treatment(s) causes a
change in the response.
18. Examples
• Scientific Surveys
– Central or state government surveys
– Institutional surveys
– NGO surveys
– Commercial survey research firms (IMRB)
• Designed Experiments
– Laboratory experiments
– Clinical Trials
– Field experiments
19. Discussion Example
• A professor needed some data to illustrate a point. His
favorite student went out into the lobby and asked the first 12
male students who walked by what their height and weight
were.
• What are the limitations of this data set? What could we infer
about the population of all students from this data set?
20. Discussion Example (continued)
• Population
– Set of all elements of interest in a particular study
– Example: Set of all IIT Roorkee students
• Sample
– A subset of the population
– Example: Set of all MBA 1st year students?
21. Parameter vs. Statistic
• Parameter — descriptive measure of the population
• Statistic — descriptive measure of a sample
Measurement               Statistic                Parameter
                          (Roman or lowercase)     (Greek or uppercase)
Data elements             x                        X
Mean                      x̄                        μ
Standard deviation        s                        σ
Variance                  s²                       σ²
Number of elements        n                        N
Correlation coefficient   r                        ρ
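The distinction in the table can be made concrete with a small sketch. The population and sample values below are hypothetical, and the variable names simply mirror the table's symbols. It relies on the standard-library `statistics` module, where `pvariance`/`pstdev` divide by N (population) and `variance`/`stdev` divide by n - 1 (sample).

```python
from statistics import mean, pstdev, pvariance, stdev, variance

# Hypothetical population of N = 8 values (illustrative data only).
population = [4, 8, 6, 5, 3, 7, 9, 6]
# A sample of n = 4 elements from it (hand-picked for this sketch).
sample = [4, 6, 7, 9]

# Parameters describe the population (variance divides by N).
N = len(population)
mu = mean(population)
sigma2 = pvariance(population)
sigma = pstdev(population)

# Statistics describe the sample (variance divides by n - 1).
n = len(sample)
x_bar = mean(sample)
s2 = variance(sample)
s = stdev(sample)

print(f"Parameter: N={N}, mu={mu}, sigma^2={sigma2}, sigma={sigma:.3f}")
print(f"Statistic: n={n}, x_bar={x_bar}, s^2={s2:.3f}, s={s:.3f}")
```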
22. Process of Inferential Statistics
[Diagram: Population (parameter) → select a random sample → Sample (statistic x̄) → calculate x̄ to estimate the parameter]
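The process in the diagram can be simulated as a rough sketch. The population here is synthetic (simulated heights with an assumed mean of 168 cm and standard deviation of 9 cm), so the parameter is known and can be compared with the estimate; in practice it would be unknown.

```python
import random
from statistics import mean

random.seed(42)  # fixed seed so the sketch is reproducible

# Hypothetical population: heights (cm) of 10,000 students (simulated).
population = [random.gauss(168, 9) for _ in range(10_000)]
mu = mean(population)  # the parameter, normally unknown in practice

# Step 1: select a random sample (without replacement).
sample = random.sample(population, k=100)

# Step 2: calculate the statistic x_bar from the sample.
x_bar = mean(sample)

# Step 3: use x_bar as an estimate of mu.
print(f"Population mean (parameter): {mu:.2f}")
print(f"Sample mean (statistic):     {x_bar:.2f}")
print(f"Estimation error:            {x_bar - mu:+.2f}")
```

Only 1% of the population is measured, yet the sample mean typically lands close to the population mean because the sample was drawn at random.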
23. What are Data?
• Data: Systematically recorded information together with
context
• Context tells:
– What was measured
– Where data were collected
– When data were collected
– Why the study was performed
– How data were collected
• Data are useless without context.
• Note: “data” is plural and “datum” is singular.
24. Data...
• Secondary data: Data that have been gathered earlier for
some other purpose
– Sources: Company reports, GoI reports, RBI reports etc.
• Primary data: Data that are collected first hand specifically for
the purpose of facilitating a study
– Sources: Observations, Questionnaire, Interview etc.
25. Examples of Data available from a company
Employee records: Name, code, designation, address, salary, leave
Production record: Item code, quantity produced, labor cost, material cost
Inventory record: Item code, units-on-hand, reorder level, order quantity
Sales record: Product number, volume, volume by region, category of item etc.
Customer record: Age, gender, income level, address, quantity purchased
26. Examples of Data available from various Agencies
Reserve Bank of India (www.rbi.org.in): Lending/borrowing rates, financial health of the country
Census of India (www.censusindia.net): Population figures, demographic details
Centre for Monitoring of Indian Economy (www.cmie.com): Economic indicators related to Indian economy, sector-wise performance
Confederation of Indian Industry (www.cii.in): Business performance, company records etc.
IIT Roorkee (AIS): Academic related data
27. Levels of Data Measurement
• Nominal — Lowest level of measurement
• Ordinal
• Interval
• Ratio — Highest level of measurement
28. Nominal Level Data
• Numbers are used to classify or categorize
▪ Also known as categorical data
▪ Employment classification: 1 for Professor, 2 for Staff, 3 for Contractual Workers
▪ Gender: “M”, “F”
▪ Degree of a student at IIT Roorkee: 1 for B Tech, 2 for M Tech, 3 for M Sc, 4 for MBA, 5 for PhD
29. Ordinal Level Data
▪ Numbers are used to indicate rank or order
▪ Relative magnitude of numbers is meaningful
▪ Differences between numbers are not comparable
▪ Performance: 5 Excellent, 4 Good, 3 Average, 1 Poor
▪ Position within an organization: 1 President, 2 VP, 3 Plant Manager, 4 Supervisor, 5 Labor
▪ Survey response scale: 1 Strongly Agree, 2 Agree, 3 Neutral, 4 Disagree, 5 Strongly Disagree
30. Interval Level Data
• Distances between consecutive integers are equal
– Relative magnitude of numbers is meaningful
– Differences between numbers are comparable
– Location of origin, zero, is arbitrary
Examples: Date, Clock time, Monetary Utility, Temperature (degrees
F/C)
31. Ratio Level Data
• Highest level of measurement
– Relative magnitude of numbers is meaningful
– Differences between numbers are comparable
– Location of origin, zero, is absolute (natural)
Examples: Height, Weight, Volume, Profit, Loss, Revenues, Inventory
Turnover
32. Usage potential of various levels of data
[Figure: levels of data grouped into Qualitative / Categorical and Quantitative / Numerical]
Quantitative variables can also be classified into Discrete & Continuous.
33. Data Level, Operations, & Statistical Methods
Data Level   Meaningful Operations                          Statistical Methods
Nominal      Classifying and Counting                       Nonparametric
Ordinal      All of the above plus Ranking                  Nonparametric
Interval     All of the above plus Addition, Subtraction,   Parametric
             Multiplication, and Division
Ratio        All of the above                               Parametric
Some control over the measurement scale:
• Temperature: measured in degrees C/F → interval scale; measured in kelvin → ratio scale
• Income: asked as categories (low, medium, high) → ordinal scale; asked as actual income → ratio scale
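The temperature example can be checked numerically. The sketch below, with hypothetical helper names, shows that a ratio of two Celsius readings changes when the same temperatures are expressed in Fahrenheit, because both scales have an arbitrary zero, whereas the kelvin scale has an absolute zero.

```python
def celsius_to_kelvin(celsius):
    """Convert degrees Celsius to kelvin (absolute zero is 0 K)."""
    return celsius + 273.15

def celsius_to_fahrenheit(celsius):
    """Convert degrees Celsius to degrees Fahrenheit."""
    return celsius * 9 / 5 + 32

cold_c, warm_c = 10.0, 20.0

# Differences are comparable on an interval scale...
difference_c = warm_c - cold_c

# ...but ratios depend on the arbitrary zero, so they disagree across units.
ratio_c = warm_c / cold_c
ratio_f = celsius_to_fahrenheit(warm_c) / celsius_to_fahrenheit(cold_c)

# On the kelvin (ratio) scale the zero is absolute, so the ratio is meaningful.
ratio_k = celsius_to_kelvin(warm_c) / celsius_to_kelvin(cold_c)

print(f"Celsius ratio:    {ratio_c:.3f}")
print(f"Fahrenheit ratio: {ratio_f:.3f}")
print(f"Kelvin ratio:     {ratio_k:.3f}")
```

Saying that 20 °C is "twice as hot" as 10 °C is therefore not meaningful; only the kelvin ratio reflects a physical proportion.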
34.
OK to compute                                Nominal  Ordinal  Interval  Ratio
Frequency distribution                       Yes      Yes      Yes       Yes
Median and percentiles                       No       Yes      Yes       Yes
Add or subtract                              No       No       Yes       Yes
Mean, std deviation, std error of the mean   No       No       Yes       Yes
Ratios, coefficient of variation             No       No       No        Yes
Knowledge of the measurement scale can prevent mistakes
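One way to put this table to work is to encode it as a lookup and check a computation before running it. This is only a sketch: the names `LEVELS`, `MINIMUM_LEVEL`, and `ok_to_compute` are hypothetical, and the mapping simply records the lowest level at which the table says "Yes".

```python
# Levels of measurement, ordered from lowest to highest.
LEVELS = ["nominal", "ordinal", "interval", "ratio"]

# Lowest level at which each computation is meaningful (from the table above).
MINIMUM_LEVEL = {
    "frequency distribution": "nominal",
    "median and percentiles": "ordinal",
    "add or subtract": "interval",
    "mean, std deviation, std error of the mean": "interval",
    "ratios, coefficient of variation": "ratio",
}

def ok_to_compute(computation, level):
    """Return True if `computation` is meaningful for data at `level`."""
    required = MINIMUM_LEVEL[computation]
    return LEVELS.index(level) >= LEVELS.index(required)

# Example: degree codes (1 = B Tech, 2 = M Tech, ...) are nominal data,
# so averaging them is not meaningful even though Python would allow it.
print(ok_to_compute("mean, std deviation, std error of the mean", "nominal"))
print(ok_to_compute("median and percentiles", "ordinal"))
```

Software will happily average category codes; a guard of this kind makes the measurement scale an explicit part of the analysis.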
35. Methods of visual presentation of data:
Graphs & Tables → see the book by Levin, Chapter 2
36. Can Statistics be trusted?
It is easy to lie with statistics. But it is easier to lie without them.
Frederick Mosteller
Figures won’t lie, but liars will figure.
Charles Grosvenor
There are three kinds of lies: Lies, damned lies, and statistics.
Mark Twain
Science without Statistics bears no fruit,
Statistics without Science has no roots!