Zahid Mian
Part of the Brown-bag Series
• Population
• Sample
• Variable
• Statistic
• Sample
• Skew
• Mean
• Median
• Mode
• Range
• Percentile
• Variance
• Standard Deviation
• Covariance
• Correlation Coefficient
• Skewness
• Why do we care about central tendency?
• Which is most valuable to you:
▪ Average price of a home in a neighborhood
▪ Median price of a home …
▪ Range of prices …
▪ Mode …
• What does it say about a neighborhood if:
▪ Average price is $500K
▪ Median price is $350K
▪ Range is $750K
• Mean for a population: $\mu = \frac{\sum X}{N}$
• Mean for a sample: $\bar{x} = \frac{\sum X}{n}$
• Data: 1,2,3,4,4,5,5,5,5,6
• Mean = 40 / 10 = 4
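As a minimal sketch in R (the language of the console snippets in these slides), the mean of the example data can be computed from the formula and checked against the built-in `mean`. The variable names are illustrative only.

```r
# Example data from the slide
x <- c(1, 2, 3, 4, 4, 5, 5, 5, 5, 6)

# Mean from the definition: sum of the values divided by how many there are
manual_mean <- sum(x) / length(x)

# Built-in equivalent
builtin_mean <- mean(x)

print(manual_mean)   # 4
print(builtin_mean)  # 4
```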
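• Median: the “middle” value from the sorted list
• $\text{Median} = \left(\frac{n+1}{2}\right)^{\text{th}}$ term
• Data: 1,2,3,4,4,5,5,5,5,6 Median: 4.5 (n is even, so average the 5th and 6th terms, 4 and 5)
• Data: 1,2,3,4,4,5,5,5,5,6,7 Median: 5 (the 6th term)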
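The two median examples can be reproduced with a short R sketch. `median` handles both the even and odd case; the manual lookup at position (n+1)/2 is shown only for the odd-length list, where that position is a whole number. Variable names are illustrative.

```r
even_n <- c(1, 2, 3, 4, 4, 5, 5, 5, 5, 6)   # 10 values
odd_n  <- c(even_n, 7)                      # 11 values

med_even <- median(even_n)  # averages the 5th and 6th sorted values
med_odd  <- median(odd_n)   # picks the 6th sorted value

# Manual lookup for the odd-length case: the ((n + 1) / 2)th sorted term
sorted <- sort(odd_n)
manual_med_odd <- sorted[(length(sorted) + 1) / 2]

print(med_even)        # 4.5
print(med_odd)         # 5
print(manual_med_odd)  # 5
```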
• Mode: the value that occurs most often
• Data: 1,2,3,4,4,5,5,5,5,6,7
• Mode: 5 (appears 4 times)
> table(c(1,2,3,4,4,5,5,5,5,6,7))
1 2 3 4 5 6 7
1 1 1 2 4 1 1
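Base R has no statistical mode function (its `mode()` reports an object's storage mode), so the most frequent value has to be read off the `table` output. A minimal sketch, with illustrative variable names:

```r
x <- c(1, 2, 3, 4, 4, 5, 5, 5, 5, 6, 7)

# Frequency of each distinct value
counts <- table(x)

# Keep the value(s) whose count equals the highest count;
# more than one value is returned if there is a tie
mode_value <- as.numeric(names(counts)[as.vector(counts) == max(counts)])

print(mode_value)   # 5
print(max(counts))  # 4
```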
• Percentile: the value that cuts off the lowest n percent of the data
• Quiz Scores:
67,72,88,82,80,90,95,60,77,89,99,85,77
• What is the score that cuts off 30% of all scores? 50%?
> quantile(c(67,72,88,82,80,90,95,60,77,89,99,85,77), c(.3, .5))
30% 50%
 77  82
Roughly 30% of all scores were 77 or below; roughly 50% of all scores were 82 or below.
• Box plot: a graph showing the data within quartiles
[Figure: box plot labeled with the max value, Q3 (75%), median, Q1 (25%), and min value]
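The five values a box plot marks can be computed with `quantile`. This sketch reuses the quiz scores from the percentile example; that the original figure was drawn from this data is an assumption.

```r
scores <- c(67, 72, 88, 82, 80, 90, 95, 60, 77, 89, 99, 85, 77)

# Min, Q1 (25%), median (50%), Q3 (75%), max
five <- quantile(scores, c(0, 0.25, 0.5, 0.75, 1))
print(five)

# Interquartile range: the width of the box
iqr <- IQR(scores)
print(iqr)

# boxplot(scores) would draw the corresponding plot
```

For these scores the quartiles are 77, 82, and 89, so the box spans 12 points.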
• How to see the “dispersion” (spread) of data
• Sets of quiz scores for different classes:
▪ Set1: 80, 79, 80, 81, 80, 80, 79, 79
▪ Set2: 75, 100, 60, 100, 100, 75, 75, 60
• Just by looking at the numbers you should be able to tell that Set2 is more dispersed
• Standard deviation measures the dispersion around the mean
• Other measures include range and variance
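A quick R sketch makes the comparison concrete. Both sets have a mean near 80, but their standard deviations differ widely. Note that `sd` computes the sample standard deviation (dividing by n - 1).

```r
set1 <- c(80, 79, 80, 81, 80, 80, 79, 79)
set2 <- c(75, 100, 60, 100, 100, 75, 75, 60)

# Similar centers
mean1 <- mean(set1)  # 79.75
mean2 <- mean(set2)  # 80.625

# Very different spreads
sd1 <- sd(set1)  # about 0.71
sd2 <- sd(set2)  # about 17.2

print(c(mean1, mean2))
print(c(sd1, sd2))
```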
• Range: the difference between the highest and lowest values in the data
• Data: 1,2,3,4,4,5,5,5,5,6,7
• Range: 6 (7 - 1)
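In R, `range` returns the minimum and maximum rather than their difference, so the statistical range needs one more step. A minimal sketch:

```r
x <- c(1, 2, 3, 4, 4, 5, 5, 5, 5, 6, 7)

# range() gives c(min, max)
limits <- range(x)

# The statistical range is the distance between them
data_range <- diff(limits)

print(limits)      # 1 7
print(data_range)  # 6
```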
• Variance: measures dispersion around the mean (how far values typically fall from it)
• $\sigma^2 = \frac{\sum (x - \mu)^2}{N}$
• Steps to calculate population variance ($\sigma^2$)
• Calculate the mean
• For each number in the set
▪ subtract the mean
▪ square the result (why square it?)
• Get the average of the squared differences
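The steps above can be sketched in R, using the quiz scores that appear elsewhere in these slides as example data. One caveat: R's built-in `var` divides by n - 1 (the sample variance), so it does not return the population variance computed by these steps.

```r
scores <- c(67, 72, 88, 82, 80, 90, 95, 60, 77, 89, 99, 85, 77)

# Step 1: calculate the mean
mu <- mean(scores)

# Step 2: subtract the mean from each value and square the result
sq_diff <- (scores - mu)^2

# Step 3: average the squared differences (divide by N)
pop_var <- mean(sq_diff)

# For comparison: the sample variance divides by n - 1 instead
sample_var <- var(scores)

print(pop_var)     # about 113.62
print(sample_var)  # about 123.09
```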
• Data:
67,72,88,82,80,90,95,60,77,89,99,85,77
• Large variance indicates data is spread out; small variance indicates data is close to the mean.
• For a sample (note n - 1):
• $s^2 = \frac{\sum (x - \bar{x})^2}{n - 1}$
• Mean: 81.62
• $(67 - 81.62)^2 = 213.74$
• $(72 - 81.62)^2 = 92.54$
• …
• $(77 - 81.62)^2 = 21.34$
• Sum of 213.74, 92.54, …, 21.34, divided by n - 1 = 12
• $s^2 = 123.09$
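• Standard deviation: simply the square root of the variance
• $\sigma = \sqrt{\frac{\sum_i (x_i - \mu)^2}{N}}$
• For the quiz scores (sample values): $s^2 = 123.09$, so $s = \sqrt{123.09} = 11.09$
• The standard deviation is useful in determining what is “normal”
• The mean of the scores was 81.62; most scores (9 of 13) are within 1 standard deviation (81.62 ± 11.09), and all scores are within 2 standard deviations (81.62 ± 22.18).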
[Figure: the mean with the 1𝜎 and 2𝜎 ranges marked]
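A short R sketch can check the "within 1 and 2 standard deviations" claim for the quiz scores. It uses `sd`, which returns the sample standard deviation (about 11.09 here); the helper name `count_within` is illustrative.

```r
scores <- c(67, 72, 88, 82, 80, 90, 95, 60, 77, 89, 99, 85, 77)

m <- mean(scores)  # about 81.62
s <- sd(scores)    # about 11.09

# Count the scores that fall within k standard deviations of the mean
count_within <- function(k) sum(abs(scores - m) <= k * s)

within_1 <- count_within(1)
within_2 <- count_within(2)

print(within_1)  # 9 of the 13 scores
print(within_2)  # all 13 scores
```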
• Covariance: measures how two variables (x, y) are linearly related. A positive value indicates that y tends to increase as x increases; a negative value indicates that y tends to decrease as x increases.
• Test Scores (x):
67,72,88,82,80,90,95,60,77,89,99,85,77
• Study Time (y):
30,45,80,85,75,85,120,30,45,75,85,110,40
• Is there a relationship between test scores and time spent studying?
• $s_{xy} = 269.42$ (sample covariance, dividing by n - 1)
• What if everyone studied for 30 minutes?
• $s_{xy} = 0$ (so no linear relation)
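Both covariance cases can be reproduced with R's `cov`, which computes the sample covariance (dividing by n - 1). The manual line spells out the definition; the variable names are illustrative.

```r
scores <- c(67, 72, 88, 82, 80, 90, 95, 60, 77, 89, 99, 85, 77)
study  <- c(30, 45, 80, 85, 75, 85, 120, 30, 45, 75, 85, 110, 40)

# Built-in sample covariance
cov_xy <- cov(scores, study)

# Same value from the definition:
# sum of products of deviations, divided by n - 1
manual_cov <- sum((scores - mean(scores)) * (study - mean(study))) /
  (length(scores) - 1)

# If everyone studied 30 minutes, y never deviates from its mean
cov_constant <- cov(scores, rep(30, length(scores)))

print(cov_xy)        # about 269.42
print(manual_cov)    # same value
print(cov_constant)  # 0
```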
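• Correlation coefficient: a normalized measurement of how two variables are linearly related.
• Sample: $r_{xy} = \frac{s_{xy}}{s_x s_y}$
• Population: $\rho_{xy} = \frac{\sigma_{xy}}{\sigma_x \sigma_y}$
• From the previous example: $r_{xy} = 0.83$ (the value always lies between -1 and 1; the closer it is to 1 or -1, the stronger the linear relationship)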
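The 0.83 figure can be checked in R, both with the built-in `cor` and by dividing the sample covariance by the two sample standard deviations as in the formula. Variable names are illustrative.

```r
scores <- c(67, 72, 88, 82, 80, 90, 95, 60, 77, 89, 99, 85, 77)
study  <- c(30, 45, 80, 85, 75, 85, 120, 30, 45, 75, 85, 110, 40)

# Built-in Pearson correlation coefficient
r_builtin <- cor(scores, study)

# Same value from the formula: covariance over the product of standard deviations
r_manual <- cov(scores, study) / (sd(scores) * sd(study))

print(round(r_builtin, 2))  # 0.83
print(round(r_manual, 2))   # 0.83
```

Unlike the covariance, this value has no units, so it does not change if study time is recorded in hours instead of minutes.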
• When the correlation coefficient is near 0, intuitively we would say there is no relation
• But be careful …
• Let’s say I have data for sales of ice vs. temp
> cor(temps, sales)
[1] -0.001413245
The correlation coefficient is nearly 0, so no relation, right?
The scatter plot clearly shows a strong relationship between sales and temps. Maybe when it’s too hot people just don’t want to leave the house.
Always visualize data!
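The original `temps` and `sales` data are not included in the slides, so the sketch below uses made-up values to show the same effect: sales that rise toward a peak and then fall are completely determined by temperature, yet their correlation coefficient is 0 because the relationship is not linear.

```r
# Hypothetical data, NOT the data behind the slide's output
temps <- 60:100                  # temperatures, evenly spread around 80
sales <- 400 - (temps - 80)^2    # sales peak at 80 and drop off on both sides

r <- cor(temps, sales)
print(r)  # 0: no linear relation, despite a perfect curved one

# plot(temps, sales) reveals the inverted-U shape immediately
```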
Different data sets can have identical simple statistical properties, so always visualize!
• Skewness: a measure of the “symmetry” of data. A negative value typically indicates the mean is less than the median (left skewed). A positive value typically indicates the mean is larger than the median (right skewed).
> skewness(scores)
[1] -0.302365
• Data:
60,62,62,62,65,65,65,75,82,96,99,100,100
> skewness(scores)
[1] 0.4652821
• This means a few students did really well and lifted the overall mean score.
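`skewness` is not part of base R, and the slides do not say which package supplied it. As a sketch, the definition below (third central moment divided by the cube of the sample standard deviation) reproduces both values shown. I believe this matches the default of `skewness` in the e1071 package, but that attribution is an assumption. The function name `skew` is illustrative.

```r
# Skewness as: mean cubed deviation / (sample standard deviation)^3
skew <- function(x) {
  m3 <- mean((x - mean(x))^3)
  m3 / sd(x)^3
}

quiz   <- c(67, 72, 88, 82, 80, 90, 95, 60, 77, 89, 99, 85, 77)
second <- c(60, 62, 62, 62, 65, 65, 65, 75, 82, 96, 99, 100, 100)

skew_quiz   <- skew(quiz)
skew_second <- skew(second)

print(skew_quiz)    # about -0.30 (left skewed)
print(skew_second)  # about  0.47 (right skewed)

# Mean versus median tells the same story for the second set
print(c(mean(second), median(second)))  # mean about 76.4, median 65
```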
• Normal distribution: mean, median, and mode are exactly at the center
• About 99.7% of all data lies within 3𝜎 of the mean
• Important for making inferences
• Test scores are often approximately normally distributed
• Human heights approximately follow a normal distribution
• Be careful not to apply normal distribution rules to non-normal data
• https://en.wikipedia.org/wiki/Normality_test
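The share of a normal distribution within k standard deviations of the mean can be computed with `pnorm`, the normal cumulative distribution function in R. The helper name `share_within` is illustrative.

```r
# Proportion of a normal distribution within k standard deviations of the mean
share_within <- function(k) pnorm(k) - pnorm(-k)

share_1 <- share_within(1)
share_2 <- share_within(2)
share_3 <- share_within(3)

print(round(c(share_1, share_2, share_3), 4))  # 0.6827 0.9545 0.9973
```

These are the figures behind the common "68-95-99.7" rule of thumb, and they only apply when the data really is close to normal.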
• A z-score indicates how far above or below the mean a given score in the distribution is
• Scenario: On which exam did Scott do better?
▪ Scott got a 65/100 on Exam1; $\mu$ is 60; $\sigma$ is 10
▪ Scott got a 42/200 on Exam2; $\mu$ is 37; $\sigma$ is 5
• First, we need to standardize the scores (Exam1 is out of 100; Exam2 is out of 200)
• This standardization is the z-score
• $z = \frac{\text{raw score} - \text{mean}}{\text{standard deviation}}$, or $z = \frac{X - \mu}{\sigma}$
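Applying the formula to Scott's two exams takes a few lines of R. The function name `z_score` is illustrative.

```r
# z = (raw score - mean) / standard deviation
z_score <- function(x, mu, sigma) (x - mu) / sigma

z_exam1 <- z_score(65, mu = 60, sigma = 10)
z_exam2 <- z_score(42, mu = 37, sigma = 5)

print(z_exam1)  # 0.5
print(z_exam2)  # 1
```

Scott was 5 points above the mean on both exams, but relative to the other students he did better on Exam2: one full standard deviation above the mean, against half a standard deviation on Exam1.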
• A z of -1.5 means the student scored 1.5 standard deviations below the mean
• In the case of test scores, positive numbers are good
• With z = -1.5, less than 10% scored worse (about 6.7% under a normal distribution)
• Which score marks the 97th percentile?
• What percentage of the population scored between score1 and score2 (say 75 and 90)?
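These questions can be answered with `pnorm` (probability below a value) and `qnorm` (value at a given probability). The slide does not state a mean or standard deviation for the last two questions, so this sketch assumes the quiz-score values used earlier (mean 81.62, standard deviation 11.09) and assumes the scores are normally distributed.

```r
# Share of a normal population scoring below z = -1.5
p_below <- pnorm(-1.5)
print(round(p_below, 4))  # 0.0668, i.e. less than 10%

# Assumed parameters (not given on this slide)
mu    <- 81.62
sigma <- 11.09

# z-score that marks the 97th percentile, and the matching raw score
z_97     <- qnorm(0.97)
score_97 <- qnorm(0.97, mean = mu, sd = sigma)
print(round(z_97, 2))  # 1.88
print(score_97)

# Share of the population scoring between 75 and 90
share_75_90 <- pnorm(90, mean = mu, sd = sigma) - pnorm(75, mean = mu, sd = sigma)
print(share_75_90)
```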
• Measurements of central tendency and variability are critical to the study of statistics
• Central tendency provides information about the “central” value of your set
• Variability provides information about the dispersion of data in your set
• Covariance provides information about how two variables are linearly related
• z-Scores are useful with a normal distribution
Statistics101: Numerical Measures
