Descriptive statistics -review(2)

•Download as PPT, PDF•

3 likes•1,323 views

This document discusses descriptive statistics for one variable. Descriptive statistics summarize and describe data through measures of central tendency (mean, median, mode), variability (variance, standard deviation), and relative standing (percentiles). The mean is the average value, the median is the middle value, and the mode is the most frequent value. Variance and standard deviation describe how spread out the data is. Percentiles indicate what percentage of values are below a given number. Examples are provided to demonstrate calculating and interpreting these common descriptive statistics.

Technology Economy & Finance

Statistics has two major chapters:
• Descriptive Statistics
• Inferential statistics

Statistics
Descriptive Statistics
• Gives numerical and
graphic procedures to
summarize a collection
of data in a clear and
understandable way
Inferential Statistics
• Provides procedures
to draw inferences
about a population
from a sample

Descriptive Measures
• Central Tendency measures. They are
computed to give a “center” around which the
measurements in the data are distributed.
• Variation or Variability measures. They
describe “data spread” or how far away the
measurements are from the center.
• Relative Standing measures. They describe
the relative position of specific measurements in the
data.

Measures of Central Tendency
• Mean:
Sum of all measurements divided by the number
of measurements.
• Median:
A number such that at most half of the
measurements are below it and at most half of the
measurements are above it.
• Mode:
The most frequent measurement in the data.

Example of Mean
Measurements Deviation
x x - mean
3 -1
5 1
5 1
1 -3
7 3
2 -2
6 2
7 3
0 -4
4 0
40 0
• MEAN = 40/10 = 4
• Notice that the sum of the
“deviations” is 0.
• Notice that every single
observation intervenes in
the computation of the
mean.

Example of Median
• Median: (4+5)/2 =
4.5
• Notice that only the
two central values are
used in the
computation.
• The median is not
sensible to extreme
values
Measurements Measurements
Ranked
x x
3 0
5 1
5 2
1 3
7 4
2 5
6 5
7 6
0 7
4 7
40 40

Example of Mode
Measurements
x
3
5
5
1
7
2
6
7
0
4
• In this case the data have
two modes:
• 5 and 7
• Both measurements are
repeated twice

Example of Mode
Measurements
x
3
5
1
1
4
7
3
8
3
• Mode: 3
• Notice that it is possible for a
data not to have any mode.

Variance (for a sample)
• Steps:
– Compute each deviation
– Square each deviation
– Sum all the squares
– Divide by the data size (sample size) minus
one: n-1

Example of Variance
Measurements Deviations Square of
deviations
x x - mean
3 -1 1
5 1 1
5 1 1
1 -3 9
7 3 9
2 -2 4
6 2 4
7 3 9
0 -4 16
4 0 0
40 0 54
• Variance = 54/9 = 6
• It is a measure of
“spread”.
• Notice that the larger
the deviations (positive
or negative) the larger
the variance

The standard deviation
• It is defines as the square root of the
variance
• In the previous example
• Variance = 6
• Standard deviation = Square root of the
variance = Square root of 6 = 2.45

Percentiles
• The p-the percentile is a number such that at most p%
of the measurements are below it and at most 100 – p
percent of the data are above it.
• Example, if in a certain data the 85th
percentile is 340
means that 15% of the measurements in the data are
above 340. It also means that 85% of the
measurements are below 340
• Notice that the median is the 50th
percentile

For any data
• At least 75% of the measurements differ from the mean
less than twice the standard deviation.
• At least 89% of the measurements differ from the mean
less than three times the standard deviation.
Note: This is a general property and it is called Tchebichev’s Rule: At
least 1-1/k2
of the observation falls within k standard deviations from the
mean. It is true for every dataset.

Example of Tchebichev’s Rule
Suppose that for a certain
data is :
• Mean = 20
• Standard deviation =3
Then:
• A least 75% of the
measurements are
between 14 and 26
• At least 89% of the
measurements are
between 11 and 29

Further Notes
• When the Mean is greater than the Median the
data distribution is skewed to the Right.
• When the Median is greater than the Mean the
data distribution is skewed to the Left.
• When Mean and Median are very close to each
other the data distribution is approximately
symmetric.

What's hot

Descriptive StatisticsBhagya Silva

Descriptive statisticsAttaullah Khan

What is Statisticssidra-098

Descriptive statisticsAnand Thokal

Basic Statistics & Data AnalysisAjendra Sharma

Descriptive statistics iiMohammad Ihmeidan

Descriptive statisticsAileen Balbido

Data Display and SummaryDrZahid Khan

Measures of VariabilityMary Krystle Dawn Sulleza

Descriptive StatisticsNeny Isharyanti

Four data types Data Scientist should knowRanjit Nambisan

Measures of central tendancy Pranav Krishna

VariabilityKaori Kubo Germano, PhD

Inferential statisticsDalia El-Shafei

Measures of VariationSamuel John Parreño

Descriptive statisticsBurak Mızrak

Statisticsitutor

Unit 1 - Statistics (Part 1).pptxMalla Reddy University

Introduction to StatisticsSaurav Shrestha

Central tendencyAndi Koentary

What's hot (20)

Descriptive Statistics

Descriptive statistics

What is Statistics

Descriptive statistics

Basic Statistics & Data Analysis

Descriptive statistics ii

Descriptive statistics

Data Display and Summary

Measures of Variability

Descriptive Statistics

Four data types Data Scientist should know

Measures of central tendancy

Variability

Inferential statistics

Measures of Variation

Descriptive statistics

Statistics

Unit 1 - Statistics (Part 1).pptx

Introduction to Statistics

Central tendency

Similar to Descriptive statistics -review(2)

Statr sessions 4 to 6Ruru Chowdhury

2. chapter ii(analyz)Chhom Karath

Chapter 3 Ken Black 2.pptNurinaSWGotami

determinatiion of University of Balochistan

Measure of Variability Report.pptxCalvinAdorDionisio

IV STATISTICS I.pdfPyaePhyoKoKo2

Ch5-quantitative-data analysis.pptxzerihunnana

State presentation2Lata Bhatta

Dscriptive statisticsDr. Anees Alyafei

Business statisticsRavi Prakash

Statistics for Medical studentsANUSWARUM

3. Statistical Analysis.pptxjeyanthisivakumar

Measures of dispersionMayuri Joshi

Statistics for machine learning shifa noorulainShifaNoorUlAin1

trs-9.pptssuserbbe0b6

Basic statisctis -Anandh ShankarAnandh Shankar Sundararajan

StatisticsManish Runthala

Biostatistics mean median mode unit 1.pptxSailajaReddyGunnam

Descriptive Analysis.pptxParveen Vashisth

Basic Statistical Descriptions of Data.pptxAnusuya123

Similar to Descriptive statistics -review(2) (20)

Statr sessions 4 to 6

2. chapter ii(analyz)

Chapter 3 Ken Black 2.ppt

determinatiion of

Measure of Variability Report.pptx

IV STATISTICS I.pdf

Ch5-quantitative-data analysis.pptx

State presentation2

Dscriptive statistics

Business statistics

Statistics for Medical students

3. Statistical Analysis.pptx

Measures of dispersion

Statistics for machine learning shifa noorulain

trs-9.ppt

Basic statisctis -Anandh Shankar

Statistics

Biostatistics mean median mode unit 1.pptx

Descriptive Analysis.pptx

Basic Statistical Descriptions of Data.pptx

Recently uploaded

Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...Neo4j

GenAI Risks & Security Meetup 01052024.pdflior mazor

Axa Assurance Maroc - Insurer Innovation Award 2024The Digital Insurer

Advantages of Hiring UIUX Design Service Providers for Your BusinessPixlogix Infotech

[2024]Digital Global Overview Report 2024 Meltwater.pdfhans926745

Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Drew Madelung

Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobeapidays

2024: Domino Containers - The Next Step. News from the Domino Container commu...Martijn de Jong

Driving Behavioral Change for Information Management through Data-Driven Gree...Enterprise Knowledge

Handwritten Text Recognition for manuscripts and early printed textsMaria Levchenko

A Year of the Servo Reboot: Where Are We Now?Igalia

Data Cloud, More than a CDP by Matt RobisonAnna Loughnan Colquhoun

TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc

presentation ICT roal in 21st century educationjfdjdjcjdnsjd

Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...apidays

Real Time Object Detection Using Open CVKhem

HTML Injection Attacks: Impact and Mitigation StrategiesBoston Institute of Analytics

Apidays New York 2024 - The value of a flexible API Management solution for O...apidays

Boost PC performance: How more available memory can improve productivityPrincipled Technologies

What Are The Drone Anti-jamming Systems Technology?Antenna Manufacturer Coco

Recently uploaded (20)

Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...

GenAI Risks & Security Meetup 01052024.pdf

Axa Assurance Maroc - Insurer Innovation Award 2024

Advantages of Hiring UIUX Design Service Providers for Your Business

[2024]Digital Global Overview Report 2024 Meltwater.pdf

Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...

Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe

2024: Domino Containers - The Next Step. News from the Domino Container commu...

Driving Behavioral Change for Information Management through Data-Driven Gree...

Handwritten Text Recognition for manuscripts and early printed texts

A Year of the Servo Reboot: Where Are We Now?

Data Cloud, More than a CDP by Matt Robison

TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments

presentation ICT roal in 21st century education

Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...

Real Time Object Detection Using Open CV

HTML Injection Attacks: Impact and Mitigation Strategies

Apidays New York 2024 - The value of a flexible API Management solution for O...

Boost PC performance: How more available memory can improve productivity

What Are The Drone Anti-jamming Systems Technology?

Descriptive statistics -review(2)

1. Descriptive Statistics for one variable

2. Statistics has two major chapters: • Descriptive Statistics • Inferential statistics

3. Statistics Descriptive Statistics • Gives numerical and graphic procedures to summarize a collection of data in a clear and understandable way Inferential Statistics • Provides procedures to draw inferences about a population from a sample

4. Descriptive Measures • Central Tendency measures. They are computed to give a “center” around which the measurements in the data are distributed. • Variation or Variability measures. They describe “data spread” or how far away the measurements are from the center. • Relative Standing measures. They describe the relative position of specific measurements in the data.

5. Measures of Central Tendency • Mean: Sum of all measurements divided by the number of measurements. • Median: A number such that at most half of the measurements are below it and at most half of the measurements are above it. • Mode: The most frequent measurement in the data.

6. Example of Mean Measurements Deviation x x - mean 3 -1 5 1 5 1 1 -3 7 3 2 -2 6 2 7 3 0 -4 4 0 40 0 • MEAN = 40/10 = 4 • Notice that the sum of the “deviations” is 0. • Notice that every single observation intervenes in the computation of the mean.

7. Example of Median • Median: (4+5)/2 = 4.5 • Notice that only the two central values are used in the computation. • The median is not sensible to extreme values Measurements Measurements Ranked x x 3 0 5 1 5 2 1 3 7 4 2 5 6 5 7 6 0 7 4 7 40 40

8. Example of Mode Measurements x 3 5 5 1 7 2 6 7 0 4 • In this case the data have two modes: • 5 and 7 • Both measurements are repeated twice

9. Example of Mode Measurements x 3 5 1 1 4 7 3 8 3 • Mode: 3 • Notice that it is possible for a data not to have any mode.

10. Variance (for a sample) • Steps: – Compute each deviation – Square each deviation – Sum all the squares – Divide by the data size (sample size) minus one: n-1

11. Example of Variance Measurements Deviations Square of deviations x x - mean 3 -1 1 5 1 1 5 1 1 1 -3 9 7 3 9 2 -2 4 6 2 4 7 3 9 0 -4 16 4 0 0 40 0 54 • Variance = 54/9 = 6 • It is a measure of “spread”. • Notice that the larger the deviations (positive or negative) the larger the variance

12. The standard deviation • It is defines as the square root of the variance • In the previous example • Variance = 6 • Standard deviation = Square root of the variance = Square root of 6 = 2.45

13. Percentiles • The p-the percentile is a number such that at most p% of the measurements are below it and at most 100 – p percent of the data are above it. • Example, if in a certain data the 85th percentile is 340 means that 15% of the measurements in the data are above 340. It also means that 85% of the measurements are below 340 • Notice that the median is the 50th percentile

14. For any data • At least 75% of the measurements differ from the mean less than twice the standard deviation. • At least 89% of the measurements differ from the mean less than three times the standard deviation. Note: This is a general property and it is called Tchebichev’s Rule: At least 1-1/k2 of the observation falls within k standard deviations from the mean. It is true for every dataset.

15. Example of Tchebichev’s Rule Suppose that for a certain data is : • Mean = 20 • Standard deviation =3 Then: • A least 75% of the measurements are between 14 and 26 • At least 89% of the measurements are between 11 and 29

16. Further Notes • When the Mean is greater than the Median the data distribution is skewed to the Right. • When the Median is greater than the Mean the data distribution is skewed to the Left. • When Mean and Median are very close to each other the data distribution is approximately symmetric.

Descriptive statistics -review(2)

Recommended

Recommended

More Related Content

What's hot

What's hot (20)

Similar to Descriptive statistics -review(2)

Similar to Descriptive statistics -review(2) (20)

More from Hanimarcelo slideshare

More from Hanimarcelo slideshare (14)

Recently uploaded

Recently uploaded (20)

Descriptive statistics -review(2)