Lecture #3: Making Data Sets into Tables and Graphs
Organizing Data
• Looking at data can be overwhelming
• There is a lot of raw (unsorted) data
• It’s important to organize the data for
summarization
• One way to summarize data is to use
tables and graphs
• To do this, we’ll need to consider the
qualities of the variable under
consideration
• Qualitative or Quantitative
• If Quantitative:
• What level of measurement?
• Discrete or continuous?
ZIP Codes
Recall that a ZIP Code is a
quantitative variable at the
nominal level of
measurement (almost like
it’s a categorical variable)
since there is no real order
to ZIP codes.
We’ll organize the ZIP codes
by reading down the list
and making a tally mark
next to the headers
(creating new headers as
we find new ZIP codes)
Class   Frequency (f)   Relative Frequency (f/n)   Tally
95458   2               1.96%                      ||
95490   9               8.82%                      |||| ||||
95469   3               2.94%                      |||
95482   62              60.78%                     |||| |||| |||| |||| |||| |||| |||| |||| |||| |||| |||| |||| ||
95461   1               0.98%                      |
95470   10              9.80%                      |||| ||||
95415   2               1.96%                      ||
95428   1               0.98%                      |
95485   1               0.98%                      |
95449   2               1.96%                      ||
95451   2               1.96%                      ||
95425   1               0.98%                      |
95454   1               0.98%                      |
95463   1               0.98%                      |
95437   2               1.96%                      ||
95453   2               1.96%                      ||
Total   n = Σf = 102    Σ(f/n) = 99.98% (not exactly 100% because each percentage is rounded)
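The tallying procedure described above (read down the list, add a mark under the matching header, create a new header when a new ZIP code appears) can be sketched in Python. This is only an illustration, not part of the original lecture: the function names `tally` and `relative_frequencies` are made up for this sketch, and the short `sample` list is an invented ordering used just to show the mechanics. The frequencies in `zip_frequencies` are copied from the table.

```python
def tally(values):
    """Count occurrences, creating a new header the first time a value appears."""
    counts = {}
    for value in values:
        if value not in counts:
            counts[value] = 0      # new header
        counts[value] += 1         # one tally mark
    return counts


def relative_frequencies(counts, digits=2):
    """Return each class's share of the total as a percentage, rounded."""
    n = sum(counts.values())
    return {cls: round(100 * f / n, digits) for cls, f in counts.items()}


# Invented short list, only to show how the tally grows.
sample = ["95458", "95490", "95482", "95482", "95458"]
print(tally(sample))

# Frequencies copied from the ZIP code table.
zip_frequencies = {
    "95458": 2, "95490": 9, "95469": 3, "95482": 62, "95461": 1, "95470": 10,
    "95415": 2, "95428": 1, "95485": 1, "95449": 2, "95451": 2, "95425": 1,
    "95454": 1, "95463": 1, "95437": 2, "95453": 2,
}
n = sum(zip_frequencies.values())
percentages = relative_frequencies(zip_frequencies)
print(n, percentages["95482"], round(sum(percentages.values()), 2))
```

Because every percentage is rounded to two decimal places before adding, the rounded column sums to 99.98% rather than exactly 100%.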
Women’s Heights
It is more helpful to look at the data in groups than to look at it just ‘as it is’
◦ This is especially true since the order here has meaning
◦ Also, we don’t want to look at each height separately (too many numbers)
◦ We’ll group the heights into a small enough number of groups that we can see any patterns that exist
◦ We’ll do this by making a Grouped Frequency Distribution Table
How will we group them?
◦ Pick two numbers
◦ Lower Limit of the First Class and Class Width
◦ If we all use the same two numbers, and use them correctly, we’ll get identical tables.
Lower Limit of the First Class
◦ This is the smallest number we’re going to tally
◦ It must be either the height of the shortest person, or an even smaller height
Class Width
◦ This is how many separate heights are in each class
◦ It is also the difference between the lower limit of successive classes
Let’s use 57 as the lower limit of the first class and 2 as the class width
This gives us class limits of 57-58, 59-60, 61-62, 63-64, 65-66, 67-68, 69-70, 71-72 (we can stop
here because nobody was taller than 72 inches)
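As a rough sketch of how the two chosen numbers determine every class, the following Python builds the class limits from the lower limit of the first class and the class width. The function name `class_limits` is invented for this illustration, and it assumes whole-number data (heights rounded to whole inches), so each class of width 2 holds two separate heights.

```python
def class_limits(first_lower_limit, class_width, largest_value):
    """Build (lower, upper) class limits for whole-number data."""
    limits = []
    lower = first_lower_limit
    while lower <= largest_value:
        upper = lower + class_width - 1   # each class holds `class_width` whole values
        limits.append((lower, upper))
        lower += class_width              # successive lower limits differ by the class width
    return limits


limits = class_limits(57, 2, 72)
print(limits)
```

With 57, 2, and a tallest height of 72, this gives the same eight classes listed above, from 57-58 through 71-72.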
Women’s Heights
Before we tally the heights up, let’s address an issue that comes up when the variable is
continuous (as height is)
What if we were rounding to something finer than whole inches?
What if a person is actually 63.6 inches tall?
We don’t really want gaps between our classes, so we use something called Class Boundaries
Class Boundaries
◦ Split the difference between the upper class limit of one class and the lower class limit of the next higher class
◦ The boundary between the first two classes is 58.5, halfway between 58 and 59; it is both the upper class boundary of the first class and the lower class boundary of the second class.
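The "split the difference" rule can be sketched as follows. The names `class_boundaries` and `find_class` are hypothetical, and one choice here is an assumption the lecture does not address: a value that lands exactly on a boundary is placed in the higher class.

```python
def class_boundaries(limits):
    """Turn (lower, upper) class limits into (lower, upper) class boundaries."""
    # Gap between one class's upper limit and the next class's lower limit
    # (1 inch for whole-inch limits); assumes at least two classes.
    gap = limits[1][0] - limits[0][1]
    half = gap / 2
    return [(lower - half, upper + half) for lower, upper in limits]


def find_class(value, boundaries):
    """Return the index of the class whose boundaries contain `value`."""
    for index, (low, high) in enumerate(boundaries):
        # Assumption: a value exactly on a boundary goes to the higher class.
        if low <= value < high:
            return index
    return None


limits = [(57, 58), (59, 60), (61, 62), (63, 64),
          (65, 66), (67, 68), (69, 70), (71, 72)]
boundaries = class_boundaries(limits)
print(boundaries[0], boundaries[1])
print(limits[find_class(63.6, boundaries)])
```

With boundaries in place there are no gaps, so a person who is 63.6 inches tall falls between 62.5 and 64.5 and is tallied in the 63-64 class.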
Women’s Heights
Class Limits   Class Boundaries   Frequency (f)   Relative Frequency (f/n)   Tally
57-58          56.5-58.5          1               0.02                       |
59-60          58.5-60.5          7               0.12                       |||| ||
61-62          60.5-62.5          12              0.20                       |||| |||| ||
63-64          62.5-64.5          14              0.23                       |||| |||| ||||
65-66          64.5-66.5          15              0.25                       |||| |||| ||||
67-68          66.5-68.5          5               0.08                       ||||
69-70          68.5-70.5          4               0.07                       ||||
71-72          70.5-72.5          2               0.03                       ||
Total                             n = Σf = 60     Σ(f/n) = 1
Graphs
Graphs are a great way to demonstrate data
◦ They help us to look for patterns
◦ There are different ways of displaying the data
◦ Today, we’ll consider the following:
◦ Pareto chart
◦ Pie chart
◦ Histogram
◦ Scatterplot
Pareto Chart
If you are dealing with categorical data
or quantitative data at the nominal
level of measurement, the Pareto chart
gives a very good picture
◦ A Pareto chart is a bar graph whose bars:
◦ Do not touch (usually)
◦ Are arranged from the class of the largest
frequency to the smallest frequency
◦ Can be arranged vertically or horizontally
◦ Relative frequencies are often used, but this example uses the plain frequencies of the Men’s ZIP Codes
[Figure: Pareto chart of Men's ZIP Codes. Vertical axis: Frequency (0 to 25). Bars in descending order: 95482, 95490, 95470, 95458, 95469, 95461, 95415, 95428, 95485, 95453]
Pareto Chart
Looking at this image, what jumps out
at you?
◦ Can you see why this is a good choice to
present the data?
◦ It is obvious which is the most common
ZIP code
◦ You may note that there are two
different ZIP codes that have 4 men
living in them, and two with 2 men, and
5 with 1 man
◦ When categories are tied, their order doesn’t matter, so long as the bars stay in descending order of frequency
[Figure: the same Pareto chart of Men's ZIP Codes]
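The ordering rule for a Pareto chart can be sketched in a few lines of Python. The frequencies are the men's ZIP code counts used in the lecture (25 for 95482, two classes of 4, two of 2, five of 1); the function name `pareto_order` and the text bars made of `#` characters are just conveniences for this illustration, not a real chart.

```python
# Frequencies of the men's ZIP codes, as used in the lecture.
mens_zip_frequencies = {
    "95482": 25, "95490": 4, "95470": 4, "95458": 2, "95469": 2,
    "95461": 1, "95415": 1, "95428": 1, "95485": 1, "95453": 1,
}


def pareto_order(frequencies):
    """Sort classes from largest to smallest frequency.

    sorted() is stable, so tied classes keep the order they were given in.
    """
    return sorted(frequencies.items(), key=lambda item: item[1], reverse=True)


ordered = pareto_order(mens_zip_frequencies)
for zip_code, f in ordered:
    print(f"{zip_code} {'#' * f} {f}")
```

Tied classes could be swapped without breaking the rule; the only requirement is that the frequencies never increase from one bar to the next.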
Pie chart
Another way to present qualitative and quantitative variables at the nominal level of
measurement is with the Pie chart
◦ A Pie chart is a graph which represents 100% of the data being looked at
◦ You “slice” the “pie” so that the size of each piece indicates the size of the frequencies of the different
categories
◦ We determine the size of the piece by what is called the central angle
◦ The central angle is the angle made by the two edges of the slice (assuming you started from the exact
center of the pie)
[Figure: a pie slice with its central angle labeled]
Pie chart
We measure angles in degrees, and there are 360 of them
around the center of the pie
We need to determine how many degrees to make the central
angle for the slice representing each class
◦ Central angle = (f / n) × 360°
◦ This splits up the central angles in exact proportion to the frequencies of the classes.
◦ 95482: (25 / 42) × 360° ≈ 214°
◦ 95490, 95470: (4 / 42) × 360° ≈ 34° each
◦ 95458, 95469: (2 / 42) × 360° ≈ 17° each
◦ 95461, 95415, 95428, 95485, 95453: (1 / 42) × 360° ≈ 9° each
◦ These rounded angles total 361°, one degree more than 360°, because each angle was rounded to the nearest degree
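The central-angle calculation can be checked with a short Python sketch. The frequencies are the men's ZIP code counts from the lecture; the function name `central_angle` is made up for this illustration.

```python
# Frequencies of the men's ZIP codes, as used in the lecture (n = 42).
mens_zip_frequencies = {
    "95482": 25, "95490": 4, "95470": 4, "95458": 2, "95469": 2,
    "95461": 1, "95415": 1, "95428": 1, "95485": 1, "95453": 1,
}
n = sum(mens_zip_frequencies.values())


def central_angle(f, n):
    """Central angle in degrees for a class with frequency f out of n."""
    return f / n * 360


angles = {z: central_angle(f, n) for z, f in mens_zip_frequencies.items()}
rounded = {z: round(a) for z, a in angles.items()}

print(rounded)
print(sum(rounded.values()), round(sum(angles.values()), 6))
```

The unrounded angles add up to 360°, while the angles rounded to the nearest degree add up to 361°, which is why the slices cannot all be drawn at exactly their rounded sizes.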
[Figure: pie chart of Men's ZIP Codes, one slice per ZIP code: 95482, 95490, 95470, 95458, 95469, 95461, 95415, 95428, 95485, 95453]
Histograms
Neither the Pareto chart nor the Pie chart
is suitable for variables at the interval and
ratio levels of measurement
The values of these variables have a meaningful order, so the bars can’t be rearranged the way they are in a Pareto chart
The best way to convey the data from these
types of variables is with a Histogram
A Histogram is a bar graph in which the bars touch
◦ Thus we use the class boundaries when we mark off the scale on the horizontal axis
As with the Pareto chart, the vertical axis shows the frequencies
◦ Be sure the frequency scale goes high enough to accommodate the class with the greatest frequency, but not too much higher
One advantage of a histogram is that it can
readily display large data sets
The histogram can give you the shape of the
data, the center, and the spread of the data
Here is an example of a previous class’s women’s heights divided into 8 classes
You’ll note the ‘break’ in the horizontal axis; it shows that the axis has been interrupted
◦ Showing the break is proper practice, though it is not always done
You’ll also note that once you have shown that the horizontal axis is not perfectly to scale, you only need to choose where to start (where to place the 56.5); everything else is decided from there!
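To see why class boundaries make the bars touch, here is a rough text rendering of the women's heights histogram in Python. The boundaries and frequencies are copied from the women's heights table; the function name `text_histogram` is invented, and the bars are drawn sideways with `#` characters, so this is only a stand-in for the vertical bars of a real histogram.

```python
# Class boundaries and frequencies copied from the women's heights table.
boundaries = [56.5, 58.5, 60.5, 62.5, 64.5, 66.5, 68.5, 70.5, 72.5]
frequencies = [1, 7, 12, 14, 15, 5, 4, 2]


def text_histogram(boundaries, frequencies):
    """Return one line per class: its two boundaries and a bar of that length."""
    lines = []
    for i, f in enumerate(frequencies):
        # Adjacent classes share an edge, which is why the bars touch.
        low, high = boundaries[i], boundaries[i + 1]
        lines.append(f"{low:>4}-{high:<4} {'#' * f} {f}")
    return lines


lines = text_histogram(boundaries, frequencies)
for line in lines:
    print(line)
```

Eight classes need nine boundaries, because each interior boundary is the upper edge of one bar and the lower edge of the next.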
Bivariate Data (Two variables)
Sometimes we want to look at two
variables at once
Bivariate – Two Variables
Suppose we want to study the connection between people’s ages and the number of pets they have
◦ Here, the ordered pair is (age, # of pets)
◦ (19, 2), (23, 2), (18, 4), (18, 2), (28, 0), (19, 3), (37, 1), (20, 0), (34, 0), (40, 1), (18, 27), (19, 0), (18, 2), (18, 1), (18, 4), (20, 1), (19, 3), (26, 2), (23, 2), (29, 1), (23, 0), (19, 5), (19, 10), (29, 0), (19, 2), (19, 0)
Plotting these ordered pairs as points gives what is called a Scatter Plot
Each point on a Scatter Plot gives us two pieces of data about a single member of the sample, one datum for each variable
◦ Are there any data points that seem odd?
◦ It’s easy to see that (18, 27) is an outlier; are there others?
◦ Let’s take this one out and see what things look like now…
[Figure: scatter plot of age (horizontal axis, 15 to 45) against number of pets (vertical axis, 0 to 30)]
Bivariate Data (Two variables)
With the outlier removed, it is a little easier to see what kind of variability is going on
Was this ‘OK’ to do?
◦ Outliers happen, sometimes from mistakes, sometimes simply because such values really do exist
◦ When you remove an outlier to look at the data, you should note that you have done so
[Figure: the same scatter plot with the outlier (18, 27) removed; the vertical axis now runs from 0 to 12]
The horizontal variable is the x-variable, and
it’s sometimes called the independent
variable.
The vertical variable is the y-variable, and it’s
sometimes called the dependent variable
Note: This terminology is not meant to imply
that the one causes the other
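The effect of taking out the outlier can be sketched with the lecture's own (age, # of pets) pairs. Note that the outlier here is simply the point the lecture picked out by eye, not the result of any formal rule; the helper name `y_range` is invented for this illustration.

```python
# (age, number of pets) pairs copied from the lecture.
pairs = [
    (19, 2), (23, 2), (18, 4), (18, 2), (28, 0), (19, 3), (37, 1),
    (20, 0), (34, 0), (40, 1), (18, 27), (19, 0), (18, 2), (18, 1),
    (18, 4), (20, 1), (19, 3), (26, 2), (23, 2), (29, 1), (23, 0),
    (19, 5), (19, 10), (29, 0), (19, 2), (19, 0),
]

outlier = (18, 27)   # identified by eye in the lecture, not by a formal rule
without_outlier = [p for p in pairs if p != outlier]


def y_range(points):
    """Smallest and largest value of the vertical (y) variable."""
    ys = [y for _, y in points]
    return min(ys), max(ys)


print(len(pairs), y_range(pairs))
print(len(without_outlier), y_range(without_outlier))
```

Removing the single point (18, 27) lowers the largest number of pets from 27 to 10, so the vertical scale can be much tighter and the remaining points spread out enough to compare.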
Activity: Making a grouped frequency
distribution table
Construct a grouped frequency distribution table for the heights of the men in the Class Data
Base, using 60 as the lower limit of the first class and 3 inches as the class width. Have columns
for the Class Limits, the Class Boundaries, the Tally, the Frequency, and the Relative Frequency to
the nearest hundredth.

More Related Content

What's hot

Variance and standard deviation
Variance and standard deviationVariance and standard deviation
Variance and standard deviationAmrit Swaroop
 
Skewness and Kurtosis
Skewness and KurtosisSkewness and Kurtosis
Skewness and KurtosisRohan Nagpal
 
UNIT III -Measures of Dispersion (2) (1).ppt
UNIT III -Measures of Dispersion (2) (1).pptUNIT III -Measures of Dispersion (2) (1).ppt
UNIT III -Measures of Dispersion (2) (1).pptMalihAz2
 
Basic Descriptive Statistics
Basic Descriptive StatisticsBasic Descriptive Statistics
Basic Descriptive Statisticssikojp
 
Graphical Representation of data
Graphical Representation of dataGraphical Representation of data
Graphical Representation of dataJijo K Mathew
 
Displaying Distributions with Graphs
Displaying Distributions with GraphsDisplaying Distributions with Graphs
Displaying Distributions with Graphsnszakir
 
IB Chemistry on Uncertainty, Error Analysis, Random and Systematic Error
IB Chemistry on Uncertainty, Error Analysis, Random and Systematic ErrorIB Chemistry on Uncertainty, Error Analysis, Random and Systematic Error
IB Chemistry on Uncertainty, Error Analysis, Random and Systematic ErrorLawrence kok
 
Area maths sola (mission impossible)
Area   maths sola (mission impossible)Area   maths sola (mission impossible)
Area maths sola (mission impossible)Dan_TS
 
Ppt accuracy precisionsigfigs 2014 fridays notes
Ppt accuracy precisionsigfigs 2014 fridays notesPpt accuracy precisionsigfigs 2014 fridays notes
Ppt accuracy precisionsigfigs 2014 fridays notesmantlfin
 
IB Chemistry on Uncertainty, Error Analysis, Random and Systematic Error
IB Chemistry on Uncertainty, Error Analysis, Random and Systematic ErrorIB Chemistry on Uncertainty, Error Analysis, Random and Systematic Error
IB Chemistry on Uncertainty, Error Analysis, Random and Systematic ErrorLawrence kok
 
IB Chemistry on Uncertainty calculation and significant figures
IB Chemistry on Uncertainty calculation and significant figuresIB Chemistry on Uncertainty calculation and significant figures
IB Chemistry on Uncertainty calculation and significant figuresLawrence kok
 
Real No+Significant Figures
Real No+Significant FiguresReal No+Significant Figures
Real No+Significant FiguresAwais Khan
 
Quantity notes pdf
Quantity notes pdfQuantity notes pdf
Quantity notes pdfSaqib Imran
 

What's hot (19)

Descriptive Statistics
Descriptive StatisticsDescriptive Statistics
Descriptive Statistics
 
Circle Graphs
Circle GraphsCircle Graphs
Circle Graphs
 
Variance and standard deviation
Variance and standard deviationVariance and standard deviation
Variance and standard deviation
 
Skewness and Kurtosis
Skewness and KurtosisSkewness and Kurtosis
Skewness and Kurtosis
 
UNIT III -Measures of Dispersion (2) (1).ppt
UNIT III -Measures of Dispersion (2) (1).pptUNIT III -Measures of Dispersion (2) (1).ppt
UNIT III -Measures of Dispersion (2) (1).ppt
 
Line Graph
Line GraphLine Graph
Line Graph
 
Statr sessions 4 to 6
Statr sessions 4 to 6Statr sessions 4 to 6
Statr sessions 4 to 6
 
Basic Descriptive Statistics
Basic Descriptive StatisticsBasic Descriptive Statistics
Basic Descriptive Statistics
 
Graphical Representation of data
Graphical Representation of dataGraphical Representation of data
Graphical Representation of data
 
Displaying Distributions with Graphs
Displaying Distributions with GraphsDisplaying Distributions with Graphs
Displaying Distributions with Graphs
 
IB Chemistry on Uncertainty, Error Analysis, Random and Systematic Error
IB Chemistry on Uncertainty, Error Analysis, Random and Systematic ErrorIB Chemistry on Uncertainty, Error Analysis, Random and Systematic Error
IB Chemistry on Uncertainty, Error Analysis, Random and Systematic Error
 
Area maths sola (mission impossible)
Area   maths sola (mission impossible)Area   maths sola (mission impossible)
Area maths sola (mission impossible)
 
Ppt accuracy precisionsigfigs 2014 fridays notes
Ppt accuracy precisionsigfigs 2014 fridays notesPpt accuracy precisionsigfigs 2014 fridays notes
Ppt accuracy precisionsigfigs 2014 fridays notes
 
IB Chemistry on Uncertainty, Error Analysis, Random and Systematic Error
IB Chemistry on Uncertainty, Error Analysis, Random and Systematic ErrorIB Chemistry on Uncertainty, Error Analysis, Random and Systematic Error
IB Chemistry on Uncertainty, Error Analysis, Random and Systematic Error
 
IB Chemistry on Uncertainty calculation and significant figures
IB Chemistry on Uncertainty calculation and significant figuresIB Chemistry on Uncertainty calculation and significant figures
IB Chemistry on Uncertainty calculation and significant figures
 
Real No+Significant Figures
Real No+Significant FiguresReal No+Significant Figures
Real No+Significant Figures
 
Descriptive statistics -review(2)
Descriptive statistics -review(2)Descriptive statistics -review(2)
Descriptive statistics -review(2)
 
Diagrams
DiagramsDiagrams
Diagrams
 
Quantity notes pdf
Quantity notes pdfQuantity notes pdf
Quantity notes pdf
 

Similar to Lecture 3 making data sets into tables and graphs

Business Statistics Chapter 2
Business Statistics Chapter 2Business Statistics Chapter 2
Business Statistics Chapter 2Lux PP
 
Introduction To Data Science Using R
Introduction To Data Science Using RIntroduction To Data Science Using R
Introduction To Data Science Using RANURAG SINGH
 
Intro to data science
Intro to data scienceIntro to data science
Intro to data scienceANURAG SINGH
 
Lect 3 background mathematics
Lect 3 background mathematicsLect 3 background mathematics
Lect 3 background mathematicshktripathy
 
Penggambaran Data Secara Numerik
Penggambaran Data Secara NumerikPenggambaran Data Secara Numerik
Penggambaran Data Secara Numerikanom1392
 
Graphs, charts, and tables ppt @ bec doms
Graphs, charts, and tables ppt @ bec domsGraphs, charts, and tables ppt @ bec doms
Graphs, charts, and tables ppt @ bec domsBabasab Patil
 
2.3 Graphs that enlighten and graphs that deceive
2.3 Graphs that enlighten and graphs that deceive2.3 Graphs that enlighten and graphs that deceive
2.3 Graphs that enlighten and graphs that deceiveLong Beach City College
 
Lect 3 background mathematics for Data Mining
Lect 3 background mathematics for Data MiningLect 3 background mathematics for Data Mining
Lect 3 background mathematics for Data Mininghktripathy
 
Types of graphs
Types of graphsTypes of graphs
Types of graphsLALIT BIST
 
2 biostatistics presenting data
2  biostatistics presenting data2  biostatistics presenting data
2 biostatistics presenting dataDr. Nazar Jaf
 
Exploratory Data Analysis week 4
Exploratory Data Analysis week 4Exploratory Data Analysis week 4
Exploratory Data Analysis week 4Manzur Ashraf
 
graphic representations in statistics
 graphic representations in statistics graphic representations in statistics
graphic representations in statisticsUnsa Shakir
 
Statistics (Measures of Dispersion)
Statistics (Measures of Dispersion)Statistics (Measures of Dispersion)
Statistics (Measures of Dispersion)Ron_Eick
 
measure of variability (windri). In research include example
measure of variability (windri). In research include examplemeasure of variability (windri). In research include example
measure of variability (windri). In research include examplewindri3
 

Similar to Lecture 3 making data sets into tables and graphs (20)

Business Statistics Chapter 2
Business Statistics Chapter 2Business Statistics Chapter 2
Business Statistics Chapter 2
 
Chapter3
Chapter3Chapter3
Chapter3
 
Stats chapter 1
Stats chapter 1Stats chapter 1
Stats chapter 1
 
Introduction To Data Science Using R
Introduction To Data Science Using RIntroduction To Data Science Using R
Introduction To Data Science Using R
 
Intro to data science
Intro to data scienceIntro to data science
Intro to data science
 
Intro to Statistics.pptx
Intro to Statistics.pptxIntro to Statistics.pptx
Intro to Statistics.pptx
 
Lect 3 background mathematics
Lect 3 background mathematicsLect 3 background mathematics
Lect 3 background mathematics
 
Penggambaran Data Secara Numerik
Penggambaran Data Secara NumerikPenggambaran Data Secara Numerik
Penggambaran Data Secara Numerik
 
Graphs, charts, and tables ppt @ bec doms
Graphs, charts, and tables ppt @ bec domsGraphs, charts, and tables ppt @ bec doms
Graphs, charts, and tables ppt @ bec doms
 
2.3 Graphs that enlighten and graphs that deceive
2.3 Graphs that enlighten and graphs that deceive2.3 Graphs that enlighten and graphs that deceive
2.3 Graphs that enlighten and graphs that deceive
 
Eda sri
Eda sriEda sri
Eda sri
 
Lect 3 background mathematics for Data Mining
Lect 3 background mathematics for Data MiningLect 3 background mathematics for Data Mining
Lect 3 background mathematics for Data Mining
 
Types of graphs
Types of graphsTypes of graphs
Types of graphs
 
Statistics for ess
Statistics for essStatistics for ess
Statistics for ess
 
2 biostatistics presenting data
2  biostatistics presenting data2  biostatistics presenting data
2 biostatistics presenting data
 
Exploratory Data Analysis week 4
Exploratory Data Analysis week 4Exploratory Data Analysis week 4
Exploratory Data Analysis week 4
 
Statistics
StatisticsStatistics
Statistics
 
graphic representations in statistics
 graphic representations in statistics graphic representations in statistics
graphic representations in statistics
 
Statistics (Measures of Dispersion)
Statistics (Measures of Dispersion)Statistics (Measures of Dispersion)
Statistics (Measures of Dispersion)
 
measure of variability (windri). In research include example
measure of variability (windri). In research include examplemeasure of variability (windri). In research include example
measure of variability (windri). In research include example
 

More from Jason Edington

1 10 everyday reasons why statistics are important
1   10 everyday reasons why statistics are important1   10 everyday reasons why statistics are important
1 10 everyday reasons why statistics are importantJason Edington
 
2 lecture 1 course introduction
2   lecture 1 course introduction2   lecture 1 course introduction
2 lecture 1 course introductionJason Edington
 
1 welcome and ice breakers
1   welcome and ice breakers1   welcome and ice breakers
1 welcome and ice breakersJason Edington
 
Lecture 2 What is Statistics, Anyway
Lecture 2 What is Statistics, AnywayLecture 2 What is Statistics, Anyway
Lecture 2 What is Statistics, AnywayJason Edington
 
Advice to students for better learning, studying
Advice to students for better learning, studyingAdvice to students for better learning, studying
Advice to students for better learning, studyingJason Edington
 
Lecture 1 Course Introduction
Lecture 1 Course IntroductionLecture 1 Course Introduction
Lecture 1 Course IntroductionJason Edington
 
10 everyday reasons why statistics are important
10 everyday reasons why statistics are important10 everyday reasons why statistics are important
10 everyday reasons why statistics are importantJason Edington
 
Statistics ice breakers and orientation
Statistics ice breakers and orientationStatistics ice breakers and orientation
Statistics ice breakers and orientationJason Edington
 
New slideshare for online classes - Canvas
New slideshare for online classes - CanvasNew slideshare for online classes - Canvas
New slideshare for online classes - CanvasJason Edington
 

More from Jason Edington (9)

1 10 everyday reasons why statistics are important
1   10 everyday reasons why statistics are important1   10 everyday reasons why statistics are important
1 10 everyday reasons why statistics are important
 
2 lecture 1 course introduction
2   lecture 1 course introduction2   lecture 1 course introduction
2 lecture 1 course introduction
 
1 welcome and ice breakers
1   welcome and ice breakers1   welcome and ice breakers
1 welcome and ice breakers
 
Lecture 2 What is Statistics, Anyway
Lecture 2 What is Statistics, AnywayLecture 2 What is Statistics, Anyway
Lecture 2 What is Statistics, Anyway
 
Advice to students for better learning, studying
Advice to students for better learning, studyingAdvice to students for better learning, studying
Advice to students for better learning, studying
 
Lecture 1 Course Introduction
Lecture 1 Course IntroductionLecture 1 Course Introduction
Lecture 1 Course Introduction
 
10 everyday reasons why statistics are important
10 everyday reasons why statistics are important10 everyday reasons why statistics are important
10 everyday reasons why statistics are important
 
Statistics ice breakers and orientation
Statistics ice breakers and orientationStatistics ice breakers and orientation
Statistics ice breakers and orientation
 
New slideshare for online classes - Canvas
New slideshare for online classes - CanvasNew slideshare for online classes - Canvas
New slideshare for online classes - Canvas
 

Recently uploaded

A Critique of the Proposed National Education Policy Reform
A Critique of the Proposed National Education Policy ReformA Critique of the Proposed National Education Policy Reform
A Critique of the Proposed National Education Policy ReformChameera Dedduwage
 
SOCIAL AND HISTORICAL CONTEXT - LFTVD.pptx
SOCIAL AND HISTORICAL CONTEXT - LFTVD.pptxSOCIAL AND HISTORICAL CONTEXT - LFTVD.pptx
SOCIAL AND HISTORICAL CONTEXT - LFTVD.pptxiammrhaywood
 
Alper Gobel In Media Res Media Component
Alper Gobel In Media Res Media ComponentAlper Gobel In Media Res Media Component
Alper Gobel In Media Res Media ComponentInMediaRes1
 
Concept of Vouching. B.Com(Hons) /B.Compdf
Concept of Vouching. B.Com(Hons) /B.CompdfConcept of Vouching. B.Com(Hons) /B.Compdf
Concept of Vouching. B.Com(Hons) /B.CompdfUmakantAnnand
 
ECONOMIC CONTEXT - LONG FORM TV DRAMA - PPT
ECONOMIC CONTEXT - LONG FORM TV DRAMA - PPTECONOMIC CONTEXT - LONG FORM TV DRAMA - PPT
ECONOMIC CONTEXT - LONG FORM TV DRAMA - PPTiammrhaywood
 
Paris 2024 Olympic Geographies - an activity
Paris 2024 Olympic Geographies - an activityParis 2024 Olympic Geographies - an activity
Paris 2024 Olympic Geographies - an activityGeoBlogs
 
BASLIQ CURRENT LOOKBOOK LOOKBOOK(1) (1).pdf
BASLIQ CURRENT LOOKBOOK  LOOKBOOK(1) (1).pdfBASLIQ CURRENT LOOKBOOK  LOOKBOOK(1) (1).pdf
BASLIQ CURRENT LOOKBOOK LOOKBOOK(1) (1).pdfSoniaTolstoy
 
Organic Name Reactions for the students and aspirants of Chemistry12th.pptx
Organic Name Reactions  for the students and aspirants of Chemistry12th.pptxOrganic Name Reactions  for the students and aspirants of Chemistry12th.pptx
Organic Name Reactions for the students and aspirants of Chemistry12th.pptxVS Mahajan Coaching Centre
 
Interactive Powerpoint_How to Master effective communication
Interactive Powerpoint_How to Master effective communicationInteractive Powerpoint_How to Master effective communication
Interactive Powerpoint_How to Master effective communicationnomboosow
 
Kisan Call Centre - To harness potential of ICT in Agriculture by answer farm...
Kisan Call Centre - To harness potential of ICT in Agriculture by answer farm...Kisan Call Centre - To harness potential of ICT in Agriculture by answer farm...
Kisan Call Centre - To harness potential of ICT in Agriculture by answer farm...Krashi Coaching
 
18-04-UA_REPORT_MEDIALITERAСY_INDEX-DM_23-1-final-eng.pdf
18-04-UA_REPORT_MEDIALITERAСY_INDEX-DM_23-1-final-eng.pdf18-04-UA_REPORT_MEDIALITERAСY_INDEX-DM_23-1-final-eng.pdf
18-04-UA_REPORT_MEDIALITERAСY_INDEX-DM_23-1-final-eng.pdfssuser54595a
 
Class 11 Legal Studies Ch-1 Concept of State .pdf
Class 11 Legal Studies Ch-1 Concept of State .pdfClass 11 Legal Studies Ch-1 Concept of State .pdf
Class 11 Legal Studies Ch-1 Concept of State .pdfakmcokerachita
 
Science 7 - LAND and SEA BREEZE and its Characteristics
Science 7 - LAND and SEA BREEZE and its CharacteristicsScience 7 - LAND and SEA BREEZE and its Characteristics
Science 7 - LAND and SEA BREEZE and its CharacteristicsKarinaGenton
 
microwave assisted reaction. General introduction
microwave assisted reaction. General introductionmicrowave assisted reaction. General introduction
microwave assisted reaction. General introductionMaksud Ahmed
 
Presiding Officer Training module 2024 lok sabha elections
Presiding Officer Training module 2024 lok sabha electionsPresiding Officer Training module 2024 lok sabha elections
Presiding Officer Training module 2024 lok sabha electionsanshu789521
 
How to Configure Email Server in Odoo 17
How to Configure Email Server in Odoo 17How to Configure Email Server in Odoo 17
How to Configure Email Server in Odoo 17Celine George
 
URLs and Routing in the Odoo 17 Website App
URLs and Routing in the Odoo 17 Website AppURLs and Routing in the Odoo 17 Website App
URLs and Routing in the Odoo 17 Website AppCeline George
 

Recently uploaded (20)

A Critique of the Proposed National Education Policy Reform
A Critique of the Proposed National Education Policy ReformA Critique of the Proposed National Education Policy Reform
A Critique of the Proposed National Education Policy Reform
 
SOCIAL AND HISTORICAL CONTEXT - LFTVD.pptx
SOCIAL AND HISTORICAL CONTEXT - LFTVD.pptxSOCIAL AND HISTORICAL CONTEXT - LFTVD.pptx
SOCIAL AND HISTORICAL CONTEXT - LFTVD.pptx
 
Alper Gobel In Media Res Media Component
Alper Gobel In Media Res Media ComponentAlper Gobel In Media Res Media Component
Alper Gobel In Media Res Media Component
 
Concept of Vouching. B.Com(Hons) /B.Compdf
Concept of Vouching. B.Com(Hons) /B.CompdfConcept of Vouching. B.Com(Hons) /B.Compdf
Concept of Vouching. B.Com(Hons) /B.Compdf
 
ECONOMIC CONTEXT - LONG FORM TV DRAMA - PPT
ECONOMIC CONTEXT - LONG FORM TV DRAMA - PPTECONOMIC CONTEXT - LONG FORM TV DRAMA - PPT
ECONOMIC CONTEXT - LONG FORM TV DRAMA - PPT
 
Paris 2024 Olympic Geographies - an activity
Paris 2024 Olympic Geographies - an activityParis 2024 Olympic Geographies - an activity
Paris 2024 Olympic Geographies - an activity
 
BASLIQ CURRENT LOOKBOOK LOOKBOOK(1) (1).pdf
BASLIQ CURRENT LOOKBOOK  LOOKBOOK(1) (1).pdfBASLIQ CURRENT LOOKBOOK  LOOKBOOK(1) (1).pdf
BASLIQ CURRENT LOOKBOOK LOOKBOOK(1) (1).pdf
 
Organic Name Reactions for the students and aspirants of Chemistry12th.pptx
Organic Name Reactions  for the students and aspirants of Chemistry12th.pptxOrganic Name Reactions  for the students and aspirants of Chemistry12th.pptx
Organic Name Reactions for the students and aspirants of Chemistry12th.pptx
 
Interactive Powerpoint_How to Master effective communication
Interactive Powerpoint_How to Master effective communicationInteractive Powerpoint_How to Master effective communication
Interactive Powerpoint_How to Master effective communication
 
Kisan Call Centre - To harness potential of ICT in Agriculture by answer farm...
Kisan Call Centre - To harness potential of ICT in Agriculture by answer farm...Kisan Call Centre - To harness potential of ICT in Agriculture by answer farm...
Kisan Call Centre - To harness potential of ICT in Agriculture by answer farm...
 
18-04-UA_REPORT_MEDIALITERAСY_INDEX-DM_23-1-final-eng.pdf
18-04-UA_REPORT_MEDIALITERAСY_INDEX-DM_23-1-final-eng.pdf18-04-UA_REPORT_MEDIALITERAСY_INDEX-DM_23-1-final-eng.pdf
18-04-UA_REPORT_MEDIALITERAСY_INDEX-DM_23-1-final-eng.pdf
 
Class 11 Legal Studies Ch-1 Concept of State .pdf
Class 11 Legal Studies Ch-1 Concept of State .pdfClass 11 Legal Studies Ch-1 Concept of State .pdf
Class 11 Legal Studies Ch-1 Concept of State .pdf
 
Science 7 - LAND and SEA BREEZE and its Characteristics
Science 7 - LAND and SEA BREEZE and its CharacteristicsScience 7 - LAND and SEA BREEZE and its Characteristics
Science 7 - LAND and SEA BREEZE and its Characteristics
 
microwave assisted reaction. General introduction
microwave assisted reaction. General introductionmicrowave assisted reaction. General introduction
microwave assisted reaction. General introduction
 
9953330565 Low Rate Call Girls In Rohini Delhi NCR
9953330565 Low Rate Call Girls In Rohini  Delhi NCR9953330565 Low Rate Call Girls In Rohini  Delhi NCR
9953330565 Low Rate Call Girls In Rohini Delhi NCR
 
Model Call Girl in Tilak Nagar Delhi reach out to us at 🔝9953056974🔝
Model Call Girl in Tilak Nagar Delhi reach out to us at 🔝9953056974🔝Model Call Girl in Tilak Nagar Delhi reach out to us at 🔝9953056974🔝
Model Call Girl in Tilak Nagar Delhi reach out to us at 🔝9953056974🔝
 
Presiding Officer Training module 2024 lok sabha elections
Presiding Officer Training module 2024 lok sabha electionsPresiding Officer Training module 2024 lok sabha elections
Presiding Officer Training module 2024 lok sabha elections
 
Código Creativo y Arte de Software | Unidad 1
Código Creativo y Arte de Software | Unidad 1Código Creativo y Arte de Software | Unidad 1
Código Creativo y Arte de Software | Unidad 1
 
How to Configure Email Server in Odoo 17
How to Configure Email Server in Odoo 17How to Configure Email Server in Odoo 17
How to Configure Email Server in Odoo 17
 
URLs and Routing in the Odoo 17 Website App
URLs and Routing in the Odoo 17 Website AppURLs and Routing in the Odoo 17 Website App
URLs and Routing in the Odoo 17 Website App
 

Lecture 3 making data sets into tables and graphs

  • 1. Lecture #3: Making Data Sets into Tables and Graphs
  • 2. Organizing Data • Looking at data can be overwhelming • There is a lot of raw (unsorted) data • It’s important to organize the data for summarization • One way to summarize data is to use tables and graphs • To do this, we’ll need to consider the qualities of the variable under consideration • Qualitative or Quantitative • If Quantitative: • What level of measurement? • Discrete or continuous?
  • 3. ZIP Codes Recall that a ZIP Code is a quantitative variable at the nominal level of measurement (almost like it’s a categorical variable) since there is no real order to ZIP codes. We’ll organize the ZIP codes by reading down the list and making a tally mark next to the headers (creating new headers as we find new ZIP codes) Class Tally Frequency (f) Relative Frequency ( 𝒇 𝒏 ) 95458 || 2 1.96% 95490 |||| |||| 9 8.82% 95469 ||| 3 2.94% 95482 |||| |||| |||| |||| |||| |||| |||| |||| |||| |||| |||| |||| || 62 60.78% 95461 | 1 0.98% 95470 |||| |||| 10 9.8% 95415 || 2 1.96% 95428 | 1 0.98% 95485 | 1 0.98% 95449 || 2 1.96% 95451 || 2 1.96% 95425 | 1 0.98% 95454 | 1 0.98% 95463 | 1 0.98% 95437 || 2 1.96% 95453 || 2 1.96% 𝑛 = 𝑓 = 102 𝑓 𝑛 =99.98%
  • 5. Women’s Heights It would be helpful to look at the data grouped, instead of just ‘as it is’ ◦ This is especially true since the order here has meaning ◦ Also, we don’t want to look at each height separately (too many numbers) ◦ We’ll group the heights into a small enough number of groups that we can see any patterns that exist ◦ We’ll do this by making a Grouped Frequency Distribution Table How will we group them? ◦ Pick two numbers ◦ Lower Limit of the First Class and Class Width ◦ If we all use these numbers, and use them correctly, we’ll get identical tables. Lower Limit of the First Class ◦ This is the smallest number we’re going to tally
  • 6. Women’s Heights Lower Limit of the First Class ◦ This is the smallest number we’re going to tally ◦ It must be either the height of the shortest person, or an even smaller height Class Width ◦ This is how many separate heights are in each class ◦ It is also the difference between the lower limit of successive classes Let’s use 57 as the lower limit of the first class and 2 as the class width This gives us class limits of 57-58, 59-60, 61-62, 63-64, 65-66, 67-68, 69-70, 71-72 (we can stop here because nobody was taller than 72 inches)
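
The way the two chosen numbers determine every class can be sketched in a few lines of Python. This is only an illustration: the function name `class_limits` and its parameters are invented here, and it assumes heights are recorded to the nearest whole inch, so a class of width 2 holds two consecutive whole-number heights.

```python
def class_limits(lower_limit, class_width, max_value):
    """Return (lower, upper) class limits that cover values up to max_value.

    Assumes whole-number data, so each class contains `class_width`
    consecutive whole numbers and the upper limit is lower + width - 1.
    """
    limits = []
    lower = lower_limit
    while lower <= max_value:
        limits.append((lower, lower + class_width - 1))
        lower += class_width  # lower limits of successive classes differ by the width
    return limits


# Lower limit 57, class width 2, and nobody taller than 72 inches.
womens_limits = class_limits(57, 2, 72)
for lower, upper in womens_limits:
    print(f"{lower}-{upper}")
```

The same call with a lower limit of 60 and a class width of 3 would set up the classes for the men's heights activity, once the tallest height in that data is known.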
  • 7. Women’s Heights Before we tally the heights up, let’s address an issue that comes up when the variable is continuous (as height is) What if we were rounding to something finer than whole inches? What if a person is actually 63.6 inches tall? We don’t really want gaps between our classes, so we use something called Class Boundaries Class Boundaries ◦ Split the difference between the upper class limit of one class and the lower class limit of the next higher class ◦ The boundary between the first two classes is 58.5, halfway from 58 to 59, and it is both the upper class boundary of the first class and the lower class boundary of the second class.
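
Splitting the difference can be written out as a short Python sketch. The helper names `class_boundaries` and `find_class` are hypothetical, and one choice below is an assumption rather than something the lecture states: a value that lands exactly on a boundary is placed in the higher class.

```python
def class_boundaries(limits):
    """Turn class limits into class boundaries by splitting the gap between classes."""
    gap = limits[1][0] - limits[0][1]  # e.g. 59 - 58 = 1 for whole inches
    half = gap / 2
    # The outer boundaries extend by the same half gap as the inner ones.
    return [(lower - half, upper + half) for lower, upper in limits]


def find_class(value, boundaries):
    """Return the index of the class containing value, or None if it is out of range.

    Assumption: a value exactly on a boundary goes into the higher class.
    """
    for index, (lower, upper) in enumerate(boundaries):
        if lower <= value < upper:
            return index
    return None


limits = [(57, 58), (59, 60), (61, 62), (63, 64),
          (65, 66), (67, 68), (69, 70), (71, 72)]
boundaries = class_boundaries(limits)
print(boundaries[0], boundaries[1])    # the first two classes share 58.5
print(find_class(63.6, boundaries))    # index of the class holding 63.6 inches
```

With boundaries in place, a height of 63.6 inches has a home in the 62.5-64.5 class instead of falling into a gap.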
  • 8. Women’s Heights

      Class Limits  Class Boundaries  Tally           Frequency (f)  Relative Frequency (f/n)
      57-58         56.5-58.5         |               1              0.02
      59-60         58.5-60.5         |||| ||         7              0.12
      61-62         60.5-62.5         |||| |||| ||    12             0.20
      63-64         62.5-64.5         |||| |||| ||||  14             0.23
      65-66         64.5-66.5         |||| |||| ||||  15             0.25
      67-68         66.5-68.5         ||||            5              0.08
      69-70         68.5-70.5         ||||            4              0.07
      71-72         70.5-72.5         ||              2              0.03
      n = Σf = 60                                                    Σ(f/n) = 1
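
The relative frequency column can be recomputed from the frequency column. The sketch below is only a quick check using the eight frequencies from the table; the variable names are invented for this illustration.

```python
# Frequencies for the classes 57-58, 59-60, ..., 71-72, taken from the table.
frequencies = [1, 7, 12, 14, 15, 5, 4, 2]

n = sum(frequencies)
# Relative frequency f / n, rounded to the nearest hundredth.
relative = [round(f / n, 2) for f in frequencies]

print("n =", n)
for f, r in zip(frequencies, relative):
    print(f, f"{r:.2f}")
print("sum of relative frequencies:", round(sum(relative), 2))
```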
  • 9. Graphs Graphs are a great way to demonstrate data ◦ They help us to look for patterns ◦ There are different ways of displaying the data ◦ Today, we’ll consider the following: ◦ Pareto chart ◦ Pie chart ◦ Histogram ◦ Scatterplot
  • 10. Pareto Chart If you are dealing with categorical data or quantitative data at the nominal level of measurement, the Pareto chart gives a very good picture ◦ A Pareto chart is a bar graph whose bars: ◦ Do not touch (usually) ◦ Are arranged from the class of the largest frequency to the smallest frequency ◦ Can be arranged vertically or horizontally ◦ Often relative frequencies are used, but here is an example using Men’s ZIP Codes with plain frequencies [Figure: Pareto chart of Men's ZIP Codes; vertical axis: Frequency, 0 to 25; bars in the order 95482, 95490, 95470, 95458, 95469, 95461, 95415, 95428, 95485, 95453]
  • 11. Pareto Chart Looking at this image, what jumps out at you? ◦ Can you see why this is a good choice to present the data? ◦ It is obvious which is the most common ZIP code ◦ You may note that there are two different ZIP codes that have 4 men living in them, and two with 2 men, and 5 with 1 man ◦ When categories are tied, their order doesn’t matter, so long as the bars stay descending in frequency [Figure: the same Pareto chart of Men's ZIP Codes as on the previous slide]
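
The ordering rule for a Pareto chart amounts to a sort by frequency. Here is a small Python sketch of that idea, using the men's ZIP code frequencies that appear with the pie chart later in the lecture. The name `pareto_order` is made up, and the bars are drawn as rows of `#` characters purely for illustration.

```python
# Men's ZIP code frequencies (n = 42), deliberately listed out of order.
men_zip_frequencies = {
    "95458": 2, "95461": 1, "95482": 25, "95490": 4, "95415": 1,
    "95469": 2, "95470": 4, "95428": 1, "95485": 1, "95453": 1,
}


def pareto_order(freqs):
    """Sort classes from largest to smallest frequency.

    Python's sort is stable, so tied classes simply keep their input order,
    which is fine because the order among ties does not matter.
    """
    return sorted(freqs.items(), key=lambda item: item[1], reverse=True)


ordered = pareto_order(men_zip_frequencies)
for zip_code, f in ordered:
    print(zip_code, "#" * f, f)
```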
  • 12. Pie chart Another way to present qualitative and quantitative variables at the nominal level of measurement is with the Pie chart ◦ A Pie chart is a graph which represents 100% of the data being looked at ◦ You “slice” the “pie” so that the size of each piece indicates the size of the frequencies of the different categories ◦ We determine the size of the piece by what is called the central angle ◦ The central angle is the angle made by the two edges of the slice (assuming you started from the exact center of the pie) [Figure: a pie slice with its central angle labeled]
  • 13. Pie chart We measure angles in degrees, and there are 360 of them around the center of the pie We need to determine how many degrees to make the central angle for the slice representing each class ◦ Central angle = (f/n) × 360° ◦ This splits up the central angles in exact proportion to the frequencies of the classes. ◦ 95482: (25/42) × 360° ≈ 214° ◦ 95490, 95470: (4/42) × 360° ≈ 34° each ◦ 95458, 95469: (2/42) × 360° ≈ 17° each ◦ 95461, 95415, 95428, 95485, 95453: (1/42) × 360° ≈ 9° each ◦ These rounded angles total 361° rather than 360°, because each one was rounded to the nearest degree [Figure: pie chart of Men's ZIP Codes with slices for 95482, 95490, 95470, 95458, 95469, 95461, 95415, 95428, 95485, 95453]
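
The central angle calculation is easy to reproduce. The following Python sketch applies (f/n) × 360 to the men's ZIP code frequencies from this slide and rounds each angle to the nearest degree; the function name `central_angles` is invented for the example.

```python
# Men's ZIP code frequencies from the slide (n = 42).
men_zip_frequencies = {
    "95482": 25, "95490": 4, "95470": 4, "95458": 2, "95469": 2,
    "95461": 1, "95415": 1, "95428": 1, "95485": 1, "95453": 1,
}


def central_angles(freqs):
    """Return each class's central angle, (f / n) * 360, rounded to whole degrees."""
    n = sum(freqs.values())
    return {cls: round(f / n * 360) for cls, f in freqs.items()}


angles = central_angles(men_zip_frequencies)
for zip_code, angle in angles.items():
    print(zip_code, f"{angle} degrees")
print("total of rounded angles:", sum(angles.values()))
```

The unrounded angles always add up to exactly 360°; only the rounding pushes the total off by a degree.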
  • 14. Histograms Neither the Pareto chart nor the Pie chart is suitable for variables at the interval and ratio levels of measurement You can’t put them in any order in the chart The best way to convey the data from these types of variables is with a Histogram A Histogram is a bar graph in which the bars touch  Thus we use the class boundaries when we mark off the scale on the horizontal axis As with the Pareto chart, the vertical axis shows the frequencies  Be sure the frequency scale goes high enough to accommodate the class with the greatest frequency, but not too much higher
  • 15. Histograms Neither the Pareto chart nor the Pie chart is suitable for variables at the interval and ratio levels of measurement One advantage of a histogram is that it can readily display large data sets The histogram can give you the shape of the data, the center, and the spread of the data Here is an example of a previous classes women’s heights divided into 8 classes You’ll note the ‘break’ in the horizontal axis; that is to show that the axis has been interrupted  This is proper to show, and not always done
  • 16. Histograms Neither the Pareto chart nor the Pie chart is suitable for variables at the interval and ratio levels of measurement You’ll also note that once you have shows that the horizontal axis is not perfectly to scale, you should choose where to start (where to place the 56.5) and then everything else has been decided from there!
  • 17. Bivariate Data (Two variables) Sometimes we want to look at two variables at once Bivariate – Two Variables Suppose we want to study the connection between people’s ages and the number of pets they have ◦ Here, the ordered pair is (age, # of pets) ◦ (19, 2), (23, 2), (18, 4), (18, 2), (28, 0), (19, 3), (37, 1), (20, 0), (34, 0), (40, 1), (18, 27), (19, 0), (18, 2), (18, 1), (18, 4), (20, 1), (19, 3), (26, 2), (23, 2), (29, 1), (23, 0), (19, 5), (19, 10), (29, 0), (19, 2), (19, 0) [Figure: scatter plot of these ordered pairs; horizontal axis: age, 15 to 45; vertical axis: number of pets, 0 to 30] This is called a Scatter Plot Each point on a Scatter Plot gives us two pieces of data about a single member of the sample, one datum for each variable ◦ Are there any data points that seem odd? ◦ It’s easy to see that (18, 27) is an outlier; are there others? ◦ Let’s take this one out and see what things look like now…
  • 18. Bivariate Data (Two variables) [Figure: scatter plot with (18, 27) removed; horizontal axis: age, 15 to 45; vertical axis: number of pets, 0 to 12] This allows us to see what kind of variability is going on a little more easily Was this ‘OK’ to do? ◦ Outliers happen, sometimes from mistakes, sometimes simply because they do exist ◦ You should note that you have removed an outlier to look at the data The horizontal variable is the x-variable, and it’s sometimes called the independent variable. The vertical variable is the y-variable, and it’s sometimes called the dependent variable. Note: This terminology is not meant to imply that the one causes the other
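
The effect of taking out the outlier can be seen in the numbers as well as in the picture. This Python sketch uses the (age, # of pets) pairs from the lecture, removes (18, 27), and compares the largest number of pets before and after; the variable names are invented for the illustration, and no formal outlier test is applied because the lecture identifies the point by eye.

```python
# Ordered pairs (age, number of pets) from the lecture.
pairs = [
    (19, 2), (23, 2), (18, 4), (18, 2), (28, 0), (19, 3), (37, 1),
    (20, 0), (34, 0), (40, 1), (18, 27), (19, 0), (18, 2), (18, 1),
    (18, 4), (20, 1), (19, 3), (26, 2), (23, 2), (29, 1), (23, 0),
    (19, 5), (19, 10), (29, 0), (19, 2), (19, 0),
]

outlier = (18, 27)  # the point picked out by eye on the scatter plot
trimmed = [pair for pair in pairs if pair != outlier]

# x-values (ages) and y-values (pets) for plotting the trimmed data.
ages, pets = zip(*trimmed)

max_pets_before = max(p for _, p in pairs)
max_pets_after = max(pets)

print("points before:", len(pairs), "points after:", len(trimmed))
print("largest number of pets before:", max_pets_before)
print("largest number of pets after:", max_pets_after)
```

With the outlier gone, the vertical axis only has to reach a little past the new maximum, so the remaining points spread out and are easier to compare.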
  • 19. Activity: Making a grouped frequency distribution table Construct a grouped frequency distribution table for the heights of the men in the Class Data Base, using 60 as the lower limit of the first class and 3 inches as the class width. Have columns for the Class Limits, the Class Boundaries, the Tally, the Frequency, and the Relative Frequency to the nearest hundredth.