Chapter 2
Summarizing and Graphing Data
       Professor Mike Gilmore
    Middlesex Community College
            Spring 2012

                                  1
Why Graphs?
• Describe data
• Explore data
• Compare data

• The goal is to convey a message about the
  data, rather than to decorate…



                                              2
A Bunch of Data
ACAAGATGCCATTGTCCCCCGGCCTCCTGCTGCTGCTGCTCTCCGGGGCCACGGCCACCGCTGCCCTGCCACGGCCACCGCTGCCCTGCC
CCTGGAGGGTGGCCCCACCGGCCGAGACAGCGAGCATATGCAGGAAGCGGCAGGAATAAGGAAAAGCAGCACGGCCACCGCTGCCCT
CTCCTGACTTTCCTCGCTTGGTGGTTTGAGTGGACCTCCCAGGCCAGTGCCGGGCCCCTCATAGGAGAGGACGGCCACCGCTGCCCTGC
AAGCTCGGGAGGTGGCCAGGCGGCAGGAAGGCGCACCCCCCCAGCAATCCGCGCGCCGGGACAGAATGCCACGGCCACCGCTGCCTG
CTGCAGGAACTTCTTCTGGAAGACCTTCTCCTCCTGCAAATAAAACCTCACCCATGAATGCTCACGCAAGACGGCCACCGCTGCCCTGCC
TTTAATTACAGACCTGAAACAAGATGCCATTGTCCCCCGGCCTCCTGCTGCTGCTGCTCTCCGGGGCCACGGCCACCGCTGCCCTGCCAA
CCTGGAGGGTGGCCCCACCGGCCGAGACAGCGAGCATATGCAGGAAGCGGCAGGAATAAGGAAAAGCAGCACGGCCACCGCTGCCCT
CTCCTGACTTTCCTCGCTTGGTGGTTTGAGTGGACCTCCCAGGCCAGTGCCGGGCCCCTCATAGGAGAGGACGGCCACCGCTGCCCTGC
AAGCTCGGGAGGTGGCCAGGCGGCAGGAAGGCGCACCCCCCCAGCAATCCGCGCGCCGGGACAGAATGCCACGGCCACCGCTGCCCTG
CTGCAGGAACTTCTTCTGGAAGACCTTCTCCTCCTGCAAATAAAACCTCACCCATGAATGCTCACGCAAGACGGCCACCGCTGCCCTGCCA
TTTAATTACAGACCTGAAACAAGATGCCATTGTCCCCCGGCCTCCTGCTGCTGCTGCTCTCCGGGGCCACGGCCACCGCTGCCCTGCTATT
CCTGGAGGGTGGCCCCACCGGCCGAGACAGCGAGCATATGCAGGAAGCGGCAGGAATAAGGAAAAGCAGCACGGCCACCGCTGCCCT
CTCCTGACTTTCCTCGCTTGGTGGTTTGAGTGGACCTCCCAGGCCAGTGCCGGGCCCCTCATAGGAGAGGACGGCCACCGCTGCCCTGC
AAGCTCGGGAGGTGGCCAGGCGGCAGGAAGGCGCACCCCCCCAGCAATCCGCGCGCCGGGACAGAATGCCACGGCCACCGCTGCCCT
CTGCAGGAACTTCTTCTGGAAGACCTTCTCCTCCTGCAAATAAAACCTCACCCATGAATGCTCACGCAAGACGGCCACCGCTGCCCTGCA
TTTAATTACAGACCTGAAACAAGATGCCATTGTCCCCCGGCCTCCTGCTGCTGCTGCTCTCCGGGGCCACGGCCACCGCTGCCCTGCTGA
CCTGGAGGGTGGCCCCACCGGCCGAGACAGCGAGCATATGCAGGAAGCGGCAGGAATAAGGAAAAGCAGCACGGCCACCGCTGCCCTG
CTCCTGACTTTCCTCGCTTGGTGGTTTGAGTGGACCTCCCAGGCCAGTGCCGGGCCCCTCATAGGAGAGGACGGCCACCGCTGCCCTGCC
AAGCTCGGGAGGTGGCCAGGCGGCAGGAAGGCGCACCCCCCCAGCAATCCGCGCGCCGGGACAGAATGCCACGGCCACCGCTGCCCTG
CTGCAGGAACTTCTTCTGGAAGACCTTCTCCTCCTGCAAATAAAACCTCACCCATGAATGCTCACGCAAGACGGCCACCGCTGCCCTGCCA
TTTAATTACAGACCTGAAACAAGATGCCATTGTCCCCCGGCCTCCTGCTGCTGCTGCTCTCCGGGGCCACGGCCACCGCTGCCCTGCCGTG
CCTGGAGGGTGGCCCCACCGGCCGAGACAGCGAGCATATGCAGGAAGCGGCAGGAATAAGGAAAAGCAGCACGGCCACCGCTGCCCTG
CTCCTGACTTTCCTCGCTTGGTGGTTTGAGTGGACCTCCCAGGCCAGTGCCGGGCCCCTCATAGGAGAGGACGGCCACCGCTGCCCTGCC
AAGCTCGGGAGGTGGCCAGGCGGCAGGAAGGCGCACCCCCCCAGCAATCCGCGCGCCGGGACAGAATGCCACGGCCACCGCTGCCCTG
CTGCAGGAACTTCTTCTGGAAGACCTTCTCCTCCTGCAAATAAAACCTCACCCATGAATGCTCACGCAAGACGGCCACCGCTGCCCTGCCA
TTTAATTACAGACCTGAAACAAGATGCCATTGTCCCCCGGCCTCCTGCTGCTGCTGCTCTCCGGGGCCACGGCCACCGCTGCCCTGCCGTA
CCTGGAGGGTGGCCCCACCGGCCGAGACAGCGAGCATATGCAGGAAGCGGCAGGAATAAGGAAAAGCAGCACGGCCACCGCTGCCCTG
CTCCTGACTTTCCTCGCTTGGTGGTTTGAGTGGACCTCCCAGGCCAGTGCCGGGCCCCTCATAGGAGAGGACGGCCACCGCTGCCCTGCC
AAGCTCGGGAGGTGGCCAGGCGGCAGGAAGGCGCACCCCCCCAGCAATCCGCGCGCCGGGACAGAATGCCACGGCCACCGCTGCCCTG
CTGCAGGAACTTCTTCTGGAAGACCTTCTCCTCCTGCAAATAAAACCTCACCCATGAATGCTCACGCAAGACGGCCACCGCTGCCCTGCCT
TTTAATTACAGACCTGAAACAAGATGCCATTGTCCCCCGGCCTCCTGCTGCTGCTGCTCTCCGGGGCCACGGCCACCGCTGCCCTGCCTAA
CCTGGAGGGTGGCCCCACCGGCCGAGACAGCGAGCATATGCAGGAAGCGGCAGGAATAAGGAAAAGCAGCACGGCCACCGCTGCCCTG
CTCCTGACTTTCCTCGCTTGGTGGTTTGAGTGGACCTCCCAGGCCAGTGCCGGGCCCCTCATAGGAGAGGACGGCCACCGCTGCCCTGCC
AAGCTCGGGAGGTGGCCAGGCGGCAGGAAGGCGCACCCCCCCAGCAATCCGCGCGCCGGGACAGAATGCCACGGCCACCGCTGCCCTC
CTGCAGGAACTTCTTCTGGAAGACCTTCTCCTCCTGCAAATAAAACCTCACCCATGAATGCTCACGCAAGTTTAATTACAGACCTGAACACG




                                                                                               3
Frequency Table
Nucleotide          Frequency
A                   787
C                   712
T                   79
G                   723


Sequence            Frequency
ATG                 51
CCG                 32
AATTG               1
GAGA                3
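
A minimal Python sketch of how counts like these could be produced. The `sequence` here is only the first 20 bases from the "A Bunch of Data" slide, and `count_motif` is a hypothetical helper name introduced for illustration, so the numbers will not match the tables above.

```python
from collections import Counter

# First 20 bases of the sequence on the "A Bunch of Data" slide (illustration only).
sequence = "ACAAGATGCCATTGTCCCCC"

# Frequency of each single nucleotide.
nucleotide_counts = Counter(sequence)


def count_motif(seq: str, motif: str) -> int:
    """Count occurrences of motif in seq, allowing overlapping matches."""
    size = len(motif)
    return sum(1 for i in range(len(seq) - size + 1) if seq[i:i + size] == motif)


for base in "ACTG":
    print(base, nucleotide_counts[base])
print("ATG", count_motif(sequence, "ATG"))
```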


                                4
Graph




        5
Characteristics of Data
1.   Center
2.   Variation
3.   Distribution
4.   Outliers
5.   Change Over Time




                                    6
Center
155   142   149   130
151   163   151   142
156   133   138   161
128   144   172   137
151   166   147   163
145   116   136   158
114   165   169   145
150   150   150   158
151   145   152   140
170   129   188   156

Bin       Frequency
109.5     0
119.5     2
129.5     2
139.5     5
149.5     9
159.5     13
169.5     6
179.5     2
189.5     1
More      0

        ?
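
One way to put a number on "center" is the mean or the median. The sketch below is an illustration using Python's standard `statistics` module on the 40 values from this slide; the variable names are assumptions, not part of the slides.

```python
import statistics

# The 40 data values shown on the slide.
data = [
    155, 142, 149, 130, 151, 163, 151, 142, 156, 133,
    138, 161, 128, 144, 172, 137, 151, 166, 147, 163,
    145, 116, 136, 158, 114, 165, 169, 145, 150, 150,
    150, 158, 151, 145, 152, 140, 170, 129, 188, 156,
]

mean = statistics.mean(data)      # arithmetic average
median = statistics.median(data)  # middle value of the sorted data

print(f"mean = {mean:.2f}, median = {median}")
```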




                                                 7
Variation
155   142   149   130
151   163   151   142
156   133   138   161
128   144   172   137
151   166   147   163
145   116   136   158
114   165   169   145
150   150   150   158
151   145   152   140
170   129   188   156

Bin       Frequency
109.5     0
119.5     2
129.5     2
139.5     5
149.5     9
159.5     13
169.5     6
179.5     2
189.5     1
More      0

        ?
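
Variation can be summarized in several ways. As a rough sketch, the code below computes two common measures, the range and the sample standard deviation, for the 40 values on this slide. The choice of these two measures is an assumption; the slide itself does not name one.

```python
import statistics

# The 40 data values shown on the slide.
data = [
    155, 142, 149, 130, 151, 163, 151, 142, 156, 133,
    138, 161, 128, 144, 172, 137, 151, 166, 147, 163,
    145, 116, 136, 158, 114, 165, 169, 145, 150, 150,
    150, 158, 151, 145, 152, 140, 170, 129, 188, 156,
]

data_range = max(data) - min(data)    # largest value minus smallest value
sample_sd = statistics.stdev(data)    # sample standard deviation (n - 1 divisor)

print(f"range = {data_range}, sample standard deviation = {sample_sd:.2f}")
```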




                                                    8
Distribution
155   142   149   130
151   163   151   142
156   133   138   161
128   144   172   137
151   166   147   163
145   116   136   158
114   165   169   145
150   150   150   158
151   145   152   140
170   129   188   156

Bin       Frequency
109.5     0
119.5     2
129.5     2
139.5     5
149.5     9
159.5     13
169.5     6
179.5     2
189.5     1
More      0

        ?
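
The Bin/Frequency table can be reproduced by treating each bin value as an upper bound, which is an assumption about how the table was generated (it matches the counts shown). A minimal sketch using the standard `bisect` module:

```python
from bisect import bisect_left

# The 40 data values shown on the slide.
data = [
    155, 142, 149, 130, 151, 163, 151, 142, 156, 133,
    138, 161, 128, 144, 172, 137, 151, 166, 147, 163,
    145, 116, 136, 158, 114, 165, 169, 145, 150, 150,
    150, 158, 151, 145, 152, 140, 170, 129, 188, 156,
]

# Bin values from the slide, read as upper bounds (assumption).
upper_bounds = [109.5, 119.5, 129.5, 139.5, 149.5, 159.5, 169.5, 179.5, 189.5]

# One counter per bin, plus a final "More" counter for values above the last bound.
counts = [0] * (len(upper_bounds) + 1)
for value in data:
    # Index of the first upper bound that is >= value.
    counts[bisect_left(upper_bounds, value)] += 1

for bound, count in zip(upper_bounds + ["More"], counts):
    print(bound, count)
```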




                                                       9
Outliers
155   142   149   130
151   163   151   142
156   133   138   161
228   144   172   137
151   166   147   163
145   316   136   158
114   165   169   145
150   150   150   158
151   145   152   140
170   129   488   156
        ?

Bin       Frequency
109.5     0
119.5     1
129.5     1
139.5     5
149.5     9
159.5     13
169.5     6
179.5     2
189.5     0
199.5     0
209.5     0
219.5     0
229.5     1
239.5     0
249.5     0
259.5     0
269.5     0
279.5     0
289.5     0
299.5     0
More      2

[Histogram of the binned data: Frequency on the vertical axis, scaled 0 to 14]
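
To see what the three changed values (228, 316, 488) do to a summary, the sketch below compares the mean and median of the earlier data with the data on this slide. Framing the comparison this way is an assumption about the slide's intent, offered only as an illustration.

```python
import statistics

# Data from the Center, Variation, and Distribution slides.
original = [
    155, 142, 149, 130, 151, 163, 151, 142, 156, 133,
    138, 161, 128, 144, 172, 137, 151, 166, 147, 163,
    145, 116, 136, 158, 114, 165, 169, 145, 150, 150,
    150, 158, 151, 145, 152, 140, 170, 129, 188, 156,
]

# The Outliers slide changes exactly three values (each occurs once above).
replacements = {128: 228, 116: 316, 188: 488}
with_outliers = [replacements.get(x, x) for x in original]

mean_before = statistics.mean(original)
mean_after = statistics.mean(with_outliers)
median_before = statistics.median(original)
median_after = statistics.median(with_outliers)

print(f"mean:   {mean_before:.2f} -> {mean_after:.2f}")
print(f"median: {median_before} -> {median_after}")
```

In this data the mean moves by 15 while the median moves by 1, which is why the median is often described as resistant to outliers.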



                                                     10
Change Over Time
        http://www.maps4kids.com/vizdata_pop.html



http://www.ted.com/talks/lang/en/hans_rosling_at_state.html




                       http://www.gapminder.org/              11
Summarizing Data




                   12
Frequency Table
• A frequency table shows how a data set is
  partitioned among several categories (or
  classes) by listing each category along with
  the number of data values that fall in it.




                                                13
Simple Frequency Table



                           Statistical Reasoning, Bennett, et.al., 3rd edition



Grade     Cumulative Frequency
A         4
B         4 + 7 = 11
C         11 + 9 = 20
D         20 + 3 = 23
F         23 + 2 = 25

Grade     Relative Frequency
A         4 / 25 = 0.16
B         7 / 25 = 0.28
C         9 / 25 = 0.36
D         3 / 25 = 0.12
F         2 / 25 = 0.08
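
The same arithmetic can be sketched in Python. The grade frequencies (4, 7, 9, 3, 2) are read off the table above; the variable names are assumptions.

```python
from itertools import accumulate

grades = ["A", "B", "C", "D", "F"]
frequencies = [4, 7, 9, 3, 2]   # number of students per grade, from the table

total = sum(frequencies)                        # 25 students in all
cumulative = list(accumulate(frequencies))      # running totals
relative = [f / total for f in frequencies]     # share of the total per grade

for grade, freq, cum, rel in zip(grades, frequencies, cumulative, relative):
    print(f"{grade}  freq={freq}  cumulative={cum}  relative={rel:.2f}")
```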
                                                                                 14
Frequency Table Terms for Quantitative Categories
•   Lower class limits
•   Upper class limits
•   Class boundaries
•   Class midpoints
•   Class width
    – No gaps between classes
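
A small sketch of how these terms relate to one another. It assumes whole-number data and classes 110-119, 120-129, ..., 180-189, chosen here because their boundaries match the bins 109.5 to 189.5 used on the earlier slides; the classes themselves are an assumption.

```python
# Assumed classes for whole-number data: 110-119, 120-129, ..., 180-189.
lower_limits = list(range(110, 190, 10))

# Class width: difference between consecutive lower class limits.
class_width = lower_limits[1] - lower_limits[0]

# Upper class limit: one unit below the next lower limit (data are whole numbers).
upper_limits = [lower + class_width - 1 for lower in lower_limits]

# Class boundaries: halfway between one upper limit and the next lower limit,
# so adjacent classes meet with no gaps.
boundaries = [lower - 0.5 for lower in lower_limits] + [upper_limits[-1] + 0.5]

# Class midpoint: average of the lower and upper class limits.
midpoints = [(lower + upper) / 2 for lower, upper in zip(lower_limits, upper_limits)]

print("width:", class_width)
print("limits:", list(zip(lower_limits, upper_limits)))
print("boundaries:", boundaries)
print("midpoints:", midpoints)
```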



                                      15
Illustration of Terms




                        16
Constructing a Frequency Table
1.   Determine number of classes
2.   Calculate class width
3.   Choose first lower class limit
4.   List all lower class limits
5.   List all upper class limits
6.   Tally each data point next to appropriate
     class limits
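
The six steps above can be sketched in Python using the 40 values from the earlier slides. The choice of 8 classes and the rule for picking the first lower limit (round the minimum down to a multiple of the width) are assumptions made for this illustration.

```python
import math

data = [
    155, 142, 149, 130, 151, 163, 151, 142, 156, 133,
    138, 161, 128, 144, 172, 137, 151, 166, 147, 163,
    145, 116, 136, 158, 114, 165, 169, 145, 150, 150,
    150, 158, 151, 145, 152, 140, 170, 129, 188, 156,
]

# 1. Determine number of classes (assumed choice).
num_classes = 8

# 2. Calculate class width: (max - min) / number of classes, rounded up.
class_width = math.ceil((max(data) - min(data)) / num_classes)

# 3. Choose first lower class limit: a convenient value at or below the minimum.
first_lower = (min(data) // class_width) * class_width

# 4. List all lower class limits.
lower_limits = [first_lower + i * class_width for i in range(num_classes)]

# 5. List all upper class limits (whole-number data, so one unit below the next lower limit).
upper_limits = [lower + class_width - 1 for lower in lower_limits]

# 6. Tally each data point in its class.
tally = [0] * num_classes
for value in data:
    tally[(value - first_lower) // class_width] += 1

for lower, upper, count in zip(lower_limits, upper_limits, tally):
    print(f"{lower}-{upper}: {count}")
```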

                                                 17
18
Statistical Reasoning, Bennett, et.al., 3rd edition
Binned Data




       Statistical Reasoning, Bennett, et.al., 3rd edition


                                                             19
Other Frequencies
• Relative Frequency
• Cumulative Frequency




                              20
Histogram
• A histogram is a graph of bars of equal width
  drawn adjacent to each other (without gaps).
  The horizontal scale represents classes of
  quantitative data values. The vertical scale
  represents frequencies.
• What characteristic of a data set can be better
  understood by constructing a histogram?
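
As a rough text-only sketch of the idea, the code below draws one bar per class with a length equal to its frequency. The class labels and frequencies are taken from the binned data on the earlier slides, with the classes 110-119 through 180-189 assumed from the bin boundaries.

```python
# Classes (assumed from the bin boundaries 109.5 to 189.5) and their frequencies.
class_labels = ["110-119", "120-129", "130-139", "140-149",
                "150-159", "160-169", "170-179", "180-189"]
frequencies = [2, 2, 5, 9, 13, 6, 2, 1]

# One row per class; bar length is the class frequency.
rows = [f"{label} | {'#' * count}" for label, count in zip(class_labels, frequencies)]

print("\n".join(rows))
```

Rows are printed one under another with no blank lines between them, mirroring the adjacent bars of a histogram.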


                                                21
Histogram




            22
Frequency Polygon




    Elementary Statistics, Triola, 11th edition

                                                  23
Frequency Polygon
• DIY for BTU data




                               24
Ogive (“oh-jive”)
a.k.a. Cumulative Frequency Polygon
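
A sketch of the points an ogive would connect: cumulative frequency plotted against class boundary. The boundaries and frequencies are taken from the binned data on the earlier slides; using that data here is an assumption, since this slide shows only a figure.

```python
from itertools import accumulate

# Class boundaries and per-class frequencies from the earlier binned data.
boundaries = [109.5, 119.5, 129.5, 139.5, 149.5, 159.5, 169.5, 179.5, 189.5]
frequencies = [2, 2, 5, 9, 13, 6, 2, 1]

# Cumulative frequency at each boundary; nothing lies below the first boundary.
cumulative = [0] + list(accumulate(frequencies))

# Points (boundary, cumulative frequency) to connect with line segments.
ogive_points = list(zip(boundaries, cumulative))

for boundary, total in ogive_points:
    print(boundary, total)
```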




                                      25
Ogive
• DIY for BTU data




                             26
Dotplot




Animation Installed?

                       27
Stemplot




           28
Bar Graph




            29
Pareto Chart
• When we want to attract attention to more
  important data.
• Used for qualitative data, nominal not ordinal
  – WHY?
• Bars arranged in descending order by
  frequencies.
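
A minimal sketch of the ordering behind a Pareto chart, using the nucleotide counts from the earlier Frequency Table slide as nominal categories. The cumulative percentage column is a common addition and an assumption here, not something the slide specifies.

```python
# Nucleotide frequencies from the earlier Frequency Table slide.
counts = {"A": 787, "C": 712, "T": 79, "G": 723}

total = sum(counts.values())

# Bars arranged in descending order by frequency.
ordered = sorted(counts.items(), key=lambda item: item[1], reverse=True)

# Each row: (category, frequency, cumulative percentage of the total).
rows = []
running = 0
for category, frequency in ordered:
    running += frequency
    rows.append((category, frequency, 100 * running / total))

for category, frequency, cumulative_pct in rows:
    print(f"{category}  {frequency:4d}  {cumulative_pct:6.1f}%")
```

Sorting by frequency discards whatever order the categories had, which is harmless for nominal categories like these.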



                                               30
Pareto Chart




               31
Pie Chart
• Also for qualitative data




              http://assistantvillageidiot.blogspot.com/2007/11/why-would-they-lie-huh.html
                                                                                              32
Scatter Plot
• Paired data




                               33
Time-Series Graph




                    34
Gaps in Data




               35
Gaps in Data
• [self-reported age graph?]




                               36
“Bad” Graphs
• Graphics can offer clear and meaningful
  summaries of statistical data.
  However, even well-made graphics can be
  misleading if we are not careful in
  interpreting them, and poorly made graphics
  are almost always misleading. Moreover, some
  people use graphics in deliberately misleading
  ways.

                                               37
Perceptual Distortion – 2D – BAD




                    Statistical Reasoning, Bennett, et.al., 3rd edition




                                                                          38
Perceptual Distortion – 3D – WORSE




                                  Statistical Reasoning, Bennett, et.al., 3rd edition




       What is the right thing to do here?
                                                                                        39
Stretching Axes




                  40
Manipulating Axes




                    41
Watch the Scales




             Statistical Reasoning, Bennett, et.al., 3rd edition
Partial Data




http://www.yale.edu/ynhti/curriculum/units/2008/6/08.06.06.x.html
Percent Change Graphs
Chart Junk
Help Wanted
• Statistical Graphics Designer




                                  46
End of Ch2
• We’ve discussed data and graphs.
• Next we’ll work on comparing data.
• Bring your calculator.




                                       47
