This document discusses graphical descriptions of data through graphs and charts. It introduces frequency tables and relative frequency tables that organize qualitative data into categories and counts. A sample frequency table is created from data on types of cars students drive. This data is then displayed in a bar chart and pie chart to visualize the distribution. Bar charts are described as using category on the x-axis and frequency on the y-axis with rectangular bars proportional in height to the frequencies. Pie charts divide a circle into wedge-shaped sectors proportional to relative frequencies. Examples are provided of creating these graphs in StatCrunch software from the raw car type data.
SPSS is a widely used program for statistical analysis in the social sciences, particularly in education and research. However, because of its potential, it is also widely used by market researchers, health-care researchers, survey organizations, governments and, most notably, data miners and big data professionals.
Spss data analysis for univariate, bivariate and multivariate statistics by d... Dr. Sola Maitanmi
This chapter provides an overview of statistical principles and modeling. The goals of statistical modeling are to describe sample data and make inferences about the underlying population. Inferential statistics are used to estimate population parameters based on sample statistics. Statistical tests indicate if observed effects in a sample could plausibly occur by chance or suggest an effect in the population. The appropriate statistical model depends on the type of data, such as using t-tests and ANOVA for mean differences or correlation/regression for relationships between continuous variables. Overall, statistical analysis involves sampling data, applying a model, and evaluating model fit and inferences that can be made about the population.
SPSS is a powerful tool for analyzing nurses' data. This paper aims to show hospital leaders the benefits of analyzing data with SPSS, and to help hospital managers and office managers determine whether hourly salary depends on nurse experience and nurse type (hospital nurse or office nurse). It also analyzes the deviation in salaries between hospital nurses and office nurses. As background on SPSS's algorithms, it presents the means algorithm for tables and graphs. The sample data file 'hourly wage data.sav' was downloaded from Google and then analyzed and viewed. The analysis used IBM SPSS Statistics version 23 and Python version 3.7. Aung Cho | Aung Si Thu "Nurses Data Analysis by Applied SPSS" Published in International Journal of Trend in Scientific Research and Development (ijtsrd), ISSN: 2456-6470, Volume-3 | Issue-5, August 2019, URL: https://www.ijtsrd.com/papers/ijtsrd25329.pdf Paper URL: https://www.ijtsrd.com/computer-science/data-miining/25329/nurses-data-analysis-by-applied-spss/aung-cho
This document provides an overview of the statistical software package SPSS (Statistical Package for the Social Sciences). SPSS was originally developed in 1968 to facilitate statistical analysis in the social sciences and was purchased by IBM in 2009 for over $1 billion. The document outlines SPSS's general capabilities for data management, analysis, and visualization, including defining and coding variables, descriptive statistics, graphs, and other statistical analyses. It also defines different variable types, levels of measurement, and common descriptive statistics like measures of central tendency and dispersion.
Pivot tables allow users to summarize and analyze data in Excel by aggregating and reorganizing the data into a new format determined by the user. The document provides a step-by-step tutorial on how to create a pivot table using sample voter data. Key steps include selecting the data range, inserting a pivot table on a new worksheet, and dragging fields from the pivot table field list to rows, columns, and values areas to choose how the data should be organized and summarized. Advanced techniques like filtering, moving fields, and customizing pivot table options are also demonstrated.
This document provides an introduction to creating and using Excel PivotTables. It discusses appropriate source data types, how to create a basic PivotTable using the wizard or drag-and-drop method, formatting and updating PivotTables, and some advanced techniques. The presentation aims to help users understand how to use PivotTables for interactive data exploration and custom reporting using Excel's powerful summarization features.
This document describes how to calculate descriptive statistics using SPSS. It discusses entering data into SPSS, calculating frequencies, means, medians, modes, standard deviations and other measures. It provides three methods for computing descriptive statistics in SPSS: frequencies analysis, descriptives analysis, and explore analysis. Finally, it demonstrates how to create graphs like histograms, bar charts and pie charts to represent the data visually. The overall purpose is to introduce the key concepts and applications of descriptive statistics using the SPSS software package.
This section discusses analyzing categorical data:
- It introduces categorical variables and how to construct frequency tables and graphs like bar graphs and pie charts to display categorical variable distributions.
- It explains how to construct and interpret two-way tables to analyze relationships between two categorical variables, and how to examine marginal and conditional distributions.
- It emphasizes organizing statistical problems using a four step approach of stating the question, planning an approach, doing calculations/graphs, and concluding.
This document provides an overview of using SPSS (Statistical Package for the Social Sciences) software. It introduces the main interfaces for working with data in SPSS, including the data view, variable view, output view, draft view, and syntax view. It also provides instructions for installing sample data files and demonstrates how to generate a basic cross-tabulation output of employment by gender using the automated features.
This document provides instructions for performing various statistical analyses and data management tasks in SPSS, including sorting data, selecting cases, splitting files, merging files, visual binning, frequencies analysis, descriptive statistics, cross tabulation and chi-square tests, independent samples t-tests, and one-way ANOVA. The document is authored by trainers from the Department of Applied Statistics at the University of Rwanda and dated December 6, 2014.
This document provides an introduction to Microsoft Excel by describing spreadsheets, workbooks, worksheets, and basic Excel functions. It discusses how to open and save workbooks, navigate and modify worksheets by inserting and deleting rows and columns, format cells and worksheets, print worksheets, and use basic formulas with relative and absolute cell references. The objectives are to get familiar with the Excel interface and basic functions to build the foundations for more advanced spreadsheet skills.
This document provides an overview of the Statistical Package for Social Sciences (SPSS) software. It describes the main components and windows in SPSS, including the data window, variable view window, output window, and chart editor window. It also outlines several statistical techniques that can be performed in SPSS, such as descriptive statistics, correlations, t-tests, and chi-square tests of independence. SPSS is a tool that allows users to manage and analyze data, as well as generate graphs and conduct a wide range of statistical procedures.
This document discusses how charts can be used to convey messages through visual representations of data. It describes different chart types like pie charts, column charts, and stacked column charts and explains how to create and modify charts using the Chart Wizard and toolbar tools in Excel. It also covers linking and embedding charts in other documents, as well as multitasking between applications using the taskbar. The key points are how to select the appropriate chart type to fit the data and convey the intended message, and how to create and modify charts to effectively communicate information visually.
SPSS is a statistical software package used for statistical analysis and data management. It was first released in 1968 and was later acquired by IBM in 2009. Some key uses of SPSS include statistical analysis for social sciences, market research, education, and health research. It allows users to perform various statistical analyses, manage data, and export outputs to other formats.
The document describes how to create and configure a basic pivot table in Excel. It explains that a pivot table allows you to sort and summarize data independently of the original layout. The steps include selecting a data range, choosing to create a pivot table, and using the pivot table field list to designate fields as report filters, column labels, or row labels. Configuring these fields allows the user to build a report to analyze relationships in the data.
This document provides an overview of descriptive and inferential statistical procedures for analyzing data, including summarizing data using descriptive statistics and graphs, assessing reliability, comparing groups using t-tests and ANOVA, and testing associations using non-parametric tests and regression. It also discusses analyzing customer satisfaction data through reliability analysis, chi-square tests, and comparing service across shops.
Interactivity on Excel Using Pivoting, Dashboards, “Index and Match,” and Glo... Shalin Hai-Jew
Data are not inert, but interpreting summary data from still data visualizations may be somewhat limited. Excel has various built-in enablements for interactive engagements with data, including pivoting, dashboarding, indexing and matching, and global mapping. This work shows some basic setups of data that can enable more interactivity with little effort.
The document discusses various techniques for handling data in Excel, including entering data manually or importing it, sorting and filtering data, using subtotals and pivot tables to summarize data, and formatting options. Key techniques covered include importing tab-delimited files, sorting data by clicking Data > Sort, filtering data using Data > Autofilter, creating pivot tables by selecting the data source and dragging field buttons, and formatting cells using conditional formats.
The document discusses key concepts related to data processing including data, variables, cases, information, the steps of data processing, elements of data processing such as coding and tabulation, common problems, and software used for processing such as SPSS, SAS, and Quantum. Data processing converts raw data into usable information through steps like coding, cleaning, validating, classifying, tabulating, and analyzing the data. Tables are an important output and must be clearly formatted and labeled.
Pivot tables are not created automatically. In Microsoft Excel, the user must first select the data for which the pivot table should be created. The pivot table option is available on the Insert tab. The user can insert the pivot table either in the existing sheet or in a new sheet. Copy the link given below and paste it in a new browser window to get more information on pivot tables: http://www.transtutors.com/homework-help/statistics/pivot-table.aspx
Statistical package for social science (SPSS) is a software package used for statistical analysis. It was created in 1968. SPSS is used for data collection, organization, and output, as well as performing statistical tests. When using SPSS, researchers must first create a code book to define their variables before entering data into SPSS. Data is entered into columns representing variables and rows representing individual cases. Researchers should review their data for issues like outliers, missing data, and normal distribution before conducting statistical analyses.
The document provides an overview of how to use pivot tables in Excel to efficiently summarize and analyze large datasets. It explains that pivot tables allow users to automatically sort and count data from thousands of rows and columns in seconds. The document then guides the reader through steps to set up their first pivot table using sample data, including arranging fields and values, formatting options, calculating new fields, conditional formatting, and creating pivot charts. The overall document serves as a tutorial to help users learn the key capabilities and benefits of using pivot tables in Excel.
Data processing involves converting raw data into meaningful information through activities like editing, coding, classifying, tabulating and diagramming data. It is concerned with preparing raw research data for analysis by organizing and managing data files. The goal is to check for errors or inconsistencies in the data and structure it in a way that allows for descriptive and inferential statistical analysis.
This document provides an overview of using Microsoft Excel to handle, graph, and analyze scientific data. It begins with basics of the Excel interface and entering data. It then demonstrates how to manipulate data through calculations, format cells, and use functions. The document shows how to create scatter plots and add regression lines to graphs. It also discusses interpolation, extrapolation, printing graphs, downloading internet data, and more advanced statistical analyses in Excel.
Statistical Package for Social Science (SPSS) sspink
This presentation includes an introduction to SPSS, its basic features, how to input data manually, descriptive statistics, and how to perform the t-test, ANOVA, and chi-square test.
SPSS is a popular statistical analysis software that is known for its ease of use. It has strong graphical capabilities and supports a variety of statistical analyses. However, it lacks some more advanced statistical procedures and has limited data management tools. While suitable for many tasks, some users may outgrow it over time and require more specialized software like SAS or Stata for complex or cutting-edge analyses. Overall, SPSS is best suited for users performing basic to intermediate statistical analysis and reporting.
Chapter 2 Graphical Descriptions of Data 25 Chapter 2.docxcravennichole326
Chapter 2: Graphical Descriptions of Data
In chapter 1, you were introduced to the concept of a population, which again is a
collection of all the measurements from the individuals of interest. Remember, in most
cases you can’t collect data from the entire population, so you have to take a sample. Thus, you
collect data either through a sample or a census. Now you have a large number of data
values. What can you do with them? No one likes to look at just a set of numbers. One
thing you can do is organize the data into a table or graph. Ultimately though, you want to be able
to use that graph to interpret the data, to describe the distribution of the data set, and to
explore different characteristics of the data. The characteristics that will be discussed in
this chapter and the next chapter are:
1. Center: middle of the data set, also known as the average.
2. Variation: how much the data varies.
3. Distribution: shape of the data (symmetric, uniform, or skewed).
4. Qualitative data: analysis of the data
5. Outliers: data values that are far from the majority of the data.
6. Time: changing characteristics of the data over time.
This chapter will focus mostly on using the graphs to understand aspects of the data, and
not as much on how to create the graphs. There is technology that will create most of the
graphs, though it is important for you to understand the basics of how to create them.
Section 2.1: Qualitative Data
Remember, qualitative data are words describing a characteristic of the individual. There
are several different graphs that are used for qualitative data. These graphs include bar
graphs, Pareto charts, and pie charts.
Pie charts and bar graphs are the most common ways of displaying qualitative data. A
spreadsheet program like Excel can make both of them. The first step for either graph is
to make a frequency or relative frequency table. A frequency table is a summary of
the data with counts of how often a data value (or category) occurs.
Example #2.1.1: Creating a Frequency Table
Suppose you have the following data for which type of car students at a college
drive.
Ford, Chevy, Honda, Toyota, Toyota, Nissan, Kia, Nissan, Chevy, Toyota,
Honda, Chevy, Toyota, Nissan, Ford, Toyota, Nissan, Mercedes, Chevy,
Ford, Nissan, Toyota, Nissan, Ford, Chevy, Toyota, Nissan, Honda,
Porsche, Hyundai, Chevy, Chevy, Honda, Toyota, Chevy, Ford, Nissan,
Toyota, Chevy, Honda, Chevy, Saturn, Toyota, Chevy, Chevy, Nissan,
Honda, Toyota, Toyota, Nissan
A listing of data is too hard to look at and analyze, so you need to summarize it.
First you need to decide the categories. In this case it is relatively easy; just use
the car type. However, several car types appear only once in the list. In
that case it is easier to make a category called Other for the types with low counts.
Now ...
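The tallying described in Example #2.1.1 can be sketched in a few lines of Python. This is only an illustration and not part of the original chapter: the variable names are made up, and the rule used here (any car type that appears only once goes into Other) is an assumption based on the description above.

```python
from collections import Counter

# Raw data from Example #2.1.1.
cars = """
Ford, Chevy, Honda, Toyota, Toyota, Nissan, Kia, Nissan, Chevy, Toyota,
Honda, Chevy, Toyota, Nissan, Ford, Toyota, Nissan, Mercedes, Chevy,
Ford, Nissan, Toyota, Nissan, Ford, Chevy, Toyota, Nissan, Honda,
Porsche, Hyundai, Chevy, Chevy, Honda, Toyota, Chevy, Ford, Nissan,
Toyota, Chevy, Honda, Chevy, Saturn, Toyota, Chevy, Chevy, Nissan,
Honda, Toyota, Toyota, Nissan
"""
responses = [name.strip() for name in cars.split(",") if name.strip()]

# Count how often each car type occurs.
raw_counts = Counter(responses)

# Assumption: car types that occur only once are folded into "Other".
frequency = {}
for car, count in raw_counts.items():
    key = car if count > 1 else "Other"
    frequency[key] = frequency.get(key, 0) + count

# Relative frequency = frequency / total number of data values.
total = sum(frequency.values())
relative_frequency = {car: count / total for car, count in frequency.items()}

print(f"{'Category':<10}{'Freq':>6}{'Rel. freq':>11}")
for car, count in frequency.items():
    print(f"{car:<10}{count:>6}{relative_frequency[car]:>11.2f}")
```

For the 50 data values above, the tally comes to Ford 5, Chevy 12, Honda 6, Toyota 12, Nissan 10, and Other 5. The relative frequencies sum to 1, which is a quick check that no data value was dropped.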
Week 2 Project - STAT 3001 Student Name Type your name here.docx cockekeshia
Week 2 Project - STAT 3001
Student Name: <Type your name here>
Date: <Enter the date on which you began working on this assignment.>
Instructions: To complete this project, you will need the following materials:
· STATDISK User Manual (found in the classroom in DocSharing)
· Access to the Internet to download the STATDISK program.
This assignment is worth a total of 60 points.
Part I. Histograms and Frequency Tables
Instructions
Answers
1. Open the file Diamonds using menu option Datasets and then Elementary Stats, 9th Edition. This file contains some information about diamonds. What are the names of the variables in this file?
2. Create a histogram for the depth of the diamonds using the Auto-fit option. Paste the chart here. Once your histogram displays, click Turn on Labels to get the height of the bars.
3. Using the information in the above histogram, complete this table. Be sure to include frequency, relative frequency, and cumulative frequency.
Depth       Frequency   Relative Frequency   Cumulative Frequency
57-58.9
59-60.9
61-62.9
63-64.9
a. Using the frequency table above, how many of the diamonds have a depth of 60.9 or less? How do you know?
b. Using the frequency table above, how many of the diamonds have a depth between 59 and 62.9? Show your work.
c. What percent of the diamonds have a depth of 61 or more?
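To check a completed table outside STATDISK, the three columns can be sketched in Python. The depth values below are hypothetical and only illustrate the method; the real values come from the Diamonds file. The class limits follow the table in this part, and the variable names are assumptions.

```python
# Hypothetical depth values; replace with the values from the Diamonds file.
depths = [57.5, 58.2, 59.1, 60.3, 60.8, 61.0, 61.7, 62.4, 62.9, 63.5]

# Lower class limits from the table; each class covers [lower, lower + 2).
lower_limits = [57, 59, 61, 63]
width = 2

# Frequency: how many values fall in each class.
frequency = [sum(1 for d in depths if lo <= d < lo + width) for lo in lower_limits]

# Relative frequency: class frequency divided by the total count.
n = sum(frequency)
relative = [f / n for f in frequency]

# Cumulative frequency: running total of the class frequencies.
cumulative = []
running = 0
for f in frequency:
    running += f
    cumulative.append(running)

print(f"{'Depth':<10}{'Freq':>6}{'Rel. freq':>11}{'Cum. freq':>11}")
for lo, f, r, c in zip(lower_limits, frequency, relative, cumulative):
    label = f"{lo}-{lo + width - 0.1:.1f}"
    print(f"{label:<10}{f:>6}{r:>11.2f}{c:>11}")
```

The cumulative frequency in a row is the number of values at or below that row's upper class limit, and the last cumulative frequency should equal the total number of values.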
Part II. Comparing Datasets
Instructions
Answers
1. Create a boxplot that compares the color and clarity of the diamonds. Paste it here.
2. Describe the similarities and differences in the data sets. Please be specific to the graph created.
Part III. Finding Descriptive Numbers
Instructions
Answers
3. Open the file named Stowaway (using Datasets and then Elementary Stats, 9th Edition). This gives information on the number of stowaways going west vs east. List all the variables in the dataset.
4. Find the mean, median, and midrange for the data in Column 1.
5. Find the range, variance, and standard deviation for the data in Column 1.
6. List any values for the first column that you think may be outliers. Why do you think that?
[Hint: You may want to sort the data and look at the smallest and largest values.]
7. Find the mean, median, and midrange for the data in Column 2.
8. Find the range, variance, and standard deviation for the data in Column 2.
9. List any values for the second column that you think may be outliers. Why do you think that?
10. Find the five-number summary for the stowaways data in Columns 1 and 2. You will need to label each of the columns with an appropriate measure in the top row for clarity.
11. Compare the number of stowaways going west and east using a boxplot of Columns 1 and 2. Paste your boxplot here.
12. Create a histogram for the Column 1 data and paste it here.
13. Create a histogram for the Column 2 data and paste it here.
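The descriptive numbers asked for in this part can be cross-checked with Python's standard `statistics` module. The column below is hypothetical and only illustrates the calculations; the real values come from the Stowaway file. Note that quartile conventions differ between programs, so the first and third quartiles from this sketch may not match STATDISK exactly.

```python
import statistics

# Hypothetical counts; replace with a column from the Stowaway file.
column = [1, 3, 4, 4, 6, 7, 9, 12, 30]

mean = statistics.mean(column)
median = statistics.median(column)
midrange = (min(column) + max(column)) / 2   # halfway between smallest and largest
data_range = max(column) - min(column)
variance = statistics.variance(column)       # sample variance (divides by n - 1)
std_dev = statistics.stdev(column)           # sample standard deviation

# Five-number summary: minimum, Q1, median, Q3, maximum.
# statistics.quantiles uses the "exclusive" method by default.
q1, q2, q3 = statistics.quantiles(column, n=4)
five_number = (min(column), q1, q2, q3, max(column))

print("mean:", round(mean, 2), "median:", median, "midrange:", midrange)
print("range:", data_range, "variance:", round(variance, 2),
      "std dev:", round(std_dev, 2))
print("five-number summary:", five_number)
```

Sorting the column first, as the hint suggests, also makes the smallest and largest values easy to compare against the quartiles when looking for possible outliers.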
Part IV. Interpreting Statistical Information
The Stowaway data contains two columns, both of which are mea.
This section discusses analyzing categorical data:
- It introduces categorical variables and how to construct frequency tables and graphs like bar graphs and pie charts to display categorical variable distributions.
- It explains how to construct and interpret two-way tables to analyze relationships between two categorical variables, and how to examine marginal and conditional distributions.
- It emphasizes organizing statistical problems using a four step approach of stating the question, planning an approach, doing calculations/graphs, and concluding.
This document provides an overview of using SPSS (Statistical Package for the Social Sciences) software. It introduces the main interfaces for working with data in SPSS, including the data view, variable view, output view, draft view, and syntax view. It also provides instructions for installing sample data files and demonstrates how to generate a basic cross-tabulation output of employment by gender using the automated features.
This document provides instructions for performing various statistical analyses and data management tasks in SPSS, including sorting data, selecting cases, splitting files, merging files, visual binning, frequencies analysis, descriptive statistics, cross tabulation and chi-square tests, independent samples t-tests, and one-way ANOVA. The document is authored by trainers from the Department of Applied Statistics at the University of Rwanda and dated December 6, 2014.
This document provides an introduction to Microsoft Excel by describing spreadsheets, workbooks, worksheets, and basic Excel functions. It discusses how to open and save workbooks, navigate and modify worksheets by inserting and deleting rows and columns, format cells and worksheets, print worksheets, and use basic formulas with relative and absolute cell references. The objectives are to get familiar with the Excel interface and basic functions to build the foundations for more advanced spreadsheet skills.
This document provides an overview of the Statistical Package for Social Sciences (SPSS) software. It describes the main components and windows in SPSS, including the data window, variable view window, output window, and chart editor window. It also outlines several statistical techniques that can be performed in SPSS, such as descriptive statistics, correlations, t-tests, and chi-square tests of independence. SPSS is a tool that allows users to manage and analyze data, as well as generate graphs and conduct a wide range of statistical procedures.
This document discusses how charts can be used to convey messages through visual representations of data. It describes different chart types like pie charts, column charts, and stacked column charts and explains how to create and modify charts using the Chart Wizard and toolbar tools in Excel. It also covers linking and embedding charts in other documents, as well as multitasking between applications using the taskbar. The key points are how to select the appropriate chart type to fit the data and convey the intended message, and how to create and modify charts to effectively communicate information visually.
SPSS is a statistical software package used for statistical analysis and data management. It was first released in 1968 and was later acquired by IBM in 2009. Some key uses of SPSS include statistical analysis for social sciences, market research, education, and health research. It allows users to perform various statistical analyses, manage data, and export outputs to other formats.
The document describes how to create and configure a basic pivot table in Excel. It explains that a pivot table allows you to sort and summarize data independently of the original layout. The steps include selecting a data range, choosing to create a pivot table, and using the pivot table field list to designate fields as report filters, column labels, or row labels. Configuring these fields allows the user to build a report to analyze relationships in the data.
This document provides an overview of descriptive and inferential statistical procedures for analyzing data, including summarizing data using descriptive statistics and graphs, assessing reliability, comparing groups using t-tests and ANOVA, and testing associations using non-parametric tests and regression. It also discusses analyzing customer satisfaction data through reliability analysis, chi-square tests, and comparing service across shops.
Interactivity on Excel Using Pivoting, Dashboards, “Index and Match,” and Glo...Shalin Hai-Jew
Data are not inert, but interpreting summary data from still data visualizations may be somewhat limited. Excel has various built-in enablements for interactive engagements with data, including pivoting, dashboarding, indexing and matching, and global mapping. This work shows some basic setups of data that can enable more interactivity with little effort.
The document discusses various techniques for handling data in Excel, including entering data manually or importing it, sorting and filtering data, using subtotals and pivot tables to summarize data, and formatting options. Key techniques covered include importing tab-delimited files, sorting data by clicking Data > Sort, filtering data using Data > Autofilter, creating pivot tables by selecting the data source and dragging field buttons, and formatting cells using conditional formats.
The document discusses key concepts related to data processing including data, variables, cases, information, the steps of data processing, elements of data processing such as coding and tabulation, common problems, and software used for processing such as SPSS, SAS, and Quantum. Data processing converts raw data into usable information through steps like coding, cleaning, validating, classifying, tabulating, and analyzing the data. Tables are an important output and must be clearly formatted and labeled.
The pivot tables are not created mechanically. In Microsoft excel the user should select the data first for which the pivot table should be created. The pivot table option is available on the insert tab. The user has the option of inserting the pivot table either in the existing sheet or creating the pivot table in the new sheet. Copy the link given below and paste it in new browser window to get more information on Pivot Table:- http://www.transtutors.com/homework-help/statistics/pivot-table.aspx
Statistical package for social science (SPSS) is a software package used for statistical analysis. It was created in 1968. SPSS is used for data collection, organization, and output, as well as performing statistical tests. When using SPSS, researchers must first create a code book to define their variables before entering data into SPSS. Data is entered into columns representing variables and rows representing individual cases. Researchers should review their data for issues like outliers, missing data, and normal distribution before conducting statistical analyses.
The document provides an overview of how to use pivot tables in Excel to efficiently summarize and analyze large datasets. It explains that pivot tables allow users to automatically sort and count data from thousands of rows and columns in seconds. The document then guides the reader through steps to set up their first pivot table using sample data, including arranging fields and values, formatting options, calculating new fields, conditional formatting, and creating pivot charts. The overall document serves as a tutorial to help users learn the key capabilities and benefits of using pivot tables in Excel.
Data processing involves converting raw data into meaningful information through activities like editing, coding, classifying, tabulating and diagramming data. It is concerned with preparing raw research data for analysis by organizing and managing data files. The goal is to check for errors or inconsistencies in the data and structure it in a way that allows for descriptive and inferential statistical analysis.
This document provides an overview of using Microsoft Excel to handle, graph, and analyze scientific data. It begins with basics of the Excel interface and entering data. It then demonstrates how to manipulate data through calculations, format cells, and use functions. The document shows how to create scatter plots and add regression lines to graphs. It also discusses interpolation, extrapolation, printing graphs, downloading internet data, and more advanced statistical analyses in Excel.
Statistical Package for Social Science (SPSS)sspink
This presentation includes an introduction to SPSS and its basic features, how to input data manually, descriptive statistics, and how to perform t-tests, ANOVA, and chi-square tests.
SPSS is a popular statistical analysis software that is known for its ease of use. It has strong graphical capabilities and supports a variety of statistical analyses. However, it lacks some more advanced statistical procedures and has limited data management tools. While suitable for many tasks, some users may outgrow it over time and require more specialized software like SAS or Stata for complex or cutting-edge analyses. Overall, SPSS is best suited for users performing basic to intermediate statistical analysis and reporting.
Chapter 2 Graphical Descriptions of Data 25 Chapter 2.docxcravennichole326
Chapter 2: Graphical Descriptions of Data
In Chapter 1, you were introduced to the concept of a population, which again is a
collection of all the measurements from the individuals of interest. Remember, in most
cases you can’t collect the entire population, so you have to take a sample. Thus, you
collect data either through a sample or a census. Now you have a large number of data
values. What can you do with them? No one likes to look at just a set of numbers. One
thing is to organize the data into a table or graph. Ultimately though, you want to be able
to use that graph to interpret the data, to describe the distribution of the data set, and to
explore different characteristics of the data. The characteristics that will be discussed in
this chapter and the next chapter are:
1. Center: middle of the data set, also known as the average.
2. Variation: how much the data varies.
3. Distribution: shape of the data (symmetric, uniform, or skewed).
4. Qualitative data: analysis of the data
5. Outliers: data values that are far from the majority of the data.
6. Time: changing characteristics of the data over time.
This chapter will focus mostly on using the graphs to understand aspects of the data, and
not as much on how to create the graphs. There is technology that will create most of the
graphs, though it is important for you to understand the basics of how to create them.
Section 2.1: Qualitative Data
Remember, qualitative data are words describing a characteristic of the individual. There
are several different graphs that are used for qualitative data. These graphs include bar
graphs, Pareto charts, and pie charts.
Pie charts and bar graphs are the most common ways of displaying qualitative data. A
spreadsheet program like Excel can make both of them. The first step for either graph is
to make a frequency or relative frequency table. A frequency table is a summary of
the data with counts of how often a data value (or category) occurs.
Example #2.1.1: Creating a Frequency Table
Suppose you have the following data for which type of car students at a college
drive:
Ford, Chevy, Honda, Toyota, Toyota, Nissan, Kia, Nissan, Chevy, Toyota,
Honda, Chevy, Toyota, Nissan, Ford, Toyota, Nissan, Mercedes, Chevy,
Ford, Nissan, Toyota, Nissan, Ford, Chevy, Toyota, Nissan, Honda,
Porsche, Hyundai, Chevy, Chevy, Honda, Toyota, Chevy, Ford, Nissan,
Toyota, Chevy, Honda, Chevy, Saturn, Toyota, Chevy, Chevy, Nissan,
Honda, Toyota, Toyota, Nissan
A listing of data is too hard to look at and analyze, so you need to summarize it.
First you need to decide the categories. In this case it is relatively easy; just use
the car type. However, several car types appear only once in the list. In
that case it is easier to make a category called Other for the types with low counts.
Now ...
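The tallying step described above can be sketched in Python. This is a minimal illustration, not the textbook's own procedure: the rule of folding any car type that appears only once into Other is an assumption based on the text's suggestion, and the variable names are hypothetical.

```python
from collections import Counter

# Raw data from Example 2.1.1 (50 responses).
cars = """
Ford, Chevy, Honda, Toyota, Toyota, Nissan, Kia, Nissan, Chevy, Toyota,
Honda, Chevy, Toyota, Nissan, Ford, Toyota, Nissan, Mercedes, Chevy,
Ford, Nissan, Toyota, Nissan, Ford, Chevy, Toyota, Nissan, Honda,
Porsche, Hyundai, Chevy, Chevy, Honda, Toyota, Chevy, Ford, Nissan,
Toyota, Chevy, Honda, Chevy, Saturn, Toyota, Chevy, Chevy, Nissan,
Honda, Toyota, Toyota, Nissan
"""
values = [v.strip() for v in cars.replace("\n", " ").split(",") if v.strip()]

raw_counts = Counter(values)

# Fold car types that appear only once into "Other" (assumed cutoff).
freq = Counter()
for make, n in raw_counts.items():
    freq[make if n > 1 else "Other"] += n

total = sum(freq.values())
print(f"{'Category':<10}{'Freq':>6}{'Rel. freq':>11}")
for make, n in freq.most_common():
    print(f"{make:<10}{n:>6}{n / total:>11.2f}")
```

The relative frequency column is each count divided by the total number of responses, which is the quantity a pie chart's wedges are proportional to.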
Week 2 Project - STAT 3001Student Name Type your name here.docxcockekeshia
Week 2 Project - STAT 3001
Student Name: <Type your name here>
Date: <Enter the date on which you began working on this assignment.>
Instructions: To complete this project, you will need the following materials:
· STATDISK User Manual (found in the classroom in DocSharing)
· Access to the Internet to download the STATDISK program.
This assignment is worth a total of 60 points.
Part I. Histograms and Frequency Tables
Instructions
Answers
1. Open the file Diamonds using menu option Datasets and then Elementary Stats, 9th Edition. This file contains some information about diamonds. What are the names of the variables in this file?
2. Create a histogram for the depth of the diamonds using the Auto-fit option. Paste the chart here. Once your histogram displays, click Turn on Labels to get the height of the bars.
3. Using the information in the above histogram, complete this table. Be sure to include frequency, relative frequency, and cumulative frequency.
Depth | Frequency | Relative Frequency | Cumulative Frequency
57-58.9 | | |
59-60.9 | | |
61-62.9 | | |
63-64.9 | | |
a. Using the frequency table above, how many of the diamonds have a depth of 60.9 or less? How do you know?
b. Using the frequency table above, how many of the diamonds have a depth between 59 and 62.9? Show your work.
c. What percent of the diamonds have a depth of 61 or more?
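As a rough sketch of how the relative and cumulative frequency columns relate to the frequency column, the following Python snippet fills in such a table. The counts are hypothetical placeholders, not values from the Diamonds file, so the numbers do not answer the questions above.

```python
from itertools import accumulate

classes = ["57-58.9", "59-60.9", "61-62.9", "63-64.9"]
counts = [3, 10, 20, 7]  # hypothetical counts, NOT the Diamonds data

total = sum(counts)
relative = [c / total for c in counts]      # frequency / total
cumulative = list(accumulate(counts))       # running total of frequencies

for label, c, r, cum in zip(classes, counts, relative, cumulative):
    print(f"{label:<9}{c:>5}{r:>8.3f}{cum:>5}")

# Share of values at 61 or more: add the relative frequencies of those classes.
share_61_plus = sum(relative[2:])
```

The cumulative frequency for a class answers "how many values are at or below this class's upper limit", and a count between two limits is the sum of the frequencies of the classes in that range.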
Part II. Comparing Datasets
Instructions
Answers
1. Create a boxplot that compares the color and clarity of the diamonds. Paste it here.
2. Describe the similarities and differences in the data sets. Please be specific to the graph created.
Part III. Finding Descriptive Numbers
Instructions
Answers
3. Open the file named Stowaway (using Datasets and then Elementary Stats, 9th Edition). This gives information on the number of stowaways going west vs. east. List all the variables in the dataset.
4. Find the Mean, median, and midrange for the Data in Column 1.
5. Find the Range, variance, and standard deviation for the first column.
6. List any values for the first column that you think may be outliers. Why do you think that?
[Hint: You may want to sort the data and look at the smallest and largest values.]
7. Find the Mean, median, and midrange for the data in Column 2.
8. Find the Range, variance, and standard deviation for the data in Column 2.
9. List any values for the second column that you think may be outliers. Why do you think that?
10. Find the five-number summary for the stowaways data in Columns 1 and 2. You will need to label each of the columns with an appropriate measure in the top row for clarity.
11. Compare number of stowaways going west and east using a boxplot of Columns 1 and 2. Paste your boxplot here
12. Create a histogram for the Column 1 data and paste it here.
13. Create a histogram for the Column 2 data and paste it here.
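The descriptive measures named in Part III can be illustrated with Python's standard library. This is a sketch on made-up values, not the Stowaway data, and it is not how STATDISK computes them. The quartile method ("inclusive") is an assumption; statistical packages use different quartile conventions, so Q1 and Q3 may differ slightly from STATDISK output.

```python
import statistics as st

data = [2, 4, 4, 5, 7, 9, 11]  # hypothetical values, not the Stowaway data

mean = st.mean(data)
median = st.median(data)
midrange = (min(data) + max(data)) / 2   # halfway between smallest and largest
data_range = max(data) - min(data)
variance = st.variance(data)             # sample variance (n - 1 denominator)
std_dev = st.stdev(data)                 # square root of the sample variance

# Quartiles for the five-number summary (assumed "inclusive" convention).
q1, _, q3 = st.quantiles(data, n=4, method="inclusive")
five_number = (min(data), q1, median, q3, max(data))

print(mean, median, midrange, data_range, variance, std_dev, five_number)
```

Sorting the data and comparing the smallest and largest values against Q1 and Q3 is one common way to flag possible outliers.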
Part IV. Interpreting Statistical Information
The Stowaway data contains two columns, both of which are mea.
The document provides an overview of a training on using SPSS. It is divided into three parts:
1) Introduction to SPSS, including background, objectives, and definitions.
2) Dealing with SPSS, covering getting started, key terms, creating a code book, and data entry.
3) Data management and analysis using SPSS, including exploratory, descriptive, and inferential analysis.
The training invites participants to properly learn how to use SPSS and makes time for questions.
Homework Assignment 9 Edited on 10272014 Due by Wednes.docxadampcarr67227
Homework Assignment 9
Edited on 10/27/2014
Due by Wednesday 11/05/2014
Part A. Exercise 8.2 Solve the Systems.
Part B. 1. Classify the critical point (0, 0) for the questions in Part A. (Similar questions can be
found on Section 10.2)
2. Provide one example of a 2x2 matrix that has only one eigenvalue but two linearly
independent eigenvectors.
Part C. Find all the critical points, determine their types, and classify their stability.
Exercise 10.3
BA 301 – Research & Analysis of Business Problems
Homework Assignment #3 – Fun With Excel
Format/Requirements
• Hand this in at the beginning of class, not by email.
• Use Times New Roman, 12-point font and the format provided in the Homework template.
• Make sure you show your name, course number, and section number on the page.
• Number the answers, and place them in the correct order.
Overview
Excel is a terrific tool for data and statistical analysis. This assignment involves working with a set of data containing information about different charity donors, which might be used to manage fundraising direct mail or promotional campaigns. You will learn a few simple tricks for analyzing this data such that you can extract some useful information and answer some questions. These instructions are written for Excel 2013. Excel 2010 should be quite similar. This exercise requires that you have the Data Analysis package installed for Excel. If you don’t, you may need your original Microsoft discs. Let me know if you need help with this installation. I suggest that you don’t wait until the last minute to complete this assignment.
Download the file “Fun With Excel Raw Data” found on the D2L course website. Open the file with Microsoft Excel and follow the instructions found with each of the following questions. Copy or take screenshots of the results/data, and ensure that you separately provide specific answers to the questions. Save all work in a separate Word file. Save as your_name_BA301-008.docx (*please put your actual name in the space that says “your_name”). Email me your Word file by the beginning of class. DO NOT print out all of the regression data for Question 1, only the basic r-squared and Sig of F information, and the X-Y graph. Be aware, the Mac version of Excel does not allow you to do Pivot Charts, only Pivot tables. So, you will need to use a Windows PC for Question 4.
Question 1: Among large donors (greater than or equal to $50,000), does the amount of giving tend to increase as the years of involvement with the organization increase (i.e., is there a correlation between giving and years)? What number do you look at to determine this correlation?
Features: Data Sort, Regression
Instructions: Sort the data by amount of giving in ascending order by clicking on any cell in the table and selecting Data, Sort, select column E for Giving by choosing that in the Sort By drop-down menu, and sort in Smallest to .
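For readers who want to see what the correlation in Question 1 measures, here is a minimal Python sketch of the Pearson correlation coefficient r and its square. The donor figures are invented for illustration and the function name is hypothetical; the assignment itself uses Excel's Regression tool, whose output is not reproduced here.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

# Hypothetical data: years of involvement and giving (thousands of dollars).
years = [1, 2, 3, 4, 5]
giving = [50, 55, 65, 70, 90]

r = pearson_r(years, giving)
r_squared = r ** 2  # share of variation in giving explained by a linear fit
print(round(r, 3), round(r_squared, 3))
```

A value of r near +1 indicates that giving tends to rise with years of involvement, a value near -1 indicates it tends to fall, and a value near 0 indicates little linear relationship.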
The document provides an overview of the key components and terminology used in Tableau, including workbooks, sheets, dashboards, stories, containers, dimensions, measures, filters, parameters, groups, sets, hierarchies, actions, and shortcuts. It defines each component and provides examples. The summary also includes links to relevant tutorial videos to further explain concepts like building dashboards, using filters and parameters, creating groups and sets, and more.
Tableau is a business intelligence tool that allows users to create customizable data visualizations and dashboards with no coding required. It integrates with many data sources and can handle large datasets quickly. The main components in Tableau include worksheets, dashboards, and stories. Worksheets contain single views, dashboards consolidate multiple views, and stories describe data narratives through multiple dashboards. Tableau provides various visualizations like bar charts, line charts, maps, and more. Users can customize visualizations by filtering data, formatting colors and fonts, and aggregating measures.
**Tableau Basics Cheat Sheet**
*Introduction:*
Tableau is a powerful data visualization and business intelligence tool that allows users to easily analyze and present data in a visually compelling way. This Tableau Basics Cheat Sheet serves as a quick reference guide to help beginners get started with Tableau and perform common tasks efficiently.
*1. Data Connection:*
- Connect to data sources like Excel, CSV, databases, etc.
- Drag and drop data fields onto the "Rows" and "Columns" shelves to create a basic view.
*2. Visualizations:*
- Create various visualizations like bar charts, line graphs, scatter plots, maps, etc.
- Double-click on a field to create a basic visualization or drag fields onto the "Marks" shelf to customize.
*3. Filters:*
- Apply filters to focus on specific data subsets.
- Right-click on a field or drag it onto the "Filters" shelf to create a filter.
*4. Sorting:*
- Sort data based on specific fields.
- Right-click on a field or use the sort icon in the toolbar to sort data.
*5. Calculated Fields:*
- Create custom calculations using existing fields.
- Go to "Analysis" > "Create Calculated Field" to define formulas.
*6. Groups & Hierarchies:*
- Group data by combining individual data points.
- Create hierarchies to organize data into drill-down levels.
*7. Aggregation:*
- Aggregate data using functions like SUM, AVG, COUNT, etc.
- Drag fields to the "Measure Values" shelf to apply aggregation.
*8. Dual-Axis Charts:*
- Combine two different chart types on the same axes to compare data.
- Right-click on an axis and select "Dual-Axis" to enable this feature.
*9. Dashboard Creation:*
- Combine multiple visualizations into a single dashboard.
- Drag sheets onto the dashboard canvas and arrange as desired.
*10. Parameters:*
- Use parameters to create interactive dashboards.
- Right-click in the data pane and select "Create Parameter" to get started.
*11. Data Blending:*
- Blend data from multiple sources for comprehensive analysis.
- Go to "Data" > "Edit Relationships" to define data blending.
*12. Publishing:*
- Share your Tableau workbooks with others by publishing to Tableau Server or Tableau Public.
- Go to "Server" > "Publish Workbook" to share your insights.
*13. Formatting:*
- Customize the appearance of visualizations and dashboards.
- Use the "Format" options and "Dashboard" menu to fine-tune the design.
*14. Interactivity:*
- Add filters, actions, and tooltips to make visualizations interactive.
- Explore the "Dashboard" menu for interaction options.
*15. Exporting:*
- Export visualizations as images, PDFs, or data files.
- Use the "File" menu to access export options.
Remember, this Tableau Basics Cheat Sheet provides just a glimpse of Tableau's capabilities. As you gain more experience, you can explore advanced features and create more sophisticated data visualizations and insights to drive better decision-making. Happy data analyzing with Tableau!
Since the instructions for the final project are standardized and .docxedgar6wallace88877
Since the instructions for the final project are standardized and provided by the department, I thought you might appreciate some pointers and key areas of focus to help you navigate this project! Use this in conjunction with your syllabus instructions, which contain detailed content instructions.
Read the syllabus instructions VERY carefully and pay attention to the requirements embedded in the sentences. In fact, I would construct each heading and subheading (YES, use APA formatted subheadings) according to the required areas listed in your instructions. Here are some formatting directions for subheadings and a rough example for organization of your project with subheadings.
APA Headings Level Formatting Guidelines:
1 Centered, Boldface, Uppercase and Lowercase
2 Left-aligned, Boldface, Uppercase and Lowercase Heading
3 Indented, boldface, lowercase heading with a period. Begin body text after the period.
4 Indented, boldface, italicized, lowercase heading with a period. Begin body text after the period.
5 Indented, italicized, lowercase heading with a period. Begin body text after the period.
Example:
1st page is the TITLE PAGE with Running head- refer to APA guidelines
See how I used proper capitalization for my running head?
Running head: GENDER TRAINING IN THE WORKPLACE
2nd page and remaining pages... The body of your project training program and report. Look up APA formatting: double space entire body, font 12-Times New Roman, 1” margins all sides. Make sure you use proper APA citations in the text (McCarty, 2016) and that those resources are listed on the reference page, including the journal and website citations that you chose.
Introduction and Identification of Problems (1st level)
Participant Name and Problem #1
(You will do this 12 times) (2nd level subheading, left justified, paragraph begins
next line after heading-double space. I will not double space the rest of this example to save space, but don’t forget to do it! Make sure you check your settings to be a true double space-nothing less and nothing more)
Training Program: Session One (8 of these)
Session One Title (2nd level)
Gender and Hostility in the Workplace
Objectives (2nd level)
The goal of this course is to
· define gender,
· define hostility,
· identify areas of hostility…(You can use bullets and or level three subheadings to list/organize).
Problem (2nd level)
State the participant's problem(s) you will address with this session
Journal articles and websites
Journal. List one or more peer-reviewed (ACADEMIC) article(s) relevant to the issue/problem using an APA formatted reference.
Website. List one website that is relevant to the issue/problem and put into APA formatted reference.
Activity
Create and describe an activity that will promote discussion and understanding among the participants.
Activity breakdown. (3rd level subheading). Start text after period. You might want to use this 3rd level subheadin.
Introduction to Business analytics unit3jayarellirs
This document discusses various methods for visualizing and summarizing data. It describes different types of charts like column charts, line charts, pie charts, and scatter plots that can be used to visualize quantitative data. It also discusses tools in Excel for filtering, sorting, and summarizing data in tables and how techniques like Pareto analysis can help identify key factors.
Scott Harvey, Registrar at Tri-County Technical College, presented on using Excel PivotTables to analyze student data. The presentation introduced PivotTables and how they can be used to summarize large amounts of student data from an Excel worksheet into concise reports. It covered preparing the source data, creating a PivotTable, adding filters, showing details, creating PivotCharts, and useful tips. The presentation concluded with a demonstration of PivotTables and how they can help answer questions about student enrollment numbers, majors, withdrawals and more from a data set in just a half hour.
Social Network Analysis: UCINET Quick Start Guide
This guide provides a quick introduction to UCINET. It assumes that the software has been installed with the data in the folder C:\Program Files\Analytic Technologies\Ucinet 6\DataFiles and this has been left as the default directory.
Source : https://sites.google.com/site/ucinetsoftware/home
Collect 50 or more paired quantitative data items. You may use a met.pdfivylinvaydak64229
Collect 50 or more paired quantitative data items. You may use a method similar to the Module 1
discussion to collect and enter data into StatCrunch. You will enter the explanatory variable (x-
value) in column var1. Then, enter the response variable (y-value) in column var2.
a.) Using StatCrunch, compute the sample linear correlation coefficient, R. The Technology
Step-by-Step box at the end of Section 4.1 (page 194) explains how to do so. Do not forget the
video explanation in the Module Notes, if you need it.
b.) Using StatCrunch, find the least-squares regression line equation and plot the scatter diagram,
along with the line. Page 207 (Technology Step-by-Step box) explains how to determine such a
linear equation using StatCrunch. Please note: In order to plot the scatter diagram along with the
line, before clicking Calculate in step 3 of page 207, scroll down to Graphs and make sure Fitted
line plot is selected. Then click Calculate. Then click the right-arrow at the very bottom right
hand side of the results page for the scatter diagram and regression line plot. For an example of
the steps taken and what to expect, click here.
c.) Paste your scatter diagram (with the regression line drawn) and StatCrunch results in the
discussion (by clicking on Options and then Copy. Use Ctrl V to paste it into the discussions).
Try not to use the same data set that another student in the class has used, so your results will be
unique. Make sure your data set is large enough (50 items).
d.) Then, answer the following two questions:
What type of correlation do you observe between the two variables? For ideas, see Figure 4 on
page 181 (Section 4.1).
Would you recommend using this linear model to make predictions about the y-value for a given
x-value? Why or why not?
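The least-squares regression line that StatCrunch reports in part b can be sketched by hand in Python. This is an illustration under assumptions: the function name and the small data set are hypothetical, with var1 as the explanatory variable and var2 as the response, as in the instructions above.

```python
def least_squares(xs, ys):
    """Return (slope, intercept) of the least-squares line y = slope * x + intercept."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x  # the line passes through (mean_x, mean_y)
    return slope, intercept

# Hypothetical paired data (a real submission needs 50 or more pairs).
var1 = [0, 1, 2, 3]
var2 = [1, 3, 5, 7]

slope, intercept = least_squares(var1, var2)
predicted = slope * 2.5 + intercept  # prediction for x = 2.5
print(slope, intercept, predicted)
```

Whether such predictions are trustworthy depends on how strong the linear correlation is and on whether the x-value lies inside the range of the observed data.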
Solution
Technology Step-by-Step Using StatCrunch:
------------------------------------------------------
Section 1.3 Simple Random Sampling
...........................................................
1. Select Data, highlight Simulate Data, then highlight
Discrete Uniform.
2. Fill in the following window with the appropriate
values. To obtain a simple random sample for the
situation in Example 2, we would enter the values
shown in the figure. The reason we generate 10 rows
of data (instead of 5) is in case any of the random
numbers repeat. Select Simulate, and the random
numbers will appear in the spreadsheet. Note: You
could also select the single dynamic seed radio
button, if you like, to set the seed.
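The idea in step 2, drawing more random numbers than needed and discarding repeats, can be sketched in Python. The population size of 30 is an assumption (Example 2 is not shown here), and the function name is hypothetical.

```python
import random

def simple_random_sample(population_size, sample_size, draws, seed=None):
    """Draw `draws` discrete-uniform numbers on 1..population_size and keep
    the first `sample_size` distinct ones."""
    rng = random.Random(seed)  # a fixed seed makes the draw reproducible
    picked = []
    for _ in range(draws):
        k = rng.randint(1, population_size)
        if k not in picked:          # skip repeated numbers
            picked.append(k)
        if len(picked) == sample_size:
            break
    return picked  # may be shorter than sample_size if too many repeats occur

sample = simple_random_sample(population_size=30, sample_size=5, draws=10, seed=1)
print(sample)
```

In Python, `random.sample(range(1, 31), 5)` samples without replacement directly, so the extra draws are only needed when mimicking the StatCrunch approach.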
Section 2.1 Drawing Bar Graphs and Pie Charts
Frequency or Relative Frequency Distributions from Raw Data
.....................................................................................................
1. Enter the raw data into the spreadsheet. Name the column variable.
2. Select Stat, highlight Tables, and select Frequency.
3. Click on the variable you wish to summarize and click Calculate.
Bar Graphs from Summarized Data
...............................................................
The three rules of data analysis all say to make a picture of the data, because pictures can reveal non-obvious patterns and features and are the best way to communicate findings to others. Frequency and relative frequency tables organize data into categories and show counts or percentages of cases in each category to describe the distribution of a categorical variable. Bar charts also display categorical variable distributions and adhere to the area principle by ensuring the area of each bar corresponds to the data value, making them a better choice than a ship display used for Titanic passenger data.
Excel provides tools for graphing, analyzing, and formatting data. Key capabilities include:
1) Creating scatter plots and adding trendlines to show regression. The layout ribbon customizes graphs by adding titles, labels, and legends.
2) Performing calculations using functions and applying formulas down columns. Formatting options include number of decimals and scientific notation.
3) Adding error bars to express uncertainty in data on graphs. Both horizontal and vertical error bars can be customized.
Minitab is a general-purpose statistical analysis software package developed in 1972 at Pennsylvania State University. It began as a lighter version of the NIST statistical program OMNITAB. Minitab provides tools for data management, statistical analysis, and graphing in a simple interface. Key functions include worksheet editing, data manipulation, statistical tests, graphs, and help/tutorials. Descriptive statistics in Minitab include measures of central tendency like the mean, median, and mode and measures of variability such as standard deviation and range.
This document discusses how to create and manipulate pivot table reports in Excel. Pivot tables allow users to analyze and manipulate numerical data in spreadsheets to answer questions. The document provides step-by-step instructions for creating a basic pivot table, adding filters, and moving or "pivoting" fields to view the data in different ways. It also describes how to create a pivot chart based on the data in a pivot table report.
De vry math 399 ilabs & discussions latest 2016lenasour
This document provides information and discussion questions for several weeks of a statistics course (MATH 399) at DeVry University. It includes discussion questions and assignments related to topics like descriptive statistics, regression, probability, confidence intervals, and hypothesis testing. For each week, it provides the discussion question, any relevant instructions, and sometimes a short summary of the statistical concept being covered. It also includes information about completing iLabs (interactive labs) and assignments in Excel to reinforce these statistical topics.
De vry math 399 ilabs & discussions latest 2016 novemberlenasour
This document provides materials and instructions for several weekly discussions and iLabs for a DeVry University MATH 399 course. It includes discussion prompts and questions for weeks 1 through 7 on topics such as descriptive statistics, regression, probability, confidence intervals, and hypothesis testing. It also provides instructions and questions for iLabs on related statistical concepts involving Excel, probability distributions, descriptive statistics, and confidence intervals. Students are asked to perform calculations, create graphs and charts, interpret results, and answer questions demonstrating their understanding of the statistical content.
1. The document discusses various methods for summarizing categorical and quantitative data through tables and graphs, including frequency distributions, relative frequency distributions, bar charts, pie charts, dot plots, histograms, and ogives.
2. An example using data on customer ratings from a hotel illustrates frequency distributions and pie charts.
3. Another example using costs of auto parts demonstrates frequency distributions, histograms, and ogives.
This chapter introduces the chi-square test, which can be used to determine if there is a relationship between two categorical variables or if sample data comes from a population with a specific distribution. The chi-square test for independence compares observed and expected frequencies in a contingency table to test if two variables are independent. The test statistic is calculated as the sum of the squared differences between observed and expected counts divided by the expected counts. A small p-value provides evidence to reject the null hypothesis of independence between the variables.
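The test statistic described above, the sum of (observed - expected)^2 / expected over all cells, can be sketched in Python as follows. The contingency table is hypothetical, and the p-value step is omitted because it needs a chi-square distribution function that the standard library does not provide.

```python
def chi_square_statistic(observed):
    """Chi-square statistic for independence in a table of observed counts
    (a list of rows)."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    grand_total = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(observed):
        for j, obs in enumerate(row):
            # Expected count if the two variables were independent.
            expected = row_totals[i] * col_totals[j] / grand_total
            stat += (obs - expected) ** 2 / expected
    return stat

table = [[20, 30], [30, 20]]  # hypothetical 2x2 counts
stat = chi_square_statistic(table)
df = (len(table) - 1) * (len(table[0]) - 1)  # degrees of freedom
print(stat, df)
```

A larger statistic means the observed counts sit farther from what independence would predict, which corresponds to a smaller p-value.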
This document provides an overview of linear regression and correlation. It discusses using regression to find the linear relationship between two quantitative variables and using correlation to measure the strength of that linear relationship. An example uses beer alcohol content and calorie data to create a scatterplot and find the linear regression equation that best fits the data, allowing interpretation of the slope and y-intercept. Residuals are introduced as the vertical distances between the data points and the linear regression line, and minimizing their sum of squares is described as the criterion for the best fitting line.
This chapter discusses methods for hypothesis testing and constructing confidence intervals for two populations or groups. It provides examples comparing testosterone levels before and after having children, weight loss from a diet, and approval ratings between age groups. The chapter explores the processes and formulas for hypothesis tests and confidence intervals involving two proportions, including a worked example comparing reported rates of cheating between husbands and wives.
This document discusses confidence intervals for estimating population parameters. It provides examples of constructing point and interval estimates for the population mean and proportion from sample data. Confidence intervals allow us to estimate a range of plausible values for the true population parameter based on the sample results and desired confidence level, rather than just a single point value. The width of the confidence interval depends on the sample size and confidence level, with larger samples and lower confidence levels producing narrower intervals.
This document provides an example of conducting a one-sample hypothesis test to determine if the claimed mean battery life of 500 days by a manufacturer is accurate based on a sample of battery data. The null and alternative hypotheses are defined as H0: μ = 500 days and HA: μ < 500 days. The sample mean of 490 days is calculated and found to be 2.19 standard deviations below the claimed mean. The p-value is the probability of obtaining a sample mean this low or lower if the null hypothesis is true, which is calculated to be 0.0142, a small probability. This provides evidence to reject the manufacturer's claim and conclude the population mean is actually less than 500 days.
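As a check on the arithmetic, the left-tail probability for a standardized value of -2.19 can be computed in Python. This sketch assumes a normal model, which the summary does not state explicitly. With the rounded value -2.19 it gives roughly 0.0143, close to the reported 0.0142; the small gap is presumably due to rounding of the test statistic.

```python
from statistics import NormalDist

z = -2.19  # sample mean is 2.19 standard deviations below the claimed mean

# Left-tailed test (HA: mu < 500): area under the standard normal curve below z.
p_value = NormalDist().cdf(z)
print(round(p_value, 4))
```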
This document discusses continuous probability distributions and the normal distribution. It begins by introducing continuous random variables and the uniform distribution. It then covers graphs of the normal distribution, its key properties, and the empirical rule for approximating probabilities. The document emphasizes that to find probabilities for a continuous random variable, we calculate the area under the curve between two points on the x-axis. It presents examples of identifying properties of normal distributions from graphs and using the empirical rule to estimate probabilities.
The document discusses discrete probability distributions. It provides examples of discrete random variables that arise from counting, like the number of fleas on a prairie dog. It also gives the definition of a discrete probability distribution as a table, graph or formula that lists the possible values a discrete random variable can take and their corresponding probabilities. The document calculates the mean and standard deviation for the probability distribution of household size from US Census data, finding the average household has 2.525 people with a standard deviation of 1.422 people. It also gives an example of calculating expected value using a lottery game.
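The mean and standard deviation of a discrete probability distribution can be computed as sketched below. The distribution is a made-up example, not the Census household-size data, so the results differ from the 2.525 and 1.422 quoted above.

```python
from math import sqrt

# Hypothetical distribution: value -> probability (probabilities sum to 1).
dist = {1: 0.1, 2: 0.2, 3: 0.3, 4: 0.4}

mean = sum(x * p for x, p in dist.items())                     # sum of x * P(x)
variance = sum((x - mean) ** 2 * p for x, p in dist.items())   # sum of (x - mean)^2 * P(x)
std_dev = sqrt(variance)
print(mean, std_dev)
```

The mean computed this way is also the expected value, which is the calculation used for the lottery example.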
The document discusses theoretical and experimental probabilities. Experimental probabilities are calculated by performing an experiment and observing the relative frequency of outcomes. Theoretical probabilities assume all outcomes are equally likely and calculate probabilities as the number of desired outcomes divided by the total number of outcomes. For example, flipping a fair coin twice has a theoretical probability of 1/2 for getting exactly one head, as there are 2 ways to get one head out of the 4 total outcomes. As the number of experimental trials increases, the relative frequency approaches the true theoretical probability due to the law of large numbers.
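The two-coin example can be checked in Python by listing the equally likely outcomes and then simulating the experiment. The number of trials and the seed are arbitrary choices made for this sketch.

```python
import random
from fractions import Fraction
from itertools import product

# Theoretical: enumerate the 4 equally likely outcomes of two flips.
outcomes = list(product("HT", repeat=2))
favorable = [o for o in outcomes if o.count("H") == 1]  # exactly one head
theoretical = Fraction(len(favorable), len(outcomes))

# Experimental: relative frequency over many simulated pairs of flips.
rng = random.Random(0)
trials = 10_000
hits = 0
for _ in range(trials):
    heads = sum(rng.random() < 0.5 for _ in range(2))
    if heads == 1:
        hits += 1
experimental = hits / trials

print(theoretical, experimental)
```

Raising `trials` should bring the experimental value closer to the theoretical one, which is the law of large numbers at work.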
This document discusses numerical descriptions of data, including measures of central tendency (mean, median, mode) and how to calculate them. It provides examples of finding the mean, median, and mode for data sets. It explains that the mean can be affected by outliers, while the median and mode are resistant measures. The document also discusses weighted averages and how to calculate them using technology like the TI-84 calculator or StatCrunch.
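A weighted average multiplies each value by its weight, adds the products, and divides by the total weight. The sketch below uses hypothetical scores and weights; the function name is also an assumption.

```python
def weighted_mean(values, weights):
    """Sum of value * weight divided by the sum of the weights."""
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)

scores = [80, 90, 70]        # hypothetical category scores
weights = [0.2, 0.3, 0.5]    # hypothetical weights for each category

result = weighted_mean(scores, weights)
print(result)
```

Because the last score carries half the weight, the result sits closer to 70 than the ordinary mean of the three scores would.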
This document provides an overview of key concepts in statistics including:
- What statistics is and its two branches of descriptive and inferential statistics
- Key terms like population, sample, parameter, statistic, individual, and variable
- Types of variables including qualitative, quantitative discrete, and quantitative continuous
- Common sampling methods like simple random sampling, stratified sampling, systematic sampling, and cluster sampling
- Examples are provided to demonstrate how to identify and define the terms for different statistical studies
When the P-Value is less than the significance level (α), reject the null hypothesis (H0) as there is sufficient evidence to support the alternative hypothesis (H1). When the P-Value is greater than or equal to the significance level (α), do not reject the null hypothesis (H0) as there is not sufficient evidence to support the alternative hypothesis (H1). The significance level and comparing the P-Value to it determines whether to reject or fail to reject the null hypothesis.
This 3 sentence document provides 2 brief examples but does not include any details about those examples or what they are examples of. It mentions drawing P-values but does not explain what those are or how they relate to the examples. More context would be needed to fully understand or summarize the content.
Confidence Interval for Mean and Proportion (Methodology)MaryWall14
This document provides examples of how to calculate confidence intervals for the mean and proportion. It gives step-by-step instructions for calculating a 90% confidence interval for the population mean delivery time of pizza restaurants based on a sample. It also shows how to calculate a 95% confidence interval for the mean number of hours slept based on sample data. Finally, it demonstrates calculating a 95% confidence interval for the proportion of people who own tablets based on survey results.
The document discusses three examples of hypothesis tests: 1) Testing the mean of a normal distribution with known population standard deviation, which rejects the null hypothesis, supporting that new goggles improve swimming speed. 2) Testing the mean of a t-distribution with unknown population standard deviation, which rejects the null hypothesis that the mean test score is greater than 65. 3) Testing a proportion, which fails to reject the null hypothesis that 50% of brides are younger than grooms.
The document provides examples of hypothesis testing procedures for claims about population proportions and means. The first example tests a claim about improving chances of a boy being born and finds the results support the claim. The second example tests a claim about the mean age of graduate students and finds the results do not reject the claim. Both examples relate their conclusions back to whether the results support or do not reject the original claim.
This document provides examples of how to calculate confidence intervals for the mean and proportion. It gives step-by-step instructions for calculating a 90% confidence interval for the population mean delivery time of pizza restaurants based on a sample. It also shows how to calculate a 95% confidence interval for the mean number of hours slept based on sample data. Finally, it demonstrates calculating a 95% confidence interval for the proportion of people who own tablets based on survey results.
We are provided a general template for interpreting a confidence interval for a proportion. The template states that we can be __% confident that the __ is between __% and __%. An example then gives a 95% confidence interval between 7% and 9% for the proportion of individuals in Chesapeake that use public transportation.
This document discusses potential sources of bias in survey research studies. It provides examples of biases that can arise from poor study funding sources, question wording, sampling methods, sample size and response rates, potential confounding variables, and overgeneralization of results. Specifically, it notes that industry funded studies may produce biased results and that questions should be neutrally worded to avoid influencing responses. Random sampling helps minimize biases compared to methods like voluntary responses. Sample sizes of at least 100 for local surveys and 1000 for national are recommended. [END SUMMARY]
1.3 Experimental Design and Observational Studies MaryWall14
This document discusses experimental design and observational studies. It defines an experiment as a controlled study that establishes cause and effect by varying factors and comparing treatment groups to a control group. An observational study merely observes or collects existing data without influencing variables and cannot prove causation. The document provides guidelines for planning experiments and describes randomized, matched pairs, and rigorously controlled experimental designs. It also discusses placebos, replication, blinding, and examples of an experimental drug study and observational study types.
This document discusses different sampling methods for collecting data from a population. It describes simple random sampling as selecting individuals from a population where every individual has an equal chance of being selected. Stratified sampling divides the population into groups or "strata" and then takes random samples from each group. Cluster sampling randomly selects entire groups from the population and samples all individuals within those groups. Systematic sampling selects every nth individual from the population to avoid biases. A convenience sample is not representative since it uses easily accessible individuals like friends or voluntary respondents.
Chapter 2: Graphical Descriptions of Data
In chapter 1, you were introduced to the concept of a population, which again is the set of
all individuals of interest. Remember, in most cases you can’t collect data from the entire
population, so you have to take a sample. Thus, you collect data either through a sample
or a census. Now you have a large number of data values. What can you do with them?
No one likes to look at just a set of numbers. One option is to organize the data into a
table or graph. Ultimately though, you want to be able to use that graph to interpret the
data, to describe the distribution of the data set, and to explore different characteristics of
the data. The characteristics that will be discussed in this chapter and the next chapter
are:
1. Center: middle of the data set, also known as the average.
2. Variation: how much the data varies.
3. Distribution: shape of the data (symmetric, uniform, or skewed).
4. Outliers: data values that are far from the majority of the data.
This chapter will focus mostly on using the graphs to understand aspects of the data, and
not as much on how to create the graphs. There is technology that will create most of the
graphs, though it is important for you to understand the basics of how to create them.
Section 2.1: Qualitative Data
Remember, qualitative data are words describing a characteristic of the individual
(including numbers that don’t count or measure anything about the individual). There are
several different graphs that are used for qualitative data.
Qualitative data can first be organized in a frequency or relative frequency table.
Frequency table – organizes collected data in table form using categories (or classes)
and frequencies (counts).
Relative frequency table – organizes raw data in table form using categories (or classes)
and proportions (or percentages). Relative frequency tables are useful when comparing
data sets where the sample sizes are not the same.
Example #2.1.1: Creating a Frequency Table for Qualitative Data
Suppose you have the following data for which type of car students at a campus
drive.
Ford, Chevy, Honda, Toyota, Toyota, Nissan, Kia, Nissan, Chevy, Toyota,
Honda, Chevy, Toyota, Nissan, Ford, Toyota, Nissan, Mercedes, Chevy,
Ford, Nissan, Toyota, Nissan, Ford, Chevy, Toyota, Nissan, Honda,
Porsche, Hyundai, Chevy, Chevy, Honda, Toyota, Chevy, Ford, Nissan,
Toyota, Chevy, Honda, Chevy, Saturn, Toyota, Chevy, Chevy, Nissan,
Honda, Toyota, Toyota, Nissan
First identify the individual, variable and type of variable.
Individual: a randomly selected student who drives a car to campus
Variable: type of car
Type of variable: qualitative
A listing of data is too hard to look at and analyze, so you need to summarize it.
First you need to decide the categories. In this case it is relatively easy; just use
the car type. However, several car types appear only once in the list. In
that case it is easier to make a category called “other” for the types with low
counts. Now just count how many cars of each type there are. For example,
there are 5 Fords, 12 Chevys, and 6 Hondas. This can be put in a frequency
distribution:
Table #2.1.1: Frequency Table for Type of Car Data
Category Frequency
Ford 5
Chevy 12
Honda 6
Toyota 12
Nissan 10
Other 5
Total 50
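The counting step above can also be sketched in code. This is a minimal Python illustration and not part of the StatCrunch workflow used in this book. The names `cars` and `frequency_table`, and the rule that any type appearing fewer than `min_count` times is pooled into "Other", are assumptions chosen to reproduce Table #2.1.1.

```python
from collections import Counter

# Raw data from Example #2.1.1 (50 observations).
raw = """Ford, Chevy, Honda, Toyota, Toyota, Nissan, Kia, Nissan, Chevy, Toyota,
Honda, Chevy, Toyota, Nissan, Ford, Toyota, Nissan, Mercedes, Chevy,
Ford, Nissan, Toyota, Nissan, Ford, Chevy, Toyota, Nissan, Honda,
Porsche, Hyundai, Chevy, Chevy, Honda, Toyota, Chevy, Ford, Nissan,
Toyota, Chevy, Honda, Chevy, Saturn, Toyota, Chevy, Chevy, Nissan,
Honda, Toyota, Toyota, Nissan"""
cars = [name.strip() for name in raw.split(",")]


def frequency_table(values, min_count=2):
    """Count each category; pool categories seen fewer than min_count times into 'Other'."""
    counts = Counter(values)  # keeps categories in order of first appearance
    table = {}
    other = 0
    for category, n in counts.items():
        if n < min_count:
            other += n  # assumption: rare categories are combined
        else:
            table[category] = n
    if other:
        table["Other"] = other
    return table


table = frequency_table(cars)
for category, n in table.items():
    print(f"{category:<8}{n:>3}")
print(f"{'Total':<8}{sum(table.values()):>3}")
```

The total printed at the end should equal the number of observations, which is a quick check that no data value was dropped.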
The total of the frequency column should be the number of observations in the
data. Typically, the counts are not what are reported. Instead, the relative
frequencies are used. This is just the frequency divided by the total. As an
example for the Ford category:
relative frequency = 5/50 = 0.10
This can be written as a decimal, fraction, or percent. You now have a relative
frequency distribution:
Table #2.1.2: Relative Frequency Table for Type of Car Data
Category Frequency Relative Frequency
Ford 5 0.10
Chevy 12 0.24
Honda 6 0.12
Toyota 12 0.24
Nissan 10 0.20
Other 5 0.10
Total 50 1.00
The relative frequency column should add up to 1.00. It might be off a little due
to rounding errors on certain problems.
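As a sketch of the same calculation for every category, the following Python snippet divides each frequency by the total. The names `frequencies` and `relative_frequencies` are illustrative assumptions; the counts are those in Table #2.1.1.

```python
# Frequencies from Table #2.1.1.
frequencies = {"Ford": 5, "Chevy": 12, "Honda": 6, "Toyota": 12, "Nissan": 10, "Other": 5}


def relative_frequencies(freqs):
    """Divide each frequency by the total number of observations."""
    total = sum(freqs.values())
    return {category: n / total for category, n in freqs.items()}


rel = relative_frequencies(frequencies)
for category, p in rel.items():
    print(f"{category:<8}{p:.2f}")
# The column should add up to 1.00, apart from small rounding differences.
print(f"{'Total':<8}{sum(rel.values()):.2f}")
```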
TECHNOLOGY: ENTERING OR UPLOADING DATA INTO STATCRUNCH
Entering your own data that you do not have in a file:
Go to Statcrunch.com and login.
Click “Open StatCrunch”. A spreadsheet will open where you can rename the
columns using the variable names for your data. You can then enter the raw data
into the columns.
Entering data from a file:
For the examples and homework problems in this book, you have a file in
Blackboard called “Chapter 2 Data”. Save that file to your desktop.
Go to Statcrunch.com and login.
Click “MyStatCrunch”. Under “My Data”, click “Select a file from my computer”.
Then choose the file you just saved to your desktop.
Then scroll down and click “Load file” and you will see the data automatically load
into the columns of the spreadsheet.
This file is automatically saved under “My Data”. So the next time you login to
StatCrunch, you can click “MyStatCrunch” and then click “My Data” and this file
will be in the list to choose.
TECHNOLOGY: FREQUENCY AND RELATIVE FREQUENCY TABLES IN
STATCRUNCH
Enter the data into a column in the spreadsheet (see earlier instructions on entering
a list of data)
Click Stat, Tables, Frequency
In the popup window that opens, choose the variable name from “Select Columns”
Under “Statistics” Frequency and Relative Frequency are already chosen so you do
not need to click anything there.
Under “Order by” you can choose “Values ascending” to put the categories in ABC
order in the table, or “Count ascending” to put the categories in order by frequency,
or “Worksheet” to put the categories in order of appearance in the column of data.
Under “Other* if percent less than” you can enter a number like 10 to put all
categories with less than 10% into a combined category called “Other*”.
Then click “Compute!”
If you follow the StatCrunch directions above for the list of data called “Car Data” from
the “Chapter02DataFile”, you will get the following:
Now that you have the frequency and relative frequency table, it would be good to
display this data using a graph. The most common graphs for qualitative data are bar
charts and pie charts.
Bar chart (or graph) – consists of the frequencies on one axis and the categories on the
other axis. Then you draw rectangles for each category with a height (if frequency is on
the vertical axis) or length (if frequency is on the horizontal axis) that is equal to the
frequency. All of the rectangles should be the same width, and there should be equal-width
gaps between the bars.
Pie chart (or graph) – consists of a circle divided into sectors (pie shapes) that are
proportional to the size of the frequency or relative frequency of each category. All you
have to do to find the angle is to multiply the relative frequency by 360 degrees.
Remember that 180 degrees is half of a circle and 90 degrees is a quarter of a circle. We
will be using technology to make these, so you will not need to do these calculations.
Example #2.1.2: Drawing a Bar Chart
Draw a bar chart of the data in example #2.1.1.
Table #2.1.2: Frequency Table for Type of Car Data
Category Frequency Relative Frequency
Ford 5 0.10
Chevy 12 0.24
Honda 6 0.12
Toyota 12 0.24
Nissan 10 0.20
Other 5 0.10
Total 50 1.00
Put the frequency on the vertical axis and the category on the horizontal axis.
Then just draw a box above each category whose height is the frequency.
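To see the idea without any graphing software, here is a minimal Python sketch that prints a text bar chart. It is only an illustration under stated assumptions: it puts the frequency on the horizontal axis (so each bar's length, not its height, equals the frequency), and the function name `text_bar_chart` is hypothetical.

```python
# Frequencies from Table #2.1.1.
frequencies = {"Ford": 5, "Chevy": 12, "Honda": 6, "Toyota": 12, "Nissan": 10, "Other": 5}


def text_bar_chart(freqs, symbol="#"):
    """Return one line per category; the bar length equals the frequency."""
    width = max(len(category) for category in freqs)  # align the category labels
    lines = []
    for category, n in freqs.items():
        lines.append(f"{category:<{width}} | {symbol * n} {n}")
    return "\n".join(lines)


chart = text_bar_chart(frequencies)
print(chart)
```

Every bar uses the same symbol and occupies one line, which mirrors the rule that all bars have the same width.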
TECHNOLOGY: BAR CHARTS (BAR GRAPHS) FROM RAW DATA
Using StatCrunch:
Enter the data into a column in the spreadsheet (see earlier instructions on entering
a list of data)
Click Graph, Bar Plot, With Data
In the popup window that opens choose the variable name from “Select Columns”
and under “Type” choose frequency or relative frequency depending on what you
have been asked for.
Under “Order by” you can choose “Values ascending” to put the categories in ABC
order on the axis, or “Count ascending” to put the bars in order by height, or
“Worksheet” to put the categories in order of appearance in the column of data.
Under “Other* if percent less than” you can enter a number like 10 to combine
all categories with less than 10% into one category called “Other*”.
Under “Display” check next to “Value above bar”
Under “Graph properties” you can give your graph a title.
Then click “Compute!”
If you follow the StatCrunch directions above for the list of raw data called “Car Data” in
the “Chapter02DataFile” you will get the following frequency bar chart (value ascending
and any category with less than 10% was put into an “Other*” category):
Graph #2.1.1: Bar Chart for Type of Car Data
Notice from the graph that Toyota and Chevy are the most popular
cars, with Nissan not far behind. Of the car types that you can identify from the
graph, Ford is the least popular, though each type in the other category is less
popular than Ford.
Some key features of a bar graph:
Equal spacing on each axis.
Bars are the same width.
There should be labels on each axis and a title for the graph.
There should be an equal scaling on the frequency/relative frequency axis and the
categories should be listed on the category axis.
The bars don’t touch.
You can also draw a bar graph using relative frequency on the vertical axis. This is
useful when you want to compare two samples with different sample sizes. The relative
frequency graph and the frequency graph should look the same, except for the scaling on
the frequency axis.
If you follow the StatCrunch directions above for the list of data called “Car Data” in the
“Chapter02DataFile” you will get the following relative frequency bar chart (value
ascending and any category with less than 10% was put into an “Other*” category):
Graph #2.1.2: Relative Frequency Bar Chart for Type of Car Data
If instead you had chosen “Count descending” the bar plot would look as follows:
TECHNOLOGY: BAR CHARTS (BAR GRAPHS) FROM GROUPED DATA
Using StatCrunch:
Enter the data into a column in the spreadsheet (see earlier instructions on entering
a list of data)
Click Graph, Bar Plot, With Summary
In the popup window that opens choose the variable name from “Categories In”
and the column with the counts from “Counts in”
The rest of the steps are the same as for raw data (except don’t put anything next to
“Other if percent less than”).
If instead you had been given the grouped data from the start (instead of the list of 50
data points), you could have made the bar chart using the grouped data instead as follows:
In StatCrunch open the data file called “Chapter02DataFile”. You will see two lists that
look as follows:
Click Graph, Bar Plot, With Summary
Then choose “Car Category” for the categories and “Car Frequency” for the counts:
The rest of the steps are the same as for raw data (don’t put anything next to “Other if
percent less than”) and will yield the same bar chart as we got before:
Another type of graph for qualitative data is a pie chart. A pie chart is where you have a
circle and you divide pieces of the circle into pie shapes that are proportional to the size
of the relative frequency. There are 360 degrees in a full circle. Relative frequency is
just the percentage as a decimal. (We will be using technology to make these.)
Example #2.1.3: Drawing a Pie Chart
Draw a pie chart of the data in example #2.1.1.
First you need the relative frequencies.
Table #2.1.2: Frequency Table for Type of Car Data
Category Frequency Relative Frequency
Ford 5 0.10
Chevy 12 0.24
Honda 6 0.12
Toyota 12 0.24
Nissan 10 0.20
Other 5 0.10
Total 50 1.00
Then you multiply each relative frequency by 360° to obtain the angle
measure for each category.
Table #2.1.3: Pie Chart Angles for Type of Car Data
Category Relative Frequency Angle (in degrees (°))
Ford 0.10 36.0
Chevy 0.24 86.4
Honda 0.12 43.2
Toyota 0.24 86.4
Nissan 0.20 72.0
Other 0.10 36.0
Total 1.00 360.0
The computations above just give you an idea of how these angle sizes are
computed. We will be using technology to make these graphs.
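The angle computation in Table #2.1.3 can be sketched in a few lines of Python. The names `relative_frequency` and `pie_angles` are assumptions made for this illustration; the values are those in Table #2.1.2.

```python
# Relative frequencies from Table #2.1.2.
relative_frequency = {
    "Ford": 0.10, "Chevy": 0.24, "Honda": 0.12,
    "Toyota": 0.24, "Nissan": 0.20, "Other": 0.10,
}


def pie_angles(rel_freqs):
    """Multiply each relative frequency by 360 degrees, rounded to one decimal place."""
    return {category: round(p * 360, 1) for category, p in rel_freqs.items()}


angles = pie_angles(relative_frequency)
for category, angle in angles.items():
    print(f"{category:<8}{angle:>6.1f}")
# The angles should add up to a full circle.
print(f"{'Total':<8}{sum(angles.values()):>6.1f}")
```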
TECHNOLOGY: PIE CHARTS (PIE GRAPHS) FROM RAW DATA
Using StatCrunch:
Enter the data into a column in the spreadsheet (see earlier instructions)
Then click Graph, Pie Chart, With Data
In the popup window that opens choose the variable name from “Select Columns”
Under “Display” Count and Percent of Total are already chosen so you do not need
to click anything there.
Under “Order by” you can choose “Values ascending” to put the categories in ABC
order, or “Count ascending” to put the slices in order by size, or
“Worksheet” to put the categories in order of appearance in the column of data.
Under “Other* if percent less than” you can enter a number like 10 to combine
all categories with less than 10% into one combined category called “Other*”.
Under “Graph properties” you can give your graph a title.
Then click “Compute!”
If you follow the StatCrunch directions above for the list of raw data called “Car Data” in
the “Chapter02DataFile” you will get the following pie chart (value ascending and any
category with less than 10% was put into an “Other*” category):
Graph #2.1.3: Pie Chart for Type of Car Data
As you can see from the graph, Toyota and Chevy are more popular, while the
cars in the other category are liked the least. Of the cars that you can determine
from the graph, Ford is liked less than the others.
TECHNOLOGY: PIE CHARTS (PIE GRAPHS) FROM GROUPED DATA
Using StatCrunch:
Enter the data into a column in the spreadsheet (see earlier instructions on entering
a list of data)
Click Graph, Pie Chart, With Summary
In the popup window that opens choose the variable name from “Categories In”
and the column with the counts from “Counts in”
The rest of the steps are the same as for raw data (except don’t put anything next to
“Other if percent less than”).
If instead you had been given the grouped data from the start (instead of the list of 50
data points), you could have made the pie chart using the grouped data instead as follows:
In StatCrunch open the data file called “Chapter02DataFile”. You will see two lists that
look as follows:
Click Graph, Pie Chart, With Summary
Then choose “Car Category” for the categories and “Car Frequency” for the counts:
The rest of the steps are the same as for raw data (don’t put anything next to “Other if
percent less than”) and will yield the same pie chart as we got before:
Bar charts are more common than pie charts. Which one you use is largely a matter of
personal preference and of what information you are trying to convey.
However, pie charts are best when you only have a few categories and the data can be
expressed as a percentage. If a data value can fit into multiple categories, you cannot use
a pie chart. As an example, if you are asking people what their favorite national park
is, and you tell them to pick their top three choices, then the number of answers adds up
to more than 100% of the people involved. So you cannot use a pie chart to display the
favorite national park. However, if you asked people to name their favorite national park
and they could only choose one, then you could use a pie chart. In either case, a bar chart
can be used.
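The national park example can be checked with a small Python sketch. The survey answers below are made up purely for illustration (they are not data from this book), and the function name `percent_of_respondents` is hypothetical.

```python
from collections import Counter


def percent_of_respondents(responses):
    """responses holds one list of chosen categories per respondent."""
    n = len(responses)
    counts = Counter(choice for chosen in responses for choice in chosen)
    return {category: 100 * k / n for category, k in counts.items()}


# Made-up answers from four respondents, each picking three parks.
top_three = [
    ["Yosemite", "Zion", "Acadia"],
    ["Yosemite", "Glacier", "Zion"],
    ["Zion", "Acadia", "Glacier"],
    ["Yosemite", "Zion", "Glacier"],
]
# The same respondents restricted to a single favorite (their first pick).
single = [[chosen[0]] for chosen in top_three]

multi_pct = percent_of_respondents(top_three)
single_pct = percent_of_respondents(single)
print(sum(multi_pct.values()))   # more than 100: a pie chart would mislead
print(sum(single_pct.values()))  # totals 100: a pie chart is possible
```

When each person gives exactly one answer, the percentages form parts of a whole, which is what a pie chart requires.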
Many times data are collected to determine if there is a relationship between variables. Is
there a relationship between gender and success in statistics? Is there a relationship
between attendance and performance on exams? Is there a relationship between the
number of homework problems done and performance on exams? To begin to answer
such questions involving qualitative (categorical) variables, we need tables and graphs
where we can look at the two variables together.
Contingency table – a data table where each row represents categories of one of the
variables and each column represents categories of the other variable. Each cell count
represents the number of objects in both the row category and the column category.
Side-by-side bar chart – a bar graph in which there are sets of bars for each value of one
of the categorical variables where each bar in that set represents values for the other
categorical variable.
Segmented bar chart – a bar graph in which there is a single bar for each value of one of
the categorical variables segmented into parts for each value of the other categorical
variable within that category.
TECHNOLOGY: SIDE-BY-SIDE AND SEGMENTED BAR CHARTS
Using StatCrunch:
Enter the contingency table into the spreadsheet.
o The variable you want the data grouped by needs to be in the first column
with the variable name at the top of that column and the values of that
variable listed in column 1.
o The other columns should be named with category names for the other
variable
o Enter the counts into the correct cells.
o Below is a generic version for a problem where we want to group the data
by variable 1 which has two categories. The other variable also has two
categories.
Click Graph, Chart, Columns
In the popup window that opens, under “Select columns” choose all of the
categories listed there.
Under “Row labels in” choose the name of the first column (this is the variable that
you want the data grouped by).
Under “Plot” there are several options:
Vertical bars (split) and horizontal bars (split) will give you side-by-side
bar charts.
Vertical bars (stacked) and horizontal bars (stacked) will give you
segmented bar charts.
Under “Graph Properties” you can give your graph a title.
Then click “Compute!”
Example #2.1.4: Side-By-Side Bar Chart
In a city with a maximum-security prison, the residents have been polled to
determine if a relationship exists between marital status and a resident's stand on
capital punishment. The results are cross-classified into the following
contingency table. Make a side-by-side bar chart.
                              Marital Status
Stand on Capital Punishment   Married   Not Married   Total
Favor                         100       20            120
Oppose                        50        30            80
Total                         150       50            200
Enter the contingency table above into StatCrunch and then make a side-by-side
and segmented bar chart grouped by “Stand on Capital Punishment”.
The data would look as follows in StatCrunch:
Following the directions from the technology box on the previous page you would
get the following:
Graph #2.1.4: Side-By-Side and Segmented Bar Charts for Stand on Capital
Punishment Data:
(Vertical bars (split)) (Vertical bars (stacked))
We usually prefer the graphs with vertical bars, but you can also make them with
horizontal bars.
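As a rough check on the contingency table in this example, the following Python sketch recomputes the row, column, and grand totals, and the share of each marital status within each stand. The variable names are assumptions made for illustration, and the row shares are an extra summary that StatCrunch is not stated to display.

```python
# Cell counts from the contingency table in Example #2.1.4.
table = {
    "Favor": {"Married": 100, "Not Married": 20},
    "Oppose": {"Married": 50, "Not Married": 30},
}

# Row totals: one total per stand on capital punishment.
row_totals = {row: sum(cells.values()) for row, cells in table.items()}

# Column totals: one total per marital status.
column_totals = {}
for cells in table.values():
    for column, n in cells.items():
        column_totals[column] = column_totals.get(column, 0) + n

grand_total = sum(row_totals.values())

# Share of each marital status within each stand (each row sums to 1).
row_shares = {
    row: {column: n / row_totals[row] for column, n in cells.items()}
    for row, cells in table.items()
}

print(row_totals, column_totals, grand_total)
for row, shares in row_shares.items():
    print(row, {column: round(p, 3) for column, p in shares.items()})
```

Comparing the shares across rows is one way to judge whether the two variables appear related, since the two groups have different sizes.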
Section 2.1: Homework
1.) Eyeglassomatic manufactures eyeglasses for different retailers. The number of
pairs of lenses for different activities is in table #2.1.4.
Table #2.1.4: Data for Eyeglassomatic
Activity            Number of lenses
Grind               18872
Multicoat           12105
Assemble            4333
Make frames         25880
Receive finished    26991
Unknown             1508
Grind means that they ground the lenses and put them in frames; multicoat means
that they put tinting or scratch-resistance coatings on lenses and then put them in
frames; assemble means that they receive frames and lenses from other sources
and put them together; make frames means that they make the frames and put
in lenses from other sources; receive finished means that they received glasses
from another source; and unknown means they do not know where the lenses came
from. Make a bar chart and a pie chart of this data. State any findings you can see
from the graphs.
2.) To analyze how Arizona workers ages 16 or older travel to work, the percentage of
workers using carpool, private vehicle (alone), and public transportation was
collected. Create a bar chart and pie chart of the data in table #2.1.5. State any
findings you can see from the graphs.
Table #2.1.5: Data of Travel Mode for Arizona Workers
Transportation type Percentage
Carpool 11.6%
Private Vehicle (Alone) 75.8%
Public Transportation 2.0%
Other 10.6%
3.) The number of deaths in the US due to carbon monoxide (CO) poisoning from
generators from the years 1999 to 2011 are in table #2.1.6 (Hinatov, 2012).
Create a bar chart and pie chart of this data. State any findings you see from the
graphs.
Table #2.1.6: Data of Number of Deaths Due to CO Poisoning
Region Number of deaths from CO
while using a generator
Urban Core 401
Sub-Urban 97
Large Rural 86
Small Rural/Isolated 111
4.) In Connecticut, households use gas, fuel oil, or electricity as a heating source.
Table #2.1.7 shows the percentage of households that use one of these as their
principal heating source ("Electricity usage," 2013), ("Fuel oil usage," 2013),
("Gas usage," 2013). Create a bar chart and pie chart of this data. State any
findings you see from the graphs.
Table #2.1.7: Data of Household Heating Sources
Heating Source Percentage
Electricity 15.3%
Fuel Oil 46.3%
Gas 35.6%
Other 2.8%
5.) Eyeglassomatic manufactures eyeglasses for different retailers. They test to see
how many defective lenses they made during the time period of January 1 to
March 31. Table #2.1.8 gives the defect and the number of defects. Create a
Pareto chart of the data (bar chart with “Count descending”) and then describe
what this tells you about what causes the most defects.
Table #2.1.8: Data of Defect Type
Defect type Number of defects
Scratch 5865
Right shape – small 4613
Flaked 1992
Wrong axis 1838
Chamfer wrong 1596
Crazing, cracks 1546
Wrong shape 1485
Wrong PD 1398
Spots and bubbles 1371
Wrong height 1130
Right shape – big 1105
Lost in lab 976
Spots/bubble – intern 976
6.) People in Bangladesh were asked to state what type of birth control method they
use. The percentages are given in table #2.1.9 ("Contraceptive use," 2013).
Create a Pareto chart of the data (bar chart with “Count descending”) and then
state any findings you can see from the graph.
Table #2.1.9: Data of Birth Control Type
Method Percentage
Condom 4.50%
Pill 28.50%
Periodic Abstinence 4.90%
Injection 7.00%
Female Sterilization 5.00%
IUD 0.90%
Male Sterilization 0.70%
Withdrawal 2.90%
Other Modern Methods 0.70%
Other Traditional Methods 0.60%
None 44.3%
7.) In a study of 478 fourth, fifth and sixth graders, the following data were collected
on their gender and on their primary goal. Make a side-by-side and segmented
bar chart with the data grouped by gender.
Good Grades Popularity Good at Sports Total
Male 117 50 60 227
Female 130 91 30 251
Total 247 141 90 478
Section 2.2: Quantitative Data
The graph for quantitative data looks similar to a bar graph, except there are some major
differences. First, in a bar graph the categories can be put in any order on the horizontal
axis. There is no set order for these data values. You can’t say how the data is
distributed based on the shape, since the shape can change just by putting the categories
in different orders. With quantitative data, the data are in specific orders, since you are
dealing with numbers. Each bar on a bar graph just represents a specific category. The
bars on this next graph represent a range of numeric values. With quantitative data, you
can talk about a distribution, since the shape only changes a little bit depending on how
many categories you set up.
Another difference is that the bars touch with quantitative data, and there will be no gaps
in the graph (unless there is a big gap in the values in the data). Since the graph for
quantitative data is different from qualitative data, it is given a new name. The name of
the graph is a histogram. To create a histogram by hand, you must first create the
frequency distribution (we will see when using StatCrunch that we will not need to make
a frequency distribution first). The idea of a frequency distribution is to take the interval
that the data spans and divide it up into equal subintervals called classes.
Summary of the steps involved in making a frequency distribution for quantitative
data:
1. Compute the class width (this will be the bin width in StatCrunch).
Compute
$$\text{class width} = \frac{\text{largest data value} - \text{smallest data value}}{\text{number of classes}}$$
and then round as follows:
*If the data are whole numbers, round up to the next whole number
*If the data are tenths numbers, round up to the next tenth number
*If the data are hundredths numbers, round up to the next hundredth number
*etc.
2. Compute the lowest class boundary (this is where you will start the bins in
StatCrunch). The lowest class boundary is computed as follows: take the
smallest value in the data set and subtract…
*0.5 if the data are whole numbers
*0.05 if the data are tenths numbers
*0.005 if the data are hundredths numbers
3. Create the classes. Each class has boundaries that determine which values fall in
each class. Start with the lowest class boundary you just computed in step 2
and add the class width you computed in step 1 to get the lower class boundary
for the next class. Repeat until you get all the classes. Fill in the upper class
boundaries (the upper class boundary for the first class will be the lower class
boundary of the second class, etc.)
4. To figure out the number of data points that fall in each class, go through each
data value and see which class boundaries it is between. Utilizing tally marks
may be helpful in counting the data values. The frequency for a class is the
number of data values that fall in the class. (You can also make a histogram
in StatCrunch and click “values above bar” and let StatCrunch count how
many are in each class for you)
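Steps 1 through 3 can also be sketched in Python. This is an illustration under stated assumptions, not part of StatCrunch: the function name `class_boundaries` and its `unit` parameter ("1" for whole numbers, "0.1" for tenths, and so on) are hypothetical, and "round up to the next" is taken to mean always moving strictly up, which only matters when the quotient is already exact. The `decimal` module is used so that tenths and hundredths are handled exactly. The sample call uses a smallest value of 350, a largest value of 2550, and 7 classes.

```python
from decimal import Decimal, ROUND_FLOOR

def class_boundaries(smallest, largest, num_classes, unit):
    """Steps 1-3: return the class width and a list of (lower, upper) boundaries.

    `unit` is the precision of the data as a string: "1" for whole
    numbers, "0.1" for tenths, "0.01" for hundredths.
    """
    smallest, largest, unit = Decimal(smallest), Decimal(largest), Decimal(unit)

    # Step 1: raw width, then move up to the next multiple of `unit`.
    raw_width = (largest - smallest) / num_classes
    steps = (raw_width / unit).to_integral_value(rounding=ROUND_FLOOR)
    width = (steps + 1) * unit

    # Step 2: the lowest class boundary is half a unit below the smallest value.
    lowest = smallest - unit / 2

    # Step 3: keep adding the class width to get each successive boundary.
    bounds = [(lowest + i * width, lowest + (i + 1) * width)
              for i in range(num_classes)]
    return width, bounds

width, bounds = class_boundaries("350", "2550", 7, unit="1")
print(width)       # class width
print(bounds[0])   # first class
print(bounds[-1])  # last class
```

Because each boundary carries one more decimal place than the data, no data value can land exactly on a boundary, which is the point of subtracting 0.5, 0.05, or 0.005 in step 2.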
Example #2.2.1: Creating a Frequency Table for Quantitative Data
Table #2.2.1 contains the amount of rent paid every month for 24 students from a
statistics course. Make a relative frequency distribution using 7 classes.
Table #2.2.1: Data of Monthly Rent
1500 1350 350 1200 850 900
1500 1150 1500 900 1400 1100
1250 600 610 960 890 1325
900 800 2550 495 1200 690
Solution:
First identify the individual, variable and type of variable.
Individual: a randomly selected student from a statistics course
Variable: amount of monthly rent
Type of variable: quantitative-discrete
1) Compute the class width:
$$\frac{\text{largest data value} - \text{smallest data value}}{\text{number of classes}} = \frac{2550 - 350}{7} = \frac{2200}{7} \approx 314.286$$
Since the data are whole numbers, round this up to the next whole number.
So the class width = 315
2) Compute the lowest class boundary:
Since the data are whole numbers,
Lowest class boundary = smallest data value – 0.5 = 350 – 0.5 = 349.5
3) Create the classes.
The lower class boundaries start at 349.5 and you keep adding 315 as you go down the column.
The upper class boundaries start at 664.5 and you keep adding 315 as you go down the column.
Class Boundaries Tally Frequency
349.5 – 664.5
664.5 – 979.5
979.5 – 1294.5
1294.5 – 1609.5
1609.5 – 1924.5
1924.5 – 2239.5
2239.5 – 2554.5
Here we now have 7 classes which is what was asked for and the
largest data value of 2550 is contained in the last class.
4) Tally and find the frequency of the data:
Go through the data and put a tally mark in the appropriate class for each
piece of data by looking to see which class boundaries the data value is
between. Fill in the frequency by changing each of the tallies into a
number. Each relative frequency is just the frequency divided by the total
number of data points. In this case each frequency would be divided by
24.
Table #2.2.2: Frequency Distribution for Monthly Rent
Class Boundaries     Tally        Frequency    Relative Frequency
349.5 – 664.5        ||||         4            0.17
664.5 – 979.5        |||| |||     8            0.33
979.5 – 1294.5       ||||         5            0.21
1294.5 – 1609.5      |||| |       6            0.25
1609.5 – 1924.5                   0            0
1924.5 – 2239.5                   0            0
2239.5 – 2554.5      |            1            0.04
TOTAL                             24           1.00
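The tally in step 4 can be double-checked with a short Python sketch. It assumes the lowest class boundary (349.5) and class width (315) computed above; the variable names are illustrative only. Each rent is assigned to a class by counting how many whole class widths it sits above the lowest boundary.

```python
# Monthly rent data from Table #2.2.1.
rents = [1500, 1350, 350, 1200, 850, 900,
         1500, 1150, 1500, 900, 1400, 1100,
         1250, 600, 610, 960, 890, 1325,
         900, 800, 2550, 495, 1200, 690]

lowest_boundary = 349.5   # from step 2
class_width = 315         # from step 1
num_classes = 7

frequencies = [0] * num_classes
for rent in rents:
    # Index of the class whose boundaries the value lies between.
    index = int((rent - lowest_boundary) // class_width)
    frequencies[index] += 1

# Print each class with one tally mark per data value.
for i, count in enumerate(frequencies):
    lower = lowest_boundary + i * class_width
    print(f"{lower} - {lower + class_width}: {'|' * count} {count}")
```

The counts should agree with the Frequency column of the table, and their sum should equal the number of data points.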
It is difficult to determine the basic shape of the distribution by looking at the frequency
distribution. It would be easier to look at a graph. The graph of a frequency distribution
for quantitative data is called a histogram.
Histogram – a graph with the frequencies (or relative frequencies) on the vertical axis and
the class boundaries (or class midpoints) on the horizontal axis. A rectangle is drawn
for each class, where the height is the frequency (or relative frequency) and the width
is the class width.
TECHNOLOGY: HISTOGRAMS
Using StatCrunch:
Enter the data into a column in the spreadsheet (see earlier instructions on entering
a list of data)
Click Graph, Histogram
In the popup window that opens choose the variable name from “Select Columns”
Under “Type” choose Frequency or Relative Frequency depending on what you
have been asked for.
Under “Bins” start the bins at the lowest class boundary and use the class width as
the bin width.
Under “Graph properties” you can give your graph a title.
Then click “Compute!”
Example #2.2.2: Drawing a Histogram
Draw a histogram for the distribution from example #2.2.1.
Solution:
The class boundaries are plotted on the horizontal axis and the frequencies are
plotted on the vertical axis. In StatCrunch click on My Data and then click
Chapter02DataFile. Follow the directions above using the column called
“Monthly Rent”. Under “Type” you want to choose Frequency. Also from the
earlier example we computed the first lower class boundary to be 349.5 and the
class width to be 315. Use these as the starting point for the bins and as the
bin width, respectively.
Graph #2.2.1: Frequency Histogram for Monthly Rent
Reviewing the graph, you can see that the rents that occur most often are between
$664.50 and $979.50 per month. There is a large gap between $1609.50
and $2239.50. This seems to say that one student is paying a great deal more than
everyone else. This value could be considered an outlier. An outlier is a data
value that is far from the rest of the values. It may be an unusual value or a
mistake. It is a data value that should be investigated. In this case, the student
lives in a very expensive part of town, thus the value is not a mistake, and is just
very unusual. There are other aspects that can be discussed, but first some other
concepts need to be introduced.
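The two observations above (the most common class and the gap) can also be read off the frequency distribution with a short Python sketch. This is only an illustration: the text rendering with `#` characters stands in for the histogram, and names such as `modal_class` and `gap` are hypothetical.

```python
# Lower class boundaries and frequencies from Table #2.2.2.
lower_boundaries = [349.5, 664.5, 979.5, 1294.5, 1609.5, 1924.5, 2239.5]
class_width = 315
frequencies = [4, 8, 5, 6, 0, 0, 1]

# Text rendering of the histogram, one row per class.
for lower, count in zip(lower_boundaries, frequencies):
    print(f"{lower:>7} - {lower + class_width:<7} {'#' * count}")

# Class with the tallest bar.
peak = frequencies.index(max(frequencies))
modal_class = (lower_boundaries[peak], lower_boundaries[peak] + class_width)

# Empty classes that have data on both sides form a gap.
first = next(i for i, f in enumerate(frequencies) if f > 0)
last = max(i for i, f in enumerate(frequencies) if f > 0)
gap_classes = [i for i in range(first, last + 1) if frequencies[i] == 0]
gap = (lower_boundaries[gap_classes[0]],
       lower_boundaries[gap_classes[-1]] + class_width) if gap_classes else None

print("Most frequent class:", modal_class)
print("Gap:", gap)
```

A single value sitting beyond a gap, as here, is a candidate outlier and should be investigated rather than deleted automatically.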
Frequencies are helpful, but it is also useful to understand how large each class is
relative to the total. To find this, divide the frequency by the total to create a relative
frequency. If you have the relative frequencies for all of the classes, then you have a
relative frequency distribution. This gives you the proportion of the data that falls in
each class; multiply by 100 to get a percentage.
Example #2.2.3: Creating a Relative Frequency Table
Find the relative frequency for the monthly rent data.
Solution:
From example #2.2.1, the frequency distribution is reproduced in table #2.2.2.
Table #2.2.2: Frequency Distribution for Monthly Rent
Class Boundaries Frequency
349.5 – 664.5 4
664.5 – 979.5 8
979.5 – 1294.5 5
1294.5 – 1609.5 6
1609.5 – 1924.5 0
1924.5 – 2239.5 0
2239.5 – 2554.5 1
Divide each frequency by the number of data points to get relative frequency.
$$\frac{4}{24} \approx 0.17, \quad \frac{8}{24} \approx 0.33, \quad \frac{5}{24} \approx 0.21, \quad \ldots$$
Table #2.2.3: Relative Frequency Distribution for Monthly Rent
Class Boundaries     Frequency    Relative Frequency
349.5 – 664.5        4            0.17
664.5 – 979.5        8            0.33
979.5 – 1294.5       5            0.21
1294.5 – 1609.5      6            0.25
1609.5 – 1924.5      0            0
1924.5 – 2239.5      0            0
2239.5 – 2554.5      1            0.04
TOTAL                24           1
The relative frequencies should add up to 1 or 100%. (This might be off a little
due to rounding errors.)
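As a quick check, the relative frequencies can be computed in Python. This is a minimal sketch using the frequencies from the table above; rounding to two decimal places matches the table, and the variable names are illustrative.

```python
frequencies = [4, 8, 5, 6, 0, 0, 1]   # from Table #2.2.2
total = sum(frequencies)

# Relative frequency = frequency / total number of data points.
relative = [round(f / total, 2) for f in frequencies]

print(relative)
print(round(sum(relative), 2))
```

Here the rounded values happen to add up to exactly 1. With other data the sum of rounded values can differ slightly from 1, which is the rounding error mentioned above.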
The graph of the relative frequency is known as a relative frequency histogram. It looks
identical to the frequency histogram, but the vertical axis is relative frequency instead of
just frequencies.
Example #2.2.4: Drawing a Relative Frequency Histogram
Draw a relative frequency histogram for the monthly rent distribution from
example #2.2.1.
Solution:
The class boundaries are plotted on the horizontal axis and the relative
frequencies are plotted on the vertical axis. In StatCrunch click on My Data and
then click Chapter02DataFile. Follow the directions above using the column
called “Monthly Rent”. Under “Type” you want to choose Relative Frequency.
Also from the earlier example we computed the first lower class boundary to be
349.5 and the class width to be 315. Use these as the starting point for the bins
and as the bin width, respectively.
Graph #2.2.2: Relative Frequency Histogram for Monthly Rent
Notice the shape of the relative frequency distribution is the same as the shape of
the frequency distribution. The only difference is that the vertical axis now has
relative frequencies instead of frequencies.
Shapes of the distribution:
The point of this chapter is not just to be able to MAKE a frequency table or a graph of
the data. One of the characteristics we will be interested in later is the SHAPE of the
distribution. Before drawing inferences using the results of a set of sample data, often
you need to first look at the histogram and look at three things: shape, center and spread.
We will discuss shape here and we will discuss measures of center and spread in the next
chapter.
Below are some of the common distribution shapes we will see this semester.
bell-shaped (symmetric)
uniform (symmetric)
right-skewed (not symmetric)
left-skewed (not symmetric)
Some shapes are symmetric and some are not. Symmetric means that you can fold the
graph in half down the middle and the two sides will line up. You can think of the two
sides as being mirror images of each other. Skewed means one “tail” of the graph is
longer than the other. The graph is skewed in the direction of the longer tail.
Another feature of interest is how many peaks a graph has. Modality refers to the number of
peaks: a unimodal graph has one peak and a bimodal graph has two peaks. Usually if a graph
has more than two peaks, the modal information is no longer of interest.
Other important features to consider are gaps between bars, a repetitive pattern, how
spread out the data are, and where the center of the graph is.
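The definitions of symmetric and skewed can be turned into a very rough Python sketch. This is a crude heuristic for illustration only, not a standard statistical test: the function `describe_shape` is hypothetical, it looks only at the tallest bar, and the frequency lists below are made-up examples rather than data from this chapter. It applies the "fold in half" idea by comparing the list with its reverse, and otherwise reports the side with the longer tail.

```python
def describe_shape(frequencies):
    """Crude shape label from a list of class frequencies (illustrative only)."""
    # Folding the graph in half: the bars read the same in both directions.
    if frequencies == frequencies[::-1]:
        return "symmetric"
    peak = frequencies.index(max(frequencies))
    left_tail = peak                           # classes left of the tallest bar
    right_tail = len(frequencies) - 1 - peak   # classes right of the tallest bar
    if right_tail > left_tail:
        return "right-skewed"
    if left_tail > right_tail:
        return "left-skewed"
    return "roughly symmetric"

# Hypothetical frequency lists, one per shape.
print(describe_shape([1, 3, 6, 3, 1]))      # bell-shaped
print(describe_shape([4, 4, 4, 4]))         # uniform
print(describe_shape([2, 8, 5, 3, 1, 1]))   # long tail to the right
print(describe_shape([1, 1, 3, 5, 8, 2]))   # long tail to the left
```

Real histograms are rarely perfectly symmetric, so in practice the shape is judged by eye; the sketch only makes the definitions concrete.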
Examples of graphs:
Graph #2.2.6: Symmetric, Unimodal Graph
Graph #2.2.7: Symmetric, Bimodal Graph
Graph #2.2.8: Skewed Right Graph
Graph #2.2.9: Skewed Left Graph with a Gap
Graph #2.2.10: Uniform Graph
Example #2.2.7: Creating a Frequency Distribution and Histogram
The following data represent the percent change in tuition levels at public, four-
year colleges (inflation adjusted) from 2008 to 2013 (Weissmann, 2013). Create a
frequency distribution and histogram for the data using 8 classes.
Table #2.2.5: Data of Tuition Levels at Public, Four-Year Colleges
19.5% 40.8% 57.0% 15.1% 17.4% 5.2% 13.0% 15.6%
51.5% 15.6% 14.5% 22.4% 19.5% 31.3% 21.7% 27.0%
13.1% 26.8% 24.3% 38.0% 21.1% 9.3% 46.7% 14.5%
78.4% 67.3% 21.1% 22.4% 5.3% 17.3% 17.5% 36.6%
72.0% 63.2% 15.1% 2.2% 17.5% 36.7% 2.8% 16.2%
20.5% 17.8% 30.1% 63.6% 17.8% 23.2% 25.3% 21.4%
28.5% 9.4%
Solution:
First identify the individual, variable and type of variable.
Individual: a randomly selected public, four-year college
Variable: percent change in tuition level
Type of variable: quantitative-continuous
1) Compute the class width:
$$\frac{\text{largest data value} - \text{smallest data value}}{\text{number of classes}} = \frac{78.4 - 2.2}{8} = \frac{76.2}{8} = 9.525$$
Since the data values have one decimal place, round this up to the next
tenth number. So the class width = 9.6
2) Compute the lowest class boundary:
Since the data are tenths numbers,
Lowest class boundary = smallest data value – 0.05 = 2.2 – 0.05 = 2.15
3) Create the classes.
The lower class boundaries start at 2.15 and you keep adding 9.6 as you go down the column.
The upper class boundaries start at 11.75 and you keep adding 9.6 as you go down the column.
Class Boundaries Tally Frequency
2.15 – 11.75
11.75 – 21.35
21.35 – 30.95
30.95 – 40.55
40.55 – 50.15
50.15 – 59.75
59.75 – 69.35
69.35 – 78.95
Here we now have 8 classes which is what was asked for and the
largest data value of 78.4 is contained in the last class.
4) Tally and find the frequency of the data:
Go through the data and put a tally mark in the appropriate class for each
piece of data by looking to see which class boundaries the data value is
between. Fill in the frequency by changing each of the tallies into a
number.
Table #2.2.6: Frequency Distribution for Tuition Levels at Public, Four-Year
Colleges
Class Boundaries Tally Frequency
2.15 – 11.75 |||| | 6
11.75 – 21.35 |||| |||| |||| |||| 20
21.35 – 30.95 |||| |||| | 11
30.95 – 40.55 |||| 4
40.55 – 50.15 || 2
50.15 – 59.75 || 2
59.75 – 69.35 ||| 3
69.35 – 78.95 || 2
Make sure the total of the frequencies is the same as the number of data points.
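The check that the frequencies add up to the number of data points can be sketched in Python. Because the data are tenths, this sketch uses the `decimal` module so that values such as 21.4 are compared exactly against boundaries such as 21.35; ordinary floating-point numbers could misplace a value that sits very close to a boundary. The variable names are illustrative.

```python
from decimal import Decimal

# Percent changes in tuition from Table #2.2.5 (percent signs dropped).
tuition = """
19.5 40.8 57.0 15.1 17.4 5.2 13.0 15.6
51.5 15.6 14.5 22.4 19.5 31.3 21.7 27.0
13.1 26.8 24.3 38.0 21.1 9.3 46.7 14.5
78.4 67.3 21.1 22.4 5.3 17.3 17.5 36.6
72.0 63.2 15.1 2.2 17.5 36.7 2.8 16.2
20.5 17.8 30.1 63.6 17.8 23.2 25.3 21.4
28.5 9.4
"""
values = [Decimal(v) for v in tuition.split()]

lowest = Decimal("2.15")   # lowest class boundary from step 2
width = Decimal("9.6")     # class width from step 1
num_classes = 8

frequencies = [0] * num_classes
for v in values:
    # Number of whole class widths above the lowest boundary.
    index = int((v - lowest) // width)
    frequencies[index] += 1

print(frequencies)
print(sum(frequencies), "data points in", len(values), "values")
```

If the two totals disagree, a value was skipped or tallied twice.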
To make the frequency histogram, the class boundaries are plotted on the
horizontal axis and the frequencies are plotted on the vertical axis. In StatCrunch
click on My Data and then click Chapter02DataFile. Follow the directions above
using the column called “Tuition Change”. Under “Type” you want to choose
Frequency. The lowest class boundary was 2.15 and the class width was 9.6.
These are used for where we start the bins and for the bin width respectively.
You can also check the box next to “value above the bar” to get the count for each
class.
Graph #2.2.11: Histogram for Tuition Levels at Public, Four-Year Colleges
If you want your x-axis to have the class boundaries on the sides of
the bars instead of the numbers above, click the menu button in the
lower left corner of the graph and then pick X-axis.
Next to “Tick marks” enter the list of class boundaries from your frequency table,
separated by commas.
After you apply the change, your graph will look as follows:
This graph is skewed right, with no gaps. The most frequent percent
increases in tuition were between 11.75% and 21.35%.
There are other types of graphs for quantitative data. They will be explored in the next
section.
Section 2.2: Homework
1.) The median incomes of males in each state of the United States, including the
District of Columbia and Puerto Rico, are given in table #2.2.9 ("Median income
of," 2013). Create a frequency distribution and a relative frequency distribution
using 7 classes.
Table #2.2.9: Data of Median Income for Males
$42,951 $52,379 $42,544 $37,488 $49,281 $50,987 $60,705
$50,411 $66,760 $40,951 $43,902 $45,494 $41,528 $50,746
$45,183 $43,624 $43,993 $41,612 $46,313 $43,944 $56,708
$60,264 $50,053 $50,580 $40,202 $43,146 $41,635 $42,182
$41,803 $53,033 $60,568 $41,037 $50,388 $41,950 $44,660
$46,176 $41,420 $45,976 $47,956 $22,529 $48,842 $41,464
$40,285 $41,309 $43,160 $47,573 $44,057 $52,805 $53,046
$42,125 $46,214 $51,630
2.) The median incomes of females in each state of the United States, including the
District of Columbia and Puerto Rico, are given in table #2.2.10 ("Median income
of," 2013). Create a frequency distribution and a relative frequency distribution
using 7 classes.
Table #2.2.10: Data of Median Income for Females
$31,862 $40,550 $36,048 $30,752 $41,817 $40,236 $47,476 $40,500
$60,332 $33,823 $35,438 $37,242 $31,238 $39,150 $34,023 $33,745
$33,269 $32,684 $31,844 $34,599 $48,748 $46,185 $36,931 $40,416
$29,548 $33,865 $31,067 $33,424 $35,484 $41,021 $47,155 $32,316
$42,113 $33,459 $32,462 $35,746 $31,274 $36,027 $37,089 $22,117
$41,412 $31,330 $31,329 $33,184 $35,301 $32,843 $38,177 $40,969
$40,993 $29,688 $35,890 $34,381
3.) The density of people per square kilometer for African countries is in table
#2.2.11 ("Density of people," 2013). Create a frequency distribution and a
relative frequency distribution using 8 classes.
Table #2.2.11: Data of Density of People per Square Kilometer
15 16 81 3 62 367 42 123
8 9 337 12 29 70 39 83
26 51 79 6 157 105 42 45
72 72 37 4 36 134 12 3
630 563 72 29 3 13 176 341
415 187 65 194 75 16 41 18
69 49 103 65 143 2 18 31
4.) The Affordable Care Act created a marketplace for individuals to purchase health
care plans. In 2014, the premiums for a random sample of 36 twenty-seven-year-olds
who signed up for the bronze level health insurance are given in table #2.2.12
("Health insurance marketplace," 2013). Create a frequency distribution and a
relative frequency distribution using 5 classes.
Table #2.2.12: Data of Health Insurance Premiums
$114 $119 $121 $125 $132 $139
$139 $141 $143 $145 $151 $153
$156 $159 $162 $163 $165 $166
$170 $170 $176 $177 $181 $185
$185 $186 $186 $189 $190 $192
$196 $203 $204 $219 $254 $286
5.) Create a histogram and relative frequency histogram for the data in table #2.2.9.
Describe the shape and any findings you can see from the graph.
6.) Create a histogram and relative frequency histogram for the data in table #2.2.10.
Describe the shape and any findings you can see from the graph.
7.) Create a histogram and relative frequency histogram for the data in table #2.2.11.
Describe the shape and any findings you can see from the graph.
8.) Create a histogram and relative frequency histogram for the data in table #2.2.12.
Describe the shape and any findings you can see from the graph.
9.) Students in a statistics class took their first test. The following are the scores they
earned. Create a frequency distribution and histogram for the data using a lower
class limit of 59.5 and a class width of 10. Describe the shape of the distribution.
Table #2.2.13: Data of Test 1 Grades
80 79 89 74 73 67 79
93 70 70 76 88 83 73
81 79 80 85 79 80 79
58 93 94 74
10.) Students in a statistics class took their first test. The following are the scores they
earned. Create a frequency distribution and histogram for the data using a lower
class limit of 59.5 and a class width of 10. Describe the shape of the distribution.
Compare to the graph in question 9.
Table #2.2.14: Data of Test 1 Grades
67 67 76 47 85 70
87 76 80 72 84 98
84 64 65 82 81 81
88 74 87 83
Section 2.3: Other Graphical Representations of Data
There are many other types of graphs. The following is a description of the stem-and-leaf
plot and the scatter plot.
Stem-and-Leaf Plots
Stem-and-leaf plots are a quick and easy way to look at small samples of numerical data.
You can look for any patterns or any strange data values. It is easy to compare two
samples using stem plots.
The first step is to divide each number into 2 parts, the stem (such as the leftmost digit)
and the leaf (such as the rightmost digit). There are no set rules; you just have to look at
the data and see what makes sense.
Example #2.3.1: Stem-and-Leaf Plot for Grade Distribution
The following are the percentage grades of 25 students from a statistics course.
Draw a stem-and-leaf plot of the data.
Table #2.3.1: Data of Test Grades
62 87 81 69 87 62 45 95 76 76
62 71 65 67 72 80 40 77 87 58
84 73 93 64 89
Solution:
First identify the individual, variables and type of variables.
Individual: a randomly selected student from a statistics course
Variable: percentage grade
Type of variable: quantitative
Divide each number so that the tens digit is the stem and the ones digit is the leaf.
62 becomes 6|2.
Make a vertical chart with the stems on the left of a vertical bar. Be sure to fill in
any missing stems. In other words, the stems should have equal spacing (for
example, count by ones or count by tens). The graph #2.3.1 shows the stems for
this example.
Graph #2.3.1: Stem-and-Leaf plot for Test Grades Step 1
4
5
6
7
8
9
Now go through the list of data and add the leaves. Put each leaf next to its
corresponding stem. Don’t worry about order yet; just get all the leaves down.
When the data value 62 is placed on the plot it looks like the plot in graph #2.3.2.
Graph #2.3.2: Stem-and-Leaf for Test Grades Step 2
4
5
6 2
7
8
9
When the data value 87 is placed on the plot it looks like the plot in graph #2.3.3.
Graph #2.3.3: Stem-and-Leaf for Test Grades Step 3
4
5
6 2
7
8 7
9
Fill in the rest of the leaves to obtain the plot in graph #2.3.4.
Graph #2.3.4: Stem-and-Leaf for Test Grades Step 4
4 5 0
5 8
6 2 9 2 2 5 7 4
7 6 6 1 2 7 3
8 7 1 7 0 7 4 9
9 5 3
Now finish the graph. Add a title and sort the leaves in each row into increasing
order. You also need to tell people what the stems and leaves mean by inserting a
legend. Be careful to line the leaves up in columns, because you need to be able to
compare the lengths of the rows when you interpret the graph. The final stem plot
for the test grade data is in graph #2.3.5.
Graph #2.3.5: Stem-and-Leaf for Test Grades
Test Scores
4 | 0 = 40%
4 | 0 5
5 | 8
6 | 2 2 2 4 5 7 9
7 | 1 2 3 6 6 7
8 | 0 1 4 7 7 7 9
9 | 3 5
Now you can interpret the stem-and-leaf display. The data is bimodal and
somewhat symmetric. There are no gaps in the data. The center of the
distribution is around 70.
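The hand construction above can be mirrored in a short Python sketch. It assumes two-digit whole numbers, so the tens digit is the stem and the ones digit is the leaf; the names `stems` and `plot` are illustrative.

```python
# Test grades from Table #2.3.1.
grades = [62, 87, 81, 69, 87, 62, 45, 95, 76, 76,
          62, 71, 65, 67, 72, 80, 40, 77, 87, 58,
          84, 73, 93, 64, 89]

# Include every stem from the smallest to the largest so none are missing.
stems = {stem: [] for stem in range(min(grades) // 10, max(grades) // 10 + 1)}

# Tens digit is the stem, ones digit is the leaf.
for g in grades:
    stems[g // 10].append(g % 10)

# Sort the leaves in each row into increasing order.
plot = {stem: sorted(leaves) for stem, leaves in stems.items()}

print("Test Scores   (4 | 0 = 40%)")
for stem, leaves in plot.items():
    print(f"{stem} | {' '.join(str(leaf) for leaf in leaves)}")
```

Printing each leaf with the same width keeps the rows aligned, so the row lengths can be compared just as in the hand-drawn plot.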
TECHNOLOGY: STEM-AND-LEAF PLOT
Using StatCrunch:
Enter the data into a column in the spreadsheet (see earlier instructions on entering
a list of data)
Click Graph, Stem and Leaf
In the popup window that opens choose the variable name from “Select Columns”
Under “Leaf Unit” choose the place value that will be in the leaf (for example if the
data set has 2-digit whole numbers, the number in the ones place would be the leaf
so you would choose “1” from the drop-down list).
Under “Outlier trimming” choose None.
Then click “Compute!”
If you follow the above directions to use StatCrunch to make the stem-and-leaf
plot for the data in Example #2.3.1, you would get the following:
Notice that StatCrunch will split all of the stems into two stems if there are too
many leaves for a given stem.
Scatter Plot
Sometimes you have two different quantitative variables and you want to see if they are
related in any way. A scatter plot helps you to see what the relationship would look like.
A scatter plot is just a plotting of the ordered pairs.
TECHNOLOGY: SCATTERPLOT
Using StatCrunch:
Enter the data into two columns in the spreadsheet, one for each variable (see
earlier instructions on entering a list of data)
Click Graph, Scatter Plot
In the popup window that opens choose the X Variable and Y Variable from the
drop-down menus
Under “Graph properties” you can put a title
Then click “Compute!”
Using your TI84:
First push STAT 1 and enter the data into L1 and L2
Push 2nd Y= to open the STAT PLOTS menu. Then push 1 to select Plot1
You need to make your input screen look like the screen below (you may have
different list names depending on where you put your data)
Then push ZOOM 9 to see the scatterplot
Example #2.3.2: Scatter Plot
Is there any relationship between elevation and high temperature on a given day?
The following data are the high temperatures at various cities on a single day and
the elevation of the city.
Table #2.3.2: Data of Temperature versus Elevation
Elevation (in feet) 7000 4000 6000 3000 7000 4500 5000
Temperature (°F) 50 60 48 70 55 55 60
Solution:
First identify the individual, variables and type of variables.
Individual: a randomly selected city
Variable 1: elevation
Variable 2: temperature
Both variables are quantitative-continuous
In StatCrunch click on My Data and then click Chapter02DataFile. Follow the
directions above using the column called “Elevation (ft)” as the X variable and
the column called “Temperature (F)” as the Y variable to get the following:
Graph #2.3.6: Scatter Plot of Temperature versus Elevation
Looking at the graph, it appears as elevation increases, the temperature decreases.
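Since a scatter plot is just a plot of the ordered pairs, the same impression can be checked informally in Python. This sketch pairs each elevation with its temperature, sorts the pairs by elevation, and compares the average temperature of the three lowest cities with that of the three highest. Comparing groups of three is an arbitrary choice made for illustration, not a formal measure of the relationship.

```python
# Data from Table #2.3.2.
elevation = [7000, 4000, 6000, 3000, 7000, 4500, 5000]   # feet
temperature = [50, 60, 48, 70, 55, 55, 60]               # degrees F

# Ordered pairs (elevation, temperature), sorted by elevation.
pairs = sorted(zip(elevation, temperature))
for x, y in pairs:
    print(f"({x}, {y})")

# Average temperature at the three lowest and three highest elevations.
low = [y for _, y in pairs[:3]]
high = [y for _, y in pairs[-3:]]
mean_low = sum(low) / len(low)
mean_high = sum(high) / len(high)

print("Mean temperature, three lowest elevations:", round(mean_low, 1))
print("Mean temperature, three highest elevations:", round(mean_high, 1))
```

The lower cities are warmer on average, which agrees with the downward pattern seen in the graph.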
Section 2.3: Homework
1.) Students in a statistics class took their first test. The data in table #2.3.4 are the
scores they earned. Create a stem-and-leaf plot.
Table #2.3.4: Data of Test 1 Grades
80 79 89 74 73 67 79
93 70 70 76 88 83 73
81 79 80 85 79 80 79
58 93 94 74
2.) Students in a statistics class took their first test. The data in table #2.3.5 are the
scores they earned. Create a stem-and-leaf plot. Compare to the graph in
question 1.
Table #2.3.5: Data of Test 1 Grades
67 67 76 47 85 70
87 76 80 72 84 98
84 64 65 82 81 81
88 74 87 83
3.) When an anthropologist finds skeletal remains, they need to figure out the height
of the person. The height of a person (in cm) and the length of one of their
metacarpal bones (in mm) were recorded for a random sample of 9 living adults
and are in table #2.3.6 ("Prediction of height," 2013). Create a scatter plot and
state if there is a relationship between the height of a person and the length of
their metacarpal.
Table #2.3.6: Data of Metacarpal versus Height
Length of Metacarpal    Height of Person
45 171
51 178
39 157
41 163
48 172
49 183
46 173
43 175
47 173
4.) Table #2.3.7 contains the value of the house and the amount of annual rental
income that the house brings in ("Capital and rental," 2013). Create a
scatter plot and state if there is a relationship between the value of the house and
the annual rental income.
Table #2.3.7: Data of House Value versus Rental
Value Rental Value Rental Value Rental Value Rental
81000 6656 77000 4576 75000 7280 67500 6864
95000 7904 94000 8736 90000 6240 85000 7072
121000 12064 115000 7904 110000 7072 104000 7904
135000 8320 130000 9776 126000 6240 125000 7904
145000 8320 140000 9568 140000 9152 135000 7488
165000 13312 165000 8528 155000 7488 148000 8320
178000 11856 174000 10400 170000 9568 170000 12688
200000 12272 200000 10608 194000 11232 190000 8320
214000 8528 208000 10400 200000 10400 200000 8320
240000 10192 240000 12064 240000 11648 225000 12480
289000 11648 270000 12896 262000 10192 244500 11232
325000 12480 310000 12480 303000 12272 300000 12480
5.) The World Bank collects information on the life expectancy of a person in each
country ("Life expectancy at," 2013) and the fertility rate (number of births per
woman) in the country ("Fertility rate," 2013). The data for 24 randomly selected
countries for the year 2011 are in table #2.3.8. Create a scatter plot of the data
and state if there appears to be a relationship between fertility rate and life
expectancy.
Table #2.3.8: Data of Life Expectancy versus Fertility Rate
Fertility Rate    Life Expectancy    Fertility Rate    Life Expectancy
1.7 77.2 3.9 72.3
5.8 55.4 1.5 76.0
2.2 69.9 4.2 66.0
2.1 76.4 5.2 55.9
1.8 75.0 6.8 54.4
2.0 78.2 4.7 62.9
2.6 73.0 2.1 78.3
2.8 70.8 2.9 72.1
1.4 82.6 1.4 80.7
2.6 68.9 2.5 74.2
1.5 81.0 1.5 73.3
6.9 54.2 2.4 67.1
6.) The World Bank collected data on the percentage of gross domestic product
(GDP) that a country spends on health expenditures ("Health expenditure," 2013)
and the percentage of women receiving prenatal care ("Pregnant woman
receiving," 2013). The data for the countries where this information is available
for the year 2011 are in table #2.3.9. Create a scatter plot of the data and state if
there appears to be a relationship between percentage spent on health expenditure
and the percentage of women receiving prenatal care.
Table #2.3.9: Data of Prenatal Care versus Health Expenditure
Prenatal Care (%)  Health Expenditure (% of GDP)
47.9 9.6
54.6 3.7
93.7 5.2
84.7 5.2
100.0 10.0
42.5 4.7
96.4 4.8
77.1 6.0
58.3 5.4
95.4 4.8
78.0 4.1
93.3 6.0
93.3 9.5
93.7 6.8
89.8 6.1
7.) State everything that makes graph #2.3.9 a misleading or poor graph.
Graph #2.3.9: Example of a Poor Graph
8.) State everything that makes graph #2.3.10 a misleading or poor graph (Benen,
2011).
Graph #2.3.10: Example of a Poor Graph
9.) State everything that makes graph #2.3.11 a misleading or poor graph ("United
States unemployment," 2013).
Graph #2.3.11: Example of a Poor Graph
10.) State everything that makes graph #2.3.12 a misleading or poor graph.
Graph #2.3.12: Example of a Poor Graph
Data Sources:
B1 assets of financial institutions. (2013, June 27). Retrieved from
www.rba.gov.au/statistics/tables/xls/b01hist.xls
Benen, S. (2011, September 02). [Web log message]. Retrieved from
http://www.washingtonmonthly.com/political-animal/2011_09/gop_leaders_stop_taking_credit031960.php
Capital and rental values of Auckland properties. (2013, September 26). Retrieved from
http://www.statsci.org/data/oz/rentcap.html
Contraceptive use. (2013, October 9). Retrieved from
http://www.prb.org/DataFinder/Topic/Rankings.aspx?ind=35
Deaths from firearms. (2013, September 26). Retrieved from
http://www.statsci.org/data/oz/firearms.html
DeNavas-Walt, C., Proctor, B., & Smith, J. U.S. Department of Commerce, U.S. Census
Bureau. (2012). Income, poverty, and health insurance coverage in the United States:
2011 (P60-243). Retrieved from website: www.census.gov/prod/2012pubs/p60-243.pdf
Density of people in Africa. (2013, October 9). Retrieved from
http://www.prb.org/DataFinder/Topic/Rankings.aspx?ind=30&loc=249,250,251,252,253,254,34227,255,257,258,259,260,261,262,263,264,265,266,267,268,269,270,271,272,274,275,276,277,278,279,280,281,282,283,284,285,286,287,288,289,290,291,292,294,295,296,297,298,299,300,301,302,304,305,306,307,308
Department of Health and Human Services, ASPE. (2013). Health insurance marketplace
premiums for 2014. Retrieved from website:
http://aspe.hhs.gov/health/reports/2013/marketplacepremiums/ib_premiumslandscape.pdf
Electricity usage. (2013, October 9). Retrieved from
http://www.prb.org/DataFinder/Topic/Rankings.aspx?ind=162
Fertility rate. (2013, October 14). Retrieved from
http://data.worldbank.org/indicator/SP.DYN.TFRT.IN
Fuel oil usage. (2013, October 9). Retrieved from
http://www.prb.org/DataFinder/Topic/Rankings.aspx?ind=164
Gas usage. (2013, October 9). Retrieved from
http://www.prb.org/DataFinder/Topic/Rankings.aspx?ind=165
Health expenditure. (2013, October 14). Retrieved from
http://data.worldbank.org/indicator/SH.XPD.TOTL.ZS
Hinatov, M. U.S. Consumer Product Safety Commission, Directorate of Epidemiology.
(2012). Incidents, deaths, and in-depth investigations associated with non-fire carbon
monoxide from engine-driven generators and other engine-driven tools, 1999-2011.
Retrieved from website: http://www.cpsc.gov/PageFiles/129857/cogenerators.pdf
Life expectancy at birth. (2013, October 14). Retrieved from
http://data.worldbank.org/indicator/SP.DYN.LE00.IN
Median income of males. (2013, October 9). Retrieved from
http://www.prb.org/DataFinder/Topic/Rankings.aspx?ind=137
Median income of males. (2013, October 9). Retrieved from
http://www.prb.org/DataFinder/Topic/Rankings.aspx?ind=136
Prediction of height from metacarpal bone length. (2013, September 26). Retrieved from
http://www.statsci.org/data/general/stature.html
Pregnant woman receiving prenatal care. (2013, October 14). Retrieved from
http://data.worldbank.org/indicator/SH.STA.ANVC.ZS
United States unemployment. (2013, October 14). Retrieved from
http://www.tradingeconomics.com/united-states/unemployment-rate
Weissmann, J. (2013, March 20). A truly devastating graph on state higher education
spending. The Atlantic. Retrieved from
http://www.theatlantic.com/business/archive/2013/03/a-truly-devastating-graph-on-state-higher-education-spending/274199/