Data collection and presentation

4,777 views

A presentation on data collection and presentation and their basic techniques.

Published in: Education

  1. 1. Presented by:
       • Nasif Hassan Khan Abir, ID # 61531-24-007
       • Md. Ferdaus Alam, ID # 61531-24-010
       • Zakir Husain, ID # 61325-18-058
       • Md. Faruqul Islam, ID # 61325-18-029
  2. 2. Data Collection
     The collection, organization, and presentation of data are basic background material for learning descriptive and inferential statistics and their applications.
     Method of Collecting Data
     On the basis of the source of collection, data may be classified as:
       • Primary data
       • Secondary data
     Types of Data
     There are two types of data:
       • Numerical data
       • Categorical data
  3. 3. Primary Data
     Data that are collected for the first time, for the purpose of the survey at hand, are called primary data. For example, facts collected by an investigator about the habit of taking tea or coffee in a village.
     Method of Collecting Primary Data
     There are several methods for collecting primary data. Some of them are:
       • Direct personal investigation
       • Indirect investigation
       • Through correspondents
       • By mailed questionnaire
       • Through schedules
  4. 4. Secondary Data
     When we use data that have already been collected by others, the data are called secondary data. Such data are primary for the agency that collects them first and become secondary for all other users.
     Method of Collecting Secondary Data
       • Published reports of newspapers, RBI and periodicals
       • Publications from trade associations
       • Financial data reported in annual reports
       • Information from official publications
       • Publications of international bodies such as UNO, World Bank, etc.
       • Internal reports of government departments
       • Records maintained by institutions
       • Research reports prepared by students in universities
  5. 5. Categorical Data
     Categorical data is the statistical data type consisting of categorical variables, or of data that has been converted into that form, for example as grouped data. Examples: marital status, political party, eye color.
     Numerical Data
     Numerical values or observations can be measured, and these numbers can be placed in ascending or descending order. Numerical data can be divided into two groups:
       • Discrete (counted items, such as number of children or defects per hour)
       • Continuous (measured characteristics, such as weight or voltage)
  6. 6. EXAMPLES:
       • Nominal Data: categories (no ordering or direction). Examples: marital status, type of car owned.
       • Ordinal Data: ordered categories (rankings, order, or scaling). Examples: service quality rating, Standard & Poor's bond rating, student letter grades.
       • Interval Data: differences between measurements but no true zero. Examples: temperature in Fahrenheit, standardized exam score.
       • Ratio Data: differences between measurements, true zero exists. Examples: height, age, weekly food spending.
  7. 7. Presentation of Data
     Data collected in the form of schedules and questionnaires are not self-explanatory. They are raw data, and they must be made presentable before they become meaningful.
     Presentation of Categorical Data
     Categorical data can be presented in two ways:
       • Tabulating data (summary table)
       • Graphing data (bar chart, pie chart, Pareto diagram)
  8. 8. The summary table is a visualization that summarizes statistical information about data in table form.
     Example: Current Investment Portfolio
       Investment Type    Amount (in thousands $)    Percentage (%)
       Stocks                      46.5                  42.27
       Bonds                       32.0                  29.09
       CD                          15.5                  14.09
       Savings                     16.0                  14.55
       Total                      110.0                 100.0
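The percentage column of the summary table can be recomputed from the amounts. The following is a minimal sketch; the dictionary name `amounts` and the function name `percentage_shares` are illustrative choices, not part of the slides.

```python
# Amounts (in thousands of dollars) taken from the summary table on slide 8.
amounts = {"Stocks": 46.5, "Bonds": 32.0, "CD": 15.5, "Savings": 16.0}


def percentage_shares(amounts):
    """Return each category's share of the total, as a percentage rounded to 2 places."""
    total = sum(amounts.values())
    return {name: round(100 * value / total, 2) for name, value in amounts.items()}


shares = percentage_shares(amounts)
for name, value in amounts.items():
    # One row per investment type: name, amount, percentage.
    print(f"{name:<10}{value:>8.1f}{shares[name]:>8.2f}")
```

Rounding to two decimals matches the precision used in the table.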
  9. 9. Bar charts are often used for qualitative data (categories or nominal scale). The height of each bar shows the frequency or percentage for that category.
     [Figure: bar chart "Investor's Portfolio" for the previous summary table; categories: Stocks, Bonds, CD, Savings; axis: Amount in $1000's, 0 to 50]
  10. 10. Pie charts are often used for qualitative data (categories or nominal scale). The size of each pie slice shows the frequency or percentage for that category.
      [Figure: pie chart for the previous summary table]
  11. 11. Pareto Diagram
        • Used to portray categorical data
        • A bar chart in which categories are shown in descending order of frequency
        • A cumulative polygon is often shown in the same graph
        • Used to separate the “vital few” from the “trivial many”
      [Figure: Pareto diagram "Current Investment Portfolio"; categories in order: Stocks, Bonds, Savings, CD; bar graph: % invested in each category (0% to 45%); line graph: cumulative % invested (0% to 100%)]
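The ordering and cumulative line behind a Pareto diagram can be sketched as follows, using the portfolio amounts from the earlier summary table. The function name `pareto_table` is a hypothetical helper introduced only for this illustration.

```python
from itertools import accumulate

# Amounts (in thousands of dollars) from the investment portfolio summary table.
amounts = {"Stocks": 46.5, "Bonds": 32.0, "CD": 15.5, "Savings": 16.0}


def pareto_table(amounts):
    """Return (category, percent, cumulative percent) rows, largest category first."""
    ordered = sorted(amounts.items(), key=lambda item: item[1], reverse=True)
    total = sum(value for _, value in ordered)
    running = accumulate(value for _, value in ordered)  # running totals of the amounts
    return [
        (name, round(100 * value / total, 2), round(100 * cum / total, 2))
        for (name, value), cum in zip(ordered, running)
    ]


for name, pct, cum_pct in pareto_table(amounts):
    print(f"{name:<10}{pct:>8.2f}{cum_pct:>8.2f}")
```

The bars would be drawn from the second column and the cumulative polygon from the third.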
  12. 12. Numerical data can be presented in two ways:
        • Ordered Array (Stem-and-Leaf Display)
        • Frequency/Cumulative Distributions (Histogram, Polygon, Ogive)
      Ordered Array
        • A sequence of data in rank order
        • Shows range (min to max)
        • Provides some signals about variability within the range
        • May help identify outliers (unusual observations)
        • If the data set is large, the ordered array is less useful
      Example:
        • Data in raw form (as collected): 24, 26, 24, 21, 27, 27, 30, 41, 32, 38
        • Data in ordered array from smallest to largest: 21, 24, 24, 26, 27, 27, 30, 32, 38, 41
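As a small illustration of the ordered array, the raw example data can be sorted and its range read off directly. The variable names here are illustrative only.

```python
# Raw data from the ordered-array example, in the order collected.
raw = [24, 26, 24, 21, 27, 27, 30, 41, 32, 38]

ordered = sorted(raw)                 # ordered array, smallest to largest
smallest, largest = ordered[0], ordered[-1]
data_range = largest - smallest       # range = max - min

print(ordered)
print(f"min={smallest}, max={largest}, range={data_range}")
```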
  13. 13. A stem-and-leaf display is a simple way to see distribution details in a data set. To make this diagram, first separate the sorted data series into leading digits (the stems) and trailing digits (the leaves).
      The stems and leaves of 21, 38 and 41 are:
        Stem   Leaf
        2      1
        3      8
        4      1
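The stem-and-leaf split can be sketched in a few lines. This sketch assumes non-negative two-digit integers, so that the tens digit is the stem and the units digit is the leaf; the function name `stem_and_leaf` is a hypothetical helper.

```python
from collections import defaultdict


def stem_and_leaf(values):
    """Group sorted values by leading digit (stem), keeping trailing digits (leaves)."""
    groups = defaultdict(list)
    for value in sorted(values):
        stem, leaf = divmod(value, 10)  # assumption: two-digit non-negative integers
        groups[stem].append(leaf)
    return dict(groups)


# Data from the ordered-array example.
data = [24, 26, 24, 21, 27, 27, 30, 41, 32, 38]
for stem, leaves in stem_and_leaf(data).items():
    print(stem, "|", " ".join(str(leaf) for leaf in leaves))
```

Applied to the ten values of the ordered-array example, stem 2 collects the leaves 1, 4, 4, 6, 7, 7.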
  14. 14. What is a Frequency Distribution?
      A frequency distribution is a list or a table containing:
        • Class groupings (ranges within which the data fall)
        • The corresponding frequencies with which data fall within each grouping or category
      The reasons for using frequency distributions are:
        • It is a way to summarize numerical data
        • It condenses the raw data into a more useful form
        • It allows for a quick visual interpretation of the data
  15. 15. Class Intervals and Class Boundaries
        • Each class grouping has the same width
        • Determine the width of each interval by:
            width of interval = range / (number of desired class groupings)
        • Usually at least 5 but no more than 15 groupings
        • Class boundaries never overlap
        • Round up the interval width to get desirable endpoints
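The width rule can be worked through with a short sketch. The `step` parameter, which controls what counts as a "desirable" endpoint, is an assumption added for illustration; the slides only say to round up.

```python
import math


def class_width(data_range, n_classes, step=1):
    """Width of interval = range / number of classes, rounded up to a multiple of `step`."""
    raw_width = data_range / n_classes
    return math.ceil(raw_width / step) * step


# Example from the temperature data on slide 16: range 46, 5 classes.
# 46 / 5 = 9.2, which rounds up to 10.
print(class_width(46, 5))
print(class_width(46, 5, step=5))
```

Rounding up rather than to the nearest value guarantees that the chosen number of classes still covers the whole range.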
  16. 16. A manufacturer of insulation randomly selects 20 winter days and records the daily high temperature:
      24, 35, 17, 21, 24, 37, 26, 46, 58, 30, 32, 13, 12, 38, 41, 43, 44, 27, 53, 27
      To build the frequency distribution, follow these steps:
        • Sort raw data in ascending order: 12, 13, 17, 21, 24, 24, 26, 27, 27, 30, 32, 35, 37, 38, 41, 43, 44, 46, 53, 58
        • Find range: 58 - 12 = 46
        • Select number of classes: 5 (usually between 5 and 15)
        • Compute class interval (width): 10 (46/5 = 9.2, then round up)
        • Determine class boundaries (limits): 10, 20, 30, 40, 50, 60
        • Compute class midpoints: 15, 25, 35, 45, 55
        • Count observations and assign them to classes
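The final counting step can be sketched as below. One assumption is made that the slides do not state: each class includes its lower boundary and excludes its upper boundary (for example, 20 falls in the 20 to 30 class). The function name `frequency_distribution` is illustrative.

```python
# Daily high temperatures from the example (20 winter days).
temps = [24, 35, 17, 21, 24, 37, 26, 46, 58, 30,
         32, 13, 12, 38, 41, 43, 44, 27, 53, 27]
boundaries = [10, 20, 30, 40, 50, 60]


def frequency_distribution(values, boundaries):
    """Return (lower, upper, midpoint, frequency) for each class."""
    classes = list(zip(boundaries[:-1], boundaries[1:]))
    counts = [0] * len(classes)
    for value in values:
        for i, (lower, upper) in enumerate(classes):
            # Assumption: lower boundary inclusive, upper boundary exclusive.
            if lower <= value < upper:
                counts[i] += 1
                break
    return [(lower, upper, (lower + upper) / 2, count)
            for (lower, upper), count in zip(classes, counts)]


for lower, upper, midpoint, count in frequency_distribution(temps, boundaries):
    print(f"{lower} but less than {upper}: midpoint {midpoint}, frequency {count}")
```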
  17. 17.
        • A graph of the data in a frequency distribution is called a histogram
        • The class boundaries (or class midpoints) are shown on the horizontal axis
        • The vertical axis is either frequency, relative frequency, or percentage
        • Bars of the appropriate heights are used to represent the number of observations within each class
      Example: for the previous data the histogram should look like this. There are no gaps between the bars.
      [Figure: "Histogram: Daily High Temperature"; horizontal axis: class midpoints, 5 to 65; vertical axis: frequency, 0 to 7]
  18. 18. In a percentage polygon, the vertical axis shows the percentage of observations per class.
      Example: for the previous data the frequency polygon should look like this.
      [Figure: "Frequency Polygon: Daily High Temperature"; horizontal axis: class midpoints, 5 to 65; vertical axis: frequency, 0 to 7]
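The points of a percentage polygon can be derived from class frequencies. This sketch uses frequencies of 3, 6, 5, 4 and 2, obtained by counting the temperature data into the classes from slide 16 with lower boundaries treated as inclusive (an assumption). Closing the polygon at zero one class width beyond each end is a common convention, suggested here by the 5 and 65 ticks on the figure's axis.

```python
# Class midpoints from slide 16 and the frequencies counted for each class
# (assumption: lower boundary inclusive, upper boundary exclusive).
midpoints = [15, 25, 35, 45, 55]
frequencies = [3, 6, 5, 4, 2]
width = 10


def percentage_polygon(midpoints, frequencies, width):
    """Return (midpoint, percentage) points, closed at zero on both ends."""
    total = sum(frequencies)
    points = [(m, 100 * f / total) for m, f in zip(midpoints, frequencies)]
    # Convention: anchor the polygon at 0% one class width before and after.
    return [(midpoints[0] - width, 0.0)] + points + [(midpoints[-1] + width, 0.0)]


for midpoint, pct in percentage_polygon(midpoints, frequencies, width):
    print(f"{midpoint:>3}: {pct:.1f}%")
```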
  19. 19. The ogive is also known as the cumulative percent polygon.
      Example: for the previous data the ogive, or cumulative percent polygon, should look like this.
      [Figure: "Ogive: Daily High Temperature"; horizontal axis: class boundaries (not midpoints), 10 to 60; vertical axis: cumulative percentage, 0 to 100]
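The cumulative percentages plotted in an ogive can be sketched as follows. As before, the frequencies 3, 6, 5, 4 and 2 come from counting the temperature data into the slide 16 classes with lower boundaries inclusive (an assumption), and the function name `ogive_points` is illustrative.

```python
from itertools import accumulate

# Class boundaries from slide 16 and the frequency counted in each class.
boundaries = [10, 20, 30, 40, 50, 60]
frequencies = [3, 6, 5, 4, 2]


def ogive_points(boundaries, frequencies):
    """Return (boundary, cumulative percentage) points, starting at 0% on the first boundary."""
    total = sum(frequencies)
    cumulative = [100 * c / total for c in accumulate(frequencies)]
    # The ogive is plotted against upper class boundaries, not midpoints.
    return list(zip(boundaries, [0.0] + cumulative))


for boundary, cum_pct in ogive_points(boundaries, frequencies):
    print(f"less than {boundary}: {cum_pct:.0f}%")
```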
  20. 20. Guidelines for presenting data in graphs:
        • Do not distort the data
        • Avoid unnecessary adornments (no “chart junk”)
        • Use a scale for each axis on a two-dimensional graph
        • Begin the vertical axis scale at zero
        • Label all axes properly
        • Give the graph a title
        • Use the simplest graph for a given set of data
