Chapter 3
Introduction to Statistics
prepared by Ms Aida Idawati
Objectives
Successful students will be able to:
• Define the meaning of statistics and other terms
• Describe the types of statistics
• Describe the sources of data, the types of data and variables
• Understand the levels of measurement
• Become familiar with SPSS
1.1 Definition
• Statistics is a science that involves the efficient use of numerical data relating to a group of
individuals (or trials).
• Statistics is widely defined as the science of:
◦ Collecting
◦ Organizing
◦ Summarizing
◦ Analyzing
◦ Interpreting numerical data
to efficiently help the process of making decisions.
1.2 Populations & Samples
• An important distinction in statistics is the difference between a population and a sample:
◦ Population → the complete set of individuals, items or measurements of interest. It is usually so
large that collecting data on every member would be impractical or impossible.
◦ Sample → a subset of the population. Samples are collected and statistics are
calculated from them in order to draw conclusions about the population.
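The relationship between a population and a sample can be sketched in Python. This is only an illustration and not part of the original slides: the population of 10,000 heights is simulated, and the names `population` and `sample` and all numbers are assumptions chosen for the example.

```python
import random
import statistics

# Hypothetical population: 10,000 simulated heights in cm (assumed values).
rng = random.Random(42)  # fixed seed so the example is repeatable
population = [rng.gauss(160, 10) for _ in range(10_000)]

# A sample is a subset of the population, here 100 members chosen at random.
sample = rng.sample(population, 100)

# The population mean is usually unknown in practice; the sample mean
# is the statistic we calculate in order to estimate it.
population_mean = statistics.mean(population)
sample_mean = statistics.mean(sample)

print(f"Population mean: {population_mean:.2f}")
print(f"Sample mean:     {sample_mean:.2f}")
```

The sample mean will usually be close to, but not exactly equal to, the population mean. That gap is the reason conclusions about a population drawn from a sample always carry some uncertainty.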
1.3 Types of Statistics
• There are two types of statistics:
• Descriptive statistics
◦ Describe the sample data
• Inferential statistics
◦ Reach conclusions that go beyond the existing data
Definition of Descriptive Statistics
• The statistical methods used to describe the basic features of the data that have been collected in a
study.
• Consists of organizing and summarizing the information collected.
• Describes the information collected through numerical measurements, charts, graphs and tables.
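As a small sketch of what organizing and summarizing data can look like in practice, the Python snippet below computes a few common numerical summaries and a frequency table using only the standard library. The `scores` data are hypothetical and chosen purely for illustration.

```python
import statistics
from collections import Counter

# Hypothetical test scores for five students (assumed data).
scores = [70, 85, 85, 90, 60]

# Numerical summaries.
mean_score = statistics.mean(scores)      # arithmetic average
median_score = statistics.median(scores)  # middle value when sorted
mode_score = statistics.mode(scores)      # most frequent value
spread = statistics.stdev(scores)         # sample standard deviation

# A frequency table organizes the raw data by counting each value.
frequency_table = Counter(scores)

print(mean_score, median_score, mode_score, round(spread, 2))
for score, count in sorted(frequency_table.items()):
    print(score, count)
```

These summaries describe only the five scores that were collected; they say nothing yet about any larger group.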
Definition of Inferential Statistics
• Uses data that have been collected from a small group (the sample) to draw conclusions about the larger group (the population).
• These methods are used to make decisions.
• They include the t-test, analysis of variance (ANOVA), etc.
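To make the idea of a t-test concrete, here is a minimal sketch of a one-sample t statistic computed by hand in Python. The sample values and the hypothesized mean `mu_0` are assumptions made up for the example, and the critical value is the usual two-sided 5% entry for 4 degrees of freedom from a standard t table. In practice a package such as SPSS would report the test directly.

```python
import math
import statistics

# Hypothetical sample of 5 measurements (assumed data).
sample = [52, 48, 55, 50, 45]
mu_0 = 47  # hypothesized population mean (assumed)

n = len(sample)
sample_mean = statistics.mean(sample)
sample_sd = statistics.stdev(sample)       # uses n - 1 in the denominator
standard_error = sample_sd / math.sqrt(n)

# One-sample t statistic: how many standard errors the sample mean
# lies from the hypothesized population mean.
t_statistic = (sample_mean - mu_0) / standard_error

# Two-sided 5% critical value for n - 1 = 4 degrees of freedom,
# taken from a standard t table.
T_CRITICAL = 2.776
reject_null = abs(t_statistic) > T_CRITICAL

print(f"t = {t_statistic:.3f}, reject H0: {reject_null}")
```

With these numbers the t statistic falls below the critical value, so the sample does not give enough evidence to conclude that the population mean differs from 47.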
1.4 Sources of Data
There are two sources of data:
1. Primary data: information collected first-hand by the person doing the research, through
surveys, interviews, direct observations and experiments.
2. Secondary data: any material that has already been collected by someone else, e.g. data from the
Department of Statistics Malaysia.
Primary Data
• Data are collected by researchers for a specific
research purpose.
1. Surveys:
describing, recording, analyzing and interpreting conditions that exist
or existed by asking respondents.
i. Face-to-face interview
ii. Phone interview
iii. Questionnaire
2. Observations:
the information is obtained through the investigator's own direct observation,
without asking respondents.
3. Experiments:
investigators manipulate a variable to study its effect on respondents.
Face-to-face interview
Two-way communication.
The researcher asks questions directly to the respondent.
Advantages:
• Precise answers.
• Minimizes non-response.
• Allows for in-depth questioning.
Disadvantages:
• Expensive.
• The interviewer might influence the respondent's answers.
• Respondents may refuse to answer sensitive or personal questions.
Phone Interview
Advantages:
• Fast.
• Less costly.
• Wider respondent coverage.
• Less interviewer bias than a face-to-face interview.
Disadvantages:
• Information obtained might not represent the whole population.
• Limited interview duration.
• Not appropriate for long or thought-provoking questions.
• Low response rate (unanswered calls).
Questionnaire
• A set of questions used to obtain information relevant to a
study.
• Sent to respondents by post, by email or through a
website.
Advantages:
• Wider respondent coverage.
• Respondents have enough time to answer the questions.
• Minimizes interviewer bias.
• Cost effective.
Disadvantages:
• One-way interaction.
• Low response rate.
• Not suitable for numerous or difficult questions.
• Time consuming (faster online).
• The questionnaire may be answered by an unqualified respondent.
Observation
Observing and measuring specific characteristics without
attempting to modify the subjects being studied.
Human behaviour, objects and situations are recorded without
asking the respondent.
• E.g. in a study of consumer behaviour, instead of asking the
respondent which make of car they use, the investigator looks at
the car directly.
• Advantages:
◦ Direct observation of the actual situation.
◦ Minimizes response bias.
• Disadvantages:
◦ Limited to specific observable subjects or behaviour.
◦ May be time consuming.
Experiment
Investigators manipulate a variable to study its effect on
respondents.
E.g. a bank may conduct an experiment to find out what
attracts depositors: profit, security or liquidity.
Advantages:
• Designed to suit the purpose.
• May use actual clients (field) or volunteers (lab).
Disadvantages:
• May be costly.
1.5 Types of data
• There are two types of data:
1. Qualitative
◦ Places items into qualities or categories.
◦ Describes a classification or characteristic.
◦ Examples: gender, age group, occupation or course.
2. Quantitative
◦ Data that measure or count on a numerical scale.
◦ Quantitative data can be further classified as either discrete or continuous:
a. Discrete variables take countable values (e.g. number of children).
b. Continuous variables can take any value within a range (e.g. height).
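The distinction between qualitative, discrete and continuous data can be sketched in Python as follows. The function `describe_variable` and the three example columns are hypothetical, and the rule it applies (text is qualitative, whole numbers are discrete, decimals are continuous) is a rough heuristic assumed for this illustration, not a formal definition.

```python
from collections import Counter

def describe_variable(values):
    """Rough heuristic (an assumption of this sketch): text values are
    treated as qualitative, whole numbers as discrete, decimals as continuous."""
    if all(isinstance(v, str) for v in values):
        return "qualitative"
    if all(isinstance(v, int) for v in values):
        return "quantitative (discrete)"
    return "quantitative (continuous)"

# Hypothetical survey columns (assumed data).
occupation = ["teacher", "nurse", "teacher", "engineer"]   # categories
children = [0, 2, 1, 3]                                    # counted
height_cm = [158.2, 171.5, 165.0, 180.3]                   # measured

for name, column in [("occupation", occupation),
                     ("children", children),
                     ("height_cm", height_cm)]:
    print(name, "->", describe_variable(column))

# Qualitative data are summarized by counting categories, not by averaging.
print(Counter(occupation))
```

The heuristic has limits: a category stored as a numeric code (for example 1 = male, 2 = female) is still qualitative, so the final judgment depends on what the values mean rather than on how they are stored.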
Exercise 1
1. Identify each of the following as an example of an (1) attribute
(qualitative) or (2) numerical (quantitative) variable.
• The number of stop signs in towns of fewer than 500 people.
◦ Quantitative
• Whether or not a camera is defective.
◦ Qualitative
• The number of questions answered correctly on a standardized test.
◦ Quantitative
• The length of time required to answer a telephone call at a certain real
estate office.
◦ Quantitative
1.6 Variables
1.7 Levels of measurement
1. Nominal: categorizes responses without any order (e.g. gender, favorite color)
2. Ordinal: categories that can be ranked, allowing comparisons of degree (e.g. class rank; poor, average, good & excellent)
3. Interval: numerical scales in which equal intervals have the same interpretation (e.g. the difference between 30 degrees & 40
degrees)
4. Ratio: interval scales that also have a true zero point, so ratios are meaningful (e.g. the number of clients in the past six months)
Levels of measurement
Scale of Measurement   Criteria                                             Examples
Nominal                Categories                                           Types of flower, gender, colors, car brands, races
Ordinal                Categories, rank                                     Likert scale responses, class rank
Interval               Categories, rank, equal intervals                    Time periods, temperatures
Ratio                  Categories, rank, equal intervals, true zero point   Age, weight, height, time to complete a task
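Because each level adds a property to the one before it, the summaries that make sense also accumulate. The Python sketch below encodes that idea, following the commonly taught convention. The names `LEVELS`, `SUMMARIES_ADDED` and `meaningful_summaries` are hypothetical, and the list of summaries is illustrative rather than exhaustive.

```python
# Each level inherits the summaries of the levels below it.
LEVELS = ["nominal", "ordinal", "interval", "ratio"]

SUMMARIES_ADDED = {
    "nominal": ["counts", "mode"],
    "ordinal": ["median", "percentiles"],
    "interval": ["mean", "standard deviation"],
    "ratio": ["ratios between values"],
}

def meaningful_summaries(level):
    """Return the summaries that are meaningful at a given level."""
    position = LEVELS.index(level)
    result = []
    for lower_level in LEVELS[: position + 1]:
        result.extend(SUMMARIES_ADDED[lower_level])
    return result

for level in LEVELS:
    print(f"{level:8} -> {', '.join(meaningful_summaries(level))}")
```

For example, a mean is not listed for ordinal data because the gaps between ranks are not guaranteed to be equal.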
Likert Scale
Nominal
A qualitative variable that characterizes (or
describes/names) an element of a population.
Arithmetic operations are not meaningful for such data.
Order or rank cannot be assigned to the categories.
Examples:
Survey responses: Yes, No.
Gender: Male, Female.
Ordinal
A qualitative variable that incorporates an ordered
position, or ranking.
Differences between data values either cannot be
determined or are meaningless.
Examples:
Level of satisfaction: "very satisfied", "satisfied",
"somewhat satisfied".
Course grades: A, B, C, D or F.
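One way to work with ordinal data is to convert the categories to ranks and then use an order-based summary such as the median. The Python sketch below does this for the satisfaction levels listed above. The `responses` list is hypothetical survey data, and the names `ORDER` and `RANK` are chosen only for this illustration.

```python
import statistics

# Ordered categories from lowest to highest.
ORDER = ["somewhat satisfied", "satisfied", "very satisfied"]
RANK = {category: position for position, category in enumerate(ORDER, start=1)}

# Hypothetical survey responses (assumed data).
responses = ["satisfied", "very satisfied", "somewhat satisfied",
             "satisfied", "very satisfied"]

# Ranks preserve order only; the gaps between ranks carry no meaning.
ranks = [RANK[r] for r in responses]

# median_low always returns an actual data value, so it maps back to a category.
median_rank = statistics.median_low(ranks)
median_category = ORDER[median_rank - 1]

print(median_category)
```

The median is used rather than the mean because subtracting or averaging ranks would assume equal distances between categories, which ordinal data do not guarantee.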
Interval
Involves a quantitative variable.
A scale where distances between data are
meaningful.
Differences make sense, but ratios do not
(e.g. 30°C - 20°C = 20°C - 10°C, but 20°C
is not twice as hot as 10°C).
No natural zero
Examples:
Celsius temperatures are interval data: 25°C is
warmer than 20°C, and a 5°C difference has
some physical meaning.
Note that 0°C is arbitrary, so it does not make sense
to say that 20°C is twice as hot as 10°C.
The year 0 is arbitrary, and it is not sensible to
say that the year 2000 is twice as late as the
year 1000.
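The arithmetic behind this point can be checked with a short Python sketch. It converts the same two temperatures to kelvins, a scale whose zero is not arbitrary, and compares differences and ratios on both scales. The helper name `celsius_to_kelvin` is chosen for this illustration.

```python
def celsius_to_kelvin(celsius):
    """Convert degrees Celsius to kelvins."""
    return celsius + 273.15

cold_c, warm_c = 10.0, 20.0
cold_k, warm_k = celsius_to_kelvin(cold_c), celsius_to_kelvin(warm_c)

# Differences survive a change of scale: 10 degrees apart either way.
difference_c = warm_c - cold_c
difference_k = warm_k - cold_k

# Ratios do not: the Celsius ratio depends on where the arbitrary zero sits.
ratio_c = warm_c / cold_c   # 2.0, but not physically meaningful
ratio_k = warm_k / cold_k   # about 1.035

print(f"difference: {difference_c:.2f} C, {difference_k:.2f} K")
print(f"ratio: {ratio_c:.3f} (Celsius), {ratio_k:.3f} (kelvin)")
```

The difference is the same on both scales, while the ratio changes from 2 to roughly 1.035, which is why "twice as hot" is not a valid statement for Celsius temperatures.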
Ratio
A scale in which both intervals between values
and ratios of values are meaningful.
A real zero point.
Examples:
- Temperature measured in kelvins is on
a ratio scale because it has a meaningful
zero point (absolute zero).
- Physical measurements of height, weight and
length are typically ratio variables. It is
meaningful to say that 10 m is twice as long
as 5 m, because there is a natural
zero.
Exercise 2
Classify each type of data:
a. Ratings of newscasts in Malaysia (poor, fair, good, excellent).
   Answer: ordinal
b. Temperature of automatic popcorn poppers.
   Answer: interval
c. Marital status of respondents to a survey on savings accounts.
   Answer: nominal
d. Age of students enrolled in a martial arts course.
   Answer: ratio
e. Salaries of cashiers of C-Mart stores.
   Answer: ratio
Introduction to SPSS
• What is SPSS?
• SPSS (Statistical Package for the Social Sciences) is a versatile and responsive
program designed to carry out a range of statistical procedures.
• It is one of many statistical packages that you may come across if you pursue a career that
requires you to work with data; others include Stata and SAS.
• SPSS is a Windows-based program that can be used to perform data entry and
analysis and to create tables and graphs.
• SPSS is capable of handling large amounts of data and can perform all of the
analyses covered in the text and much more.
• SPSS is commonly used in the social sciences and in the business world, so
familiarity with this program should serve you well in the future.
• SPSS is updated often.
Layout of SPSS
• The Data Editor window has two views, which can be selected from the lower left-hand side
of the screen.
• Data View is where you see the data you are using.
• Variable View is where you can specify the format of your data when you are creating a file,
or check the format of a pre-existing file.
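As a rough analogy only (this is plain Python, not SPSS syntax, and nothing here calls SPSS), the two views can be pictured as two tables describing the same file: one entry per variable in Variable View and one row per case in Data View. The variable names and values below are hypothetical. The `measure` labels mirror the Nominal, Ordinal and Scale options of the SPSS Measure setting, where Scale covers both interval and ratio variables.

```python
# "Variable View": one entry per variable, describing its format.
variable_view = [
    {"name": "gender", "type": "String", "measure": "Nominal"},
    {"name": "rating", "type": "Numeric", "measure": "Ordinal"},
    {"name": "age", "type": "Numeric", "measure": "Scale"},
]

# "Data View": one row per respondent, one column per variable.
data_view = [
    {"gender": "F", "rating": 4, "age": 21},
    {"gender": "M", "rating": 3, "age": 23},
]

# Every column in the data should be defined in the variable definitions.
defined_names = {variable["name"] for variable in variable_view}
for row in data_view:
    assert set(row) == defined_names

# Print each variable's measure alongside its column of values.
for variable in variable_view:
    column = [row[variable["name"]] for row in data_view]
    print(variable["name"], variable["measure"], column)
```

The point of the analogy is that the format of each variable is defined once, separately from the data values themselves.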
• The other most commonly used SPSS window is the Viewer window, which displays the output
of any analyses that have been run, along with any error messages.
Exercise 3
1. Explain the differences between primary and secondary data.
2. Determine whether each of the following statements is TRUE or FALSE.
◦ If a researcher uses descriptive statistics, the researcher will be able to draw conclusions about the
population based on a sample.
◦ Marital status is an example of qualitative data.
◦ The highest level of measurement is the ratio level.
3. Determine whether each of the following variables is qualitative or quantitative.
◦ Number of children in a group of families
◦ Monthly amount spent on electricity for the last 12 months
◦ Favorite foods
◦ Most likely waiting period at a clinic (morning, afternoon or evening)
◦ The temperatures for 31 consecutive days in January
◦ Height of students in a classroom
◦ Monthly earnings of employees
