IS 4800 Empirical Research Methods
for Information Science
Class Notes Feb 3, 2012
Instructor: Prof. Carole Hafner, 446 WVH
hafner@ccs.neu.edu Tel: 617-373-5116
Course Web site: www.ccs.neu.edu/course/is4800sp12/
Outline
■ First exam postponed until Friday Feb. 10
■ (covers thru descriptive statistics – review Tues.)
■ Review/finish descriptive statistics
■ Survey methods
1. Survey administration
2. Constructing Questionnaires
3. Types of Questionnaire Items
4. Composite measures
5. Sampling
■ Discuss Team Project 1
Review Measurement Scales
■ Nominal – color, make/model of a car,
race/ethnicity, telephone number (!)
■ Ordinal – grades (4.0, 3.0 . . ); high, med, low
■ Not many found in natural world
■ Interval – a date, a time
■ Ratio – distance (height, length) in space or
time; weight, amt of money (cost, income)
Factors Affecting Your
Choice of a Scale of Measurement
■ Information Yielded
■ A nominal scale yields the least information.
■ An ordinal scale adds some crude information.
■ Interval and ratio scales yield the most information.
■ Statistical Tests Available
■ The statistical tests available for nominal and ordinal data
(nonparametric) are less powerful than those available for
interval and ratio data (parametric)
■ Use the scale that allows you to use the most powerful
statistical test
Descriptive Statistics
■ Frequency distributions, and bar charts or
histograms (covered last time)
■ Bar charts vs. histograms
■ Bar chart: categorical x-variable
• Exs: color vs. frequency; states in NE vs. population
■ Histogram: numeric x-variable
• Exs: height vs. frequency; family income vs. lifespan
■ Measure of central tendency and spread
■ Normal Distribution; Skewness
Measures of Center: Definition
■ Mode
■ Most frequent score in a distribution
■ Simplest measure of center
■ Scores other than the most frequent not considered
■ Limited application and value
■ Median
■ Central score in an ordered distribution
■ More information taken into account than with the mode
■ Relatively insensitive to outliers
■ Prefer when data is skewed
■ Used primarily when the mean cannot be used
■ Mean
■ Numerical average of all scores in a distribution
■ Value dependent on each score in a distribution
■ Most widely used and informative measure of center
Measures of Center: Use
■ Mode
■ Used if data are measured along a nominal scale
■ Median
■ Used if data are measured along an ordinal scale
■ Used if interval data do not meet requirements for using the
mean (skewed but unimodal), or if significant outliers
■ Mean
■ Used if data are measured along an interval or ratio scale
■ Most sensitive measure of center
■ Used if scores are normally distributed
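The effect of an outlier on each measure of center can be checked with a short sketch. The salary figures below are invented for illustration and are not data from the course; only Python's standard `statistics` module is used.

```python
import statistics

# Hypothetical salaries in $1000s; the last value is a deliberate outlier.
salaries = [42, 45, 45, 48, 51, 55, 300]

mode = statistics.mode(salaries)      # most frequent score
median = statistics.median(salaries)  # central score of the ordered data
mean = statistics.mean(salaries)      # numerical average of all scores

# The single outlier pulls the mean far above the median,
# while the mode and median stay with the bulk of the scores.
print("mode:", mode, "median:", median, "mean:", round(mean, 2))
```

With skewed data like this, the median describes a typical score better than the mean, which matches the guidance above.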
Measures of Spread: Definitions
■ Range
■ Subtract the lowest from the highest score in a distribution
of scores
■ Simplest and least informative measure of spread
■ Scores between extremes are not taken into account
■ Very sensitive to extreme scores
■ Interquartile Range
■ Less sensitive than the range to extreme scores
■ Used when you want a simple, rough estimate of spread
■ Variance
■ Average squared distance of scores from the mean
■ Standard Deviation
■ Square root of the variance
■ Most widely used measure of spread
Measures of Spread: Use
■ The range and standard deviation are sensitive to
extreme scores
■ In such cases the interquartile range is best
■ When your distribution of scores is skewed, the
standard deviation does not provide a good index of
spread
■ use the interquartile range
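A minimal sketch comparing the four measures of spread on a small, skewed set of scores. The scores are invented for illustration, and the quartiles use the default "exclusive" method of `statistics.quantiles`; other quartile conventions give slightly different values.

```python
import statistics

# Hypothetical scores; 300 is an extreme score.
scores = [42, 45, 45, 48, 51, 55, 300]

value_range = max(scores) - min(scores)          # highest minus lowest score
q1, q2, q3 = statistics.quantiles(scores, n=4)   # quartile cut points
iqr = q3 - q1                                    # interquartile range
variance = statistics.pvariance(scores)          # average squared distance from the mean
sd = statistics.pstdev(scores)                   # square root of the variance

# The range and standard deviation are inflated by the extreme score;
# the interquartile range is not.
print("range:", value_range, "IQR:", iqr, "SD:", round(sd, 1))
```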
Which measures of center and spread?
[Figures: one chart per variable]
■ Favorite Color (Red, Blue, Purple, Yellow, Pink, Orange, Green, Black, Grey, Tan)
■ Happiness
■ Salary
■ Student Year (Freshman, Sophomore, Middler, Junior, Senior)
■ Performance
■ Attitude Towards Computers
Example of a Boxplot
What is this?
[Figure: boxplot of IQ, axis from 0 to 150]
Calculating Mean and Variance
$$M = \frac{\sum X}{N}$$

$$SS = \sum (X - M)^2$$

$$SD^2 = \frac{SS}{N}$$
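The calculation of M, SS, and the variance can be traced step by step in code. This is a sketch using an invented set of eight scores, and it divides by N as in the formulas on this slide (not by N − 1).

```python
import math

def describe(scores):
    """Return the mean M, sum of squares SS, variance SD^2, and SD of the scores."""
    n = len(scores)
    m = sum(scores) / n                      # M = (sum of X) / N
    ss = sum((x - m) ** 2 for x in scores)   # SS = sum of (X - M)^2
    variance = ss / n                        # SD^2 = SS / N
    sd = math.sqrt(variance)                 # SD = square root of the variance
    return m, ss, variance, sd

# Hypothetical scores chosen so the arithmetic is easy to check by hand.
m, ss, variance, sd = describe([2, 4, 4, 4, 5, 5, 7, 9])
print(m, ss, variance, sd)  # 5.0 32.0 4.0 2.0
```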
Z-scores
• Measures that have been normalized to make
comparisons easier.
• Z-scores descriptives
– Mean?
– SD?
– Variance?
$$Z = \frac{X - M}{SD}$$
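One way to explore the descriptives of z-scores is to convert a set of scores and then summarize the result. The sketch below reuses invented scores and the population SD (dividing by N); the function name `z_scores` is only illustrative.

```python
import statistics

def z_scores(scores):
    """Convert raw scores to z-scores: Z = (X - M) / SD."""
    m = statistics.mean(scores)
    sd = statistics.pstdev(scores)  # population SD (divides by N)
    return [(x - m) / sd for x in scores]

z = z_scores([2, 4, 4, 4, 5, 5, 7, 9])

# After standardizing, the mean is 0 and the SD (and variance) is 1.
print("mean:", statistics.mean(z))
print("SD:", statistics.pstdev(z))
print("variance:", statistics.pvariance(z))
```

Because every set of z-scores has the same mean and SD, scores measured on different scales become directly comparable.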
Summary
■ Frequency distribution
■ Categorical data: Nominal and ordinal
■ Mode sometimes useful
■ Measure of central tendency
■ Scale data: Interval and ratio
■ Mean and median
■ Measure of dispersion
■ Scale data
■ Variance, standard deviation
■ The importance of presenting data graphically
Overview – Using Survey Research
1. Survey administration
2. Constructing Questionnaires
3. Types of Questionnaire Items
4. Composite measures
5. Sampling
Terminology Soup
■ Questionnaire = Self-Report
Measure = Instrument
■ Survey Instrument vs. Lab
Instrument
■ Composite Measure ~ Index
~ Scale
Using Survey Research
I. Survey administration
Administering Your Questionnaire
■ MAIL SURVEY
■ A questionnaire is mailed directly to participants
■ Mail surveys are very convenient
■ Nonresponse bias is a serious problem resulting in an
unrepresentative sample
■ INTERNET SURVEY
■ Survey distributed via e-mail or on a Web site
■ Large samples can be acquired quickly
■ Biased samples are possible because of uneven computer
ownership across demographic groups
■ Check out surveygizmo.com
Administering Your Questionnaire
■ TELEPHONE SURVEY
■ Participants are contacted by telephone and asked questions
directly
■ Questions must be asked carefully
■ The plethora of “junk calls” may make participants
suspicious
■ GROUP ADMINISTRATION
■ A questionnaire is distributed to a group of participants
at once (e.g., a class)
■ Completed by participants at the same time
■ Ensuring anonymity may be a problem
Administering Your Questionnaire
■ INTERVIEW
■ Participants are asked questions in a face-to-face structured
or unstructured format
■ Characteristics or behavior of the interviewer may affect the
participants’ responses
Administering Your Questionnaire
■ In general
■ Personal techniques (interview, phone) provide
higher response rates, but are more expensive and
may suffer from bias problems.
2. Overview of Questionnaire
Construction
Parts of a Questionnaire
■ In any study you normally want to collect
demographics – usually done through
questionnaire
■ Single items
■ Composite items
Questionnaire Construction
■ Items can be optional. Flow often depicted
verbally and/or pictorially.
14. Have you ever participated in the
Model Cities program?
[ ] Yes
[ ] No
If Yes: When did you last attend
a meeting?
_________________
Questionnaire Construction
■ Many heuristics for ordering questions, length
of surveys, etc. For example:
■ Put interesting questions first
■ Demonstrate relevance to what you’ve told
participants
■ Group questions into coherent groups
Questionnaire Construction
• Additional heuristics
– Organize questions into a coherent, visually
pleasing format
– Do not present demographic items first
– Place sensitive or objectionable items after less
sensitive/objectionable items
– Establish a logical navigational path
3. Types of Questionnaire Items
• Restricted (close-ended)
– Respondents are given a list of alternatives and
check the desired alternative
• Open-Ended
– Respondents are asked to answer a question in
their own words
• Partially Open-Ended
– An “Other” alternative is added to a restricted
item, allowing the respondent to write in an
alternative
Types of Questionnaire Items
• Rating Scale
– Respondents circle a number on a scale (e.g., 0 to
10) or check a point on a line that best reflects their
opinions
– Two factors need to be considered
• Number of points on the scale
• How to label (“anchor”) the scale (e.g., endpoints only or
each point)
Types of Questionnaire Items
– A Likert Scale is a scale used to assess attitudes
• Respondents indicate the degree of agreement or
disagreement to a series of statements
• I am happy.
Disagree 1 2 3 4 5 6 7 Agree
– A Semantic Differential Scale allows
participants to provide a rating within a bipolar
space
• How are you feeling right now?
Sad 1 2 3 4 5 6 7 Happy
Writing Good Items
■ Use simple words
■ Avoid vague questions
■ Don’t ask for too much information in one question
■ Avoid “check all that apply” items
■ Avoid questions that ask for more than one thing
■ Soften impact of sensitive questions
■ Avoid negative statements (usually)
Two Most Important Rules in
Designing Questionnaires?
■ Use an existing validated questionnaire if you
can find one.
■ If you must develop your own questionnaire,
pilot test it!
Acquiring A Survey Sample
■ You should obtain a representative sample
■ The sample closely matches the characteristics of
the population
■ A biased sample occurs when your sample
characteristics don’t match population
characteristics
■ Biased samples often produce misleading or
inaccurate results
■ Usually stem from inadequate sampling procedures
Sampling
■ Sometimes you really can measure the entire
population (e.g., workgroup, company), but
this is rare…
■ “Convenience sample”
■ Cases are selected only on the basis of feasibility
or ease of data collection.
Sampling Techniques
■ Simple Random Sampling
■ Randomly select a sample from the
population
■ Random digit dialing is a variant used with
telephone surveys
■ Reduces systematic bias, but does not
guarantee a representative sample
• Some segments of the population may be over- or underrepresented
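Simple random sampling can be sketched in a few lines. The population of 1,000 labelled people and the sample size of 50 are assumptions made up for this illustration; `random.sample` draws without replacement, so nobody is selected twice.

```python
import random

# A fixed seed is used only so the sketch gives the same sample on every run.
rng = random.Random(4800)

# Hypothetical sampling frame: every member of the population is listed once.
population = [f"person_{i}" for i in range(1000)]

# Every member has the same chance of selection.
sample = rng.sample(population, k=50)
print(sample[:5])
```

Nothing in this procedure forces the sample to mirror the population, which is why a simple random sample can still over- or underrepresent some segments.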
Sampling Techniques
■ Systematic Sampling
■ Every kth element is sampled after a randomly
selected starting point
• Sample every fifth name in the telephone book after
a random page and starting point selected, for
example
■ Empirically equivalent to random sampling
(usually)
• May still result in a non-representative sample
■ Easier than random sampling
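Systematic sampling takes every kth element after a random start. The sketch below assumes a hypothetical frame of 100 numbered entries and k = 5; the function name `systematic_sample` is illustrative only.

```python
import random

def systematic_sample(frame, k, rng):
    """Take every kth element of the frame after a random starting point."""
    start = rng.randrange(k)   # random start within the first k elements
    return frame[start::k]     # then every kth element from there

# Hypothetical frame, e.g. 100 names in a directory, numbered 0 to 99.
frame = list(range(100))
sample = systematic_sample(frame, 5, random.Random(0))
print(sample)
```

If the frame has a periodic pattern with the same period as k, the sample can be badly unrepresentative, which is one reason the equivalence to random sampling holds only "usually".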
Sampling Techniques
■ Stratified Sampling
■ Used to obtain a representative sample
■ Population is divided into (demographic) strata
• Focus also on variables that are related to other variables of interest
in your study (e.g., relationship between age and computer literacy)
■ A random sample of a fixed size is drawn from each
stratum
■ May still lead to over- or underrepresentation of certain
segments of the population
■ Proportionate Sampling
■ Same as stratified sampling except that the proportions of
different groups in the population are reflected in the
samples from the strata
Sampling Example:
■ You want to conduct a survey of job
satisfaction of all employees but can only
afford to contact 100 of them.
■ Personnel breakdown:
■ 50% Engineering
■ 25% Sales & Marketing
■ 15% Admin
■ 10% Management
■ Examples of
■ Stratified sampling?
■ Proportionate sampling?
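One possible reading of the example, sketched in code: with a budget of 100 contacts and four departments as strata, stratified sampling draws the same fixed number from each department, while proportionate sampling scales each draw to the department's share of personnel. Treating the departments as the strata and splitting the budget equally are assumptions of this sketch.

```python
# Personnel breakdown from the example, as whole percentages of the workforce.
percent = {"Engineering": 50, "Sales & Marketing": 25, "Admin": 15, "Management": 10}
budget = 100  # total number of employees we can afford to contact

# Stratified sampling: a random sample of the same fixed size from each stratum.
stratified = {dept: budget // len(percent) for dept in percent}

# Proportionate sampling: each stratum's sample size mirrors its population share.
proportionate = {dept: budget * pct // 100 for dept, pct in percent.items()}

print("stratified:   ", stratified)
print("proportionate:", proportionate)
```

Under the equal split, Management (10% of employees) supplies 25% of the sample, so its views are overrepresented unless the results are weighted.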
Sampling Techniques
■ Cluster Sampling
■ Used when populations are very large
■ The unit of sampling is a group rather than
individuals
■ Groups are randomly sampled from the population
(e.g., ten universities selected randomly, then
students are sampled at those schools)
Sampling Techniques
■ Multistage Sampling
■ Variant of cluster sampling
■ First, identify large clusters (e.g., all US
universities) and randomly sample from that
population
■ Second, sample individuals from randomly selected
clusters
■ Can be used along with stratified sampling to
ensure a representative sample (e.g. small vs. large,
liberal arts college vs. research university)
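The two stages of multistage sampling can be sketched as follows. The frame of 40 universities with 200 students each, and the choices of 10 clusters and 20 students per cluster, are invented for illustration.

```python
import random

rng = random.Random(1)  # fixed seed only to make the sketch repeatable

# Hypothetical frame: 40 universities (clusters), each with 200 students.
universities = {
    f"univ_{u}": [f"univ_{u}_student_{s}" for s in range(200)]
    for u in range(40)
}

# Stage 1: randomly sample clusters (universities).
chosen = rng.sample(sorted(universities), k=10)

# Stage 2: randomly sample individuals within each selected cluster.
sample = [
    student
    for univ in chosen
    for student in rng.sample(universities[univ], k=20)
]
print(len(chosen), "universities,", len(sample), "students")
```

Only the selected universities need a student list, which is what makes this approach practical for very large populations.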
Sampling and Statistics
■ If you select a random sample, the mean of that
sample will (in general) not be exactly the same as
the population mean. However, it represents an
estimate of the population mean
■ If you take two samples, one of males and one of
females, and compute the two sample means (let’s
say, of hourly pay), the difference between the two
sample means is an estimate of the difference
between the population means.
■ This is the basis of inferential statistics based on
samples
Sampling and Statistics (cont.)
■ The larger the sample, the better the estimate (the more
likely it is to be close to the population mean)
■ The variance/SD of the sample means is
related to the variance/SD of the population.
However, it is likely to be LESS (!) than the
population variance.
June 9, 2008 47
Inference with a Single Observation
• Each observation Xi in a random sample is a representative
of unobserved variables in population
• How different would this observation be if we took a
different random sample?
[Diagram: sampling goes from the population (parameter: μ) to the observation Xi; inference goes from the observation back to the population.]
Normal Distribution
• The normal distribution is a model for our overall
population
• Can calculate the probability of getting observations
greater than or less than any value
• Usually don’t have a single observation, but
instead the mean of a set of observations
Inference with Sample Mean
• Sample mean is our estimate of population mean
• How much would the sample mean change if we took a
different sample?
• Key to this question: Sampling Distribution of x̄
[Diagram: sampling goes from the population (parameter: μ) to the sample (statistic: x̄); inference (estimation) goes from the sample back to the population.]
Sampling Distribution of Sample Mean
• Distribution of values taken by statistic in all possible
samples of size n from the same population
• Model assumption: our observations xi are sampled from a
population with mean μ and variance σ²
[Diagram: from a population with unknown parameter μ, samples 1 through 8 (and so on), each of size n, each yield a sample mean x̄. What is the distribution of these values?]
Mean of Sample Mean
• First, we examine the center of the sampling distribution of
the sample mean.
• Center of the sampling distribution of the sample mean is
the unknown population mean:
mean( x̄ ) = μ
• Over repeated samples, the sample mean will, on average,
be equal to the population mean
– no guarantees for any one sample!
Variance of Sample Mean
• Next, we examine the spread of the sampling distribution
of the sample mean
• The variance of the sampling distribution of the sample
mean is
variance( x̄ ) = σ²/n
• As sample size increases, variance of the sample mean
decreases!
• Averaging over many observations is more accurate than just
looking at one or two observations
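The σ²/n relationship can be checked by simulation. This sketch assumes a made-up population of 10,000 uniform random values and draws each sample with replacement, so the observations are independent; none of the numbers come from the slides.

```python
import random
import statistics

rng = random.Random(2012)  # fixed seed only to make the sketch repeatable

# Hypothetical population: 10,000 values drawn uniformly between 0 and 10.
population = [rng.uniform(0, 10) for _ in range(10_000)]
sigma2 = statistics.pvariance(population)  # population variance

def variance_of_sample_means(n, repeats=2000):
    """Draw `repeats` samples of size n and return the variance of their means."""
    means = [statistics.fmean(rng.choices(population, k=n)) for _ in range(repeats)]
    return statistics.pvariance(means)

results = {n: variance_of_sample_means(n) for n in (1, 10, 100)}
for n, observed in results.items():
    print(f"n = {n:3d}: observed {observed:.3f}, sigma^2 / n = {sigma2 / n:.3f}")
```

Each tenfold increase in n should cut the variance of the sample mean by roughly a factor of ten.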
• Comparing the sampling distribution of the sample
mean when n = 1 vs. n = 10
Law of Large Numbers
• Remember the Law of Large Numbers:
• If one draws independent samples from a population
with mean μ, then as the sample size (n) increases, the
sample mean x̄ gets closer and closer to the population
mean μ
• This is easier to see now since we know that
mean(x̄) = μ
variance(x̄) = σ²/n → 0 as n gets large
Example
• Population: seasonal home-run totals for 7032
baseball players from 1901 to 1996
• Take different samples from this population and
compare the sample mean we get each time
• In real life, we can’t do this because we don’t usually
have the entire population!
Sample Size                     Mean    Variance
100 samples of size n = 1       3.69    46.8
100 samples of size n = 10      4.43    4.43
100 samples of size n = 100     4.42    0.43
100 samples of size n = 1000    4.42    0.06
Population parameter: μ = 4.42
Distribution of Sample Mean
• We now know the center and spread of the
sampling distribution for the sample mean.
• What about the shape of the distribution?
• If our data x1, x2, …, xn follow a Normal
distribution, then the sample mean x̄ will also
follow a Normal distribution!
Example
• Mortality in US cities (deaths/100,000 people)
• This variable seems to approximately follow a Normal
distribution, so the sample mean will also approximately
follow a Normal distribution
Central Limit Theorem
• What if the original data doesn’t follow a Normal
distribution?
• HR/Season for sample of baseball players
• If the sample is large enough, it doesn’t matter!
Central Limit Theorem
• If the sample size is large enough, then the
sample mean x̄ has an approximately Normal
distribution
• This is true no matter what the shape of the
distribution of the original data!
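The Central Limit Theorem can be illustrated with a simulation. This sketch assumes a strongly right-skewed exponential population (not the baseball data) and uses skewness as a rough indicator of non-Normal shape: values near 0 suggest a symmetric, bell-like distribution.

```python
import random
import statistics

rng = random.Random(60)  # fixed seed only to make the sketch repeatable

def skewness(values):
    """Average cubed z-score: about 0 for a symmetric distribution."""
    m = statistics.fmean(values)
    sd = statistics.pstdev(values)
    return statistics.fmean([((v - m) / sd) ** 3 for v in values])

def sample_means(n, repeats=3000):
    """Means of `repeats` samples of size n from a skewed exponential population."""
    return [
        statistics.fmean([rng.expovariate(1.0) for _ in range(n)])
        for _ in range(repeats)
    ]

skew_by_n = {n: skewness(sample_means(n)) for n in (1, 10, 100)}
for n, skew in skew_by_n.items():
    print(f"n = {n:3d}: skewness of the sample means = {skew:.2f}")
```

The skewness of the sample means should shrink toward 0 as n grows, even though the individual observations stay heavily skewed.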

Example: Home Runs per Season
• Take many different samples from the seasonal HR totals
for a population of 7032 players
• Calculate sample mean for each sample
[Figure: distributions of the sample mean for n = 1, n = 10, and n = 100]

  • 4. Factors Affecting Your Choice of a Scale of Measurement ■ Information Yielded ■ A nominal scale yields the least information. ■ An ordinal scale adds some crude information. ■ Interval and ratio scales yield the most information. ■ Statistical Tests Available ■ The statistical tests available for nominal and ordinal data (nonparametric) are less powerful than those available for interval and ratio data (parametric) ■ Use the scale that allows you to use the most powerful statistical test
  • 5. Descriptive Statistics ■ Frequency distributions, and bar charts or histograms (covered last time) ■ Bar charts vs. histograms ■ Bar chart: categorical x-variable • Exs: color vs. frequency; states in NE vs. population ■ Histogram: numeric x-variable • Exs: height vs. frequency; family income vs. lifespan ■ Measure of central tendency and spread ■ Normal Distribution; Skewness
  • 6. Measures of Center: Definition ■ Mode ■ Most frequent score in a distribution ■ Simplest measure of center ■ Scores other than the most frequent not considered ■ Limited application and value ■ Median ■ Central score in an ordered distribution ■ More information taken into account than with the mode ■ Relatively insensitive to outliers ■ Prefer when data is skewed ■ Used primarily when the mean cannot be used ■ Mean ■ Numerical average of all scores in a distribution ■ Value dependent on each score in a distribution ■ Most widely used and informative measure of center
  • 7. Measures of Center: Use ■ Mode ■ Used if data are measured along a nominal scale ■ Median ■ Used if data are measured along an ordinal scale ■ Used if interval data do not meet requirements for using the mean (skewed but unimodal), or if significant outliers ■ Mean ■ Used if data are measured along an interval or ratio scale ■ Most sensitive measure of center ■ Used if scores are normally distributed
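A minimal Python sketch of the three measures of center, using the standard `statistics` module. The salary figures are invented for illustration (they are not from the slides); the single large value is there to show why the median is preferred when the data are skewed or contain outliers.

```python
import statistics

# Hypothetical salaries in $1000s, with one large outlier (assumption).
salaries = [30, 32, 35, 35, 38, 40, 250]

mode = statistics.mode(salaries)      # most frequent score
median = statistics.median(salaries)  # central score of the ordered data
mean = statistics.mean(salaries)      # numerical average of all scores

print(f"mode={mode}  median={median}  mean={mean:.1f}")
```

Here the outlier pulls the mean well above the median, while the mode and median stay with the bulk of the scores.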
  • 8. Measures of Spread: Definitions ■ Range ■ Subtract the lowest from the highest score in a distribution of scores ■ Simplest and least informative measure of spread ■ Scores between extremes are not taken into account ■ Very sensitive to extreme scores ■ Interquartile Range ■ Less sensitive than the range to extreme scores ■ Used when you want a simple, rough estimate of spread ■ Variance ■ Average squared distance of scores from the mean ■ Standard Deviation ■ Square root of the variance ■ Most widely used measure of spread
  • 9. Measures of Spread: Use ■ The range and standard deviation are sensitive to extreme scores ■ In such cases the interquartile range is best ■ When your distribution of scores is skewed, the standard deviation does not provide a good index of spread ■ Use the interquartile range
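The measures of spread above can be sketched in Python as follows. The scores are made up for illustration, the helper name `spread` is hypothetical, and the quartiles use the default method of `statistics.quantiles` (other quartile conventions give slightly different interquartile ranges). Variance is computed as the population variance, i.e. the average squared distance from the mean.

```python
import statistics

def spread(scores):
    """Return range, interquartile range, variance and SD for a list of scores."""
    q1, _, q3 = statistics.quantiles(scores, n=4)  # quartile cut points
    return {
        "range": max(scores) - min(scores),
        "iqr": q3 - q1,
        "variance": statistics.pvariance(scores),  # average squared distance from the mean
        "sd": statistics.pstdev(scores),           # square root of the variance
    }

scores = [2, 4, 4, 4, 5, 5, 7, 9]       # hypothetical scores
base = spread(scores)
with_outlier = spread(scores + [100])   # same scores plus one extreme value

for name in ("range", "iqr", "sd"):
    print(f"{name:>5}: {base[name]:.2f} -> {with_outlier[name]:.2f}")
```

Adding one extreme score inflates the range and the standard deviation far more than the interquartile range, which is the reason the interquartile range is recommended for skewed data.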
  • 10. Which measures of center and spread? [Chart: Favorite Color, with categories Red, Blue, Purple, Yellow, Pink, Orange, Green, Black, Grey, Tan]
  • 11. Which measures of center and spread? [Chart: Happiness]
  • 12. Which measures of center and spread? [Chart: Salary]
  • 13. Which measures of center and spread? [Chart: Student Year, with categories Freshman, Sophomore, Middler, Junior, Senior]
  • 14. Which measures of center and spread? [Chart: Performance]
  • 15. Which measures of center and spread? [Chart: Attitude Towards Computers]
  • 16. Example of a Boxplot: What is this? [Boxplot: IQ, axis ticks at 0, 50, 100, 150]
  • 17. Calculating Mean and Variance: $M = \frac{\sum X}{N}$; $SS = \sum (X - M)^2$; $SD^2 = \frac{SS}{N}$
  • 18. Z-scores • Measures that have been normalized to make comparisons easier: $Z = \frac{X - M}{SD}$ • Z-scores descriptives – Mean? – SD? – Variance?
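One way to explore the z-score descriptives is to compute them directly. This is a small sketch with invented scores; the function name `z_scores` is hypothetical, and the standard deviation is the population SD, matching SD² = SS/N.

```python
import statistics

def z_scores(xs):
    """Convert raw scores to z-scores: Z = (X - M) / SD."""
    m = statistics.mean(xs)
    sd = statistics.pstdev(xs)  # population SD, i.e. sqrt(SS / N)
    return [(x - m) / sd for x in xs]

raw = [2, 4, 4, 4, 5, 5, 7, 9]  # hypothetical raw scores
z = z_scores(raw)

print("mean of z:", round(statistics.mean(z), 6))
print("SD of z:  ", round(statistics.pstdev(z), 6))
```

Whatever the raw scale, the resulting z-scores have mean 0 and standard deviation (and variance) 1, which is what makes scores from different measures comparable.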
  • 19. Summary ■ Frequency distribution ■ Categorical data: Nominal and ordinal ■ Mode sometimes useful ■ Measure of central tendency ■ Scale data: Interval and ratio ■ Mean and median ■ Measure of dispersion ■ Scale data ■ Variance, standard deviation ■ The importance of presenting data graphically
  • 20. Overview – Using Survey Research 1. Survey administration 2. Constructing Questionnaires 3. Types of Questionnaire Items 4. Composite measures 5. Sampling
  • 21. Terminology Soup ■ Questionnaire = Self-Report Measure = Instrument ■ Survey Instrument vs. Lab Instrument ■ Composite Measure ~ Index ~ Scale
  • 22. Using Survey Research I. Survey administration
  • 23. Administering Your Questionnaire ■ MAIL SURVEY ■ A questionnaire is mailed directly to participants ■ Mail surveys are very convenient ■ Nonresponse bias is a serious problem resulting in an unrepresentative sample ■ INTERNET SURVEY ■ Survey distributed via e-mail or on a Web site ■ Large samples can be acquired quickly ■ Biased samples are possible because of uneven computer ownership across demographic groups ■ Check out surveygizmo.com
  • 24. Administering Your Questionnaire ■ TELEPHONE SURVEY ■ Participants are contacted by telephone and asked questions directly ■ Questions must be asked carefully ■ The plethora of “junk calls” may make participants suspicious ■ GROUP ADMINISTRATION ■ A questionnaire is distributed to a group of participants at once (e.g., a class) ■ Completed by participants at the same time ■ Ensuring anonymity may be a problem
  • 25. Administering Your Questionnaire ■ INTERVIEW ■ Participants are asked questions in a face-to-face structured or unstructured format ■ Characteristics or behavior of the interviewer may affect the participants’ responses
  • 26. Administering Your Questionnaire ■ In general ■ Personal techniques (interview, phone) provide higher response rates, but are more expensive and may suffer from bias problems.
  • 27. 2. Overview of Questionnaire Construction
  • 28. Parts of a Questionnaire ■ In any study you normally want to collect demographics – usually done through questionnaire ■ Single items ■ Composite items
  • 29. Questionnaire Construction ■ Items can be optional. Flow often depicted verbally and/or pictorially. 14. Have you ever participated in the Model Cities program? [ ] Yes [ ] No If Yes: When did you last attend a meeting? _________________
  • 30. Questionnaire Construction ■ Many heuristics for ordering questions, length of surveys, etc. For example: ■ Put interesting questions first ■ Demonstrate relevance to what you’ve told participants ■ Group questions into coherent groups
  • 31. Questionnaire Construction • Additional heuristics – Organize questions into a coherent, visually pleasing format – Do not present demographic items first – Place sensitive or objectionable items after less sensitive/objectionable items – Establish a logical navigational path
  • 32. 3. Types of Questionnaire Items • Restricted (close-ended) – Respondents are given a list of alternatives and check the desired alternative • Open-Ended – Respondents are asked to answer a question in their own words • Partially Open-Ended – An “Other” alternative is added to a restricted item, allowing the respondent to write in an alternative
  • 33. Types of Questionnaire Items • Rating Scale – Respondents circle a number on a scale (e.g., 0 to 10) or check a point on a line that best reflects their opinions – Two factors need to be considered • Number of points on the scale • How to label (“anchor”) the scale (e.g., endpoints only or each point)
  • 34. Types of Questionnaire Items – A Likert Scale is a scale used to assess attitudes • Respondents indicate the degree of agreement or disagreement to a series of statements • I am happy. Disagree 1 2 3 4 5 6 7 Agree – A Semantic Differential Scale allows participants to provide a rating within a bipolar space • How are you feeling right now? Sad 1 2 3 4 5 6 7 Happy
  • 35. Writing Good Items ■ Use simple words ■ Avoid vague questions ■ Don’t ask for too much information in one question ■ Avoid “check all that apply” items ■ Avoid questions that ask for more than one thing ■ Soften impact of sensitive questions ■ Avoid negative statements (usually)
  • 36. Two Most Important Rules in Designing Questionnaires? ■ Use an existing validated questionnaire if you can find one. ■ If you must develop your own questionnaire, pilot test it!
  • 37. Acquiring A Survey Sample ■ You should obtain a representative sample ■ The sample closely matches the characteristics of the population ■ A biased sample occurs when your sample characteristics don’t match population characteristics ■ Biased samples often produce misleading or inaccurate results ■ Usually stem from inadequate sampling procedures
  • 38. Sampling ■ Sometimes you really can measure the entire population (e.g., workgroup, company), but this is rare… ■ “Convenience sample” ■ Cases are selected only on the basis of feasibility or ease of data collection.
  • 39. Sampling Techniques ■ Simple Random Sampling ■ Randomly select a sample from the population ■ Random digit dialing is a variant used with telephone surveys ■ Reduces systematic bias, but does not guarantee a representative sample • Some segments of the population may be over- or underrepresented
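Simple random sampling can be sketched with Python's `random.sample`, which draws without replacement so that every member has the same chance of selection. The population list and the sample size of 50 are placeholders for illustration.

```python
import random

rng = random.Random(42)  # fixed seed so the draw is repeatable

# Hypothetical sampling frame of 1000 people (assumption).
population = [f"person_{i}" for i in range(1000)]

# Draw 50 distinct members, each equally likely to be chosen.
sample = rng.sample(population, k=50)

print(sample[:5])
```

Nothing in this procedure forces the sample to mirror the population's subgroups, which is why some segments can still end up over- or underrepresented.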
  • 40. Sampling Techniques ■ Systematic Sampling ■ Every kth element is sampled after a randomly selected starting point • Sample every fifth name in the telephone book after a random page and starting point selected, for example ■ Empirically equivalent to random sampling (usually) • May still result in a non-representative sample ■ Easier than random sampling
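A short sketch of systematic sampling: pick a random starting point among the first k entries, then take every kth entry after it. The function name `systematic_sample` and the directory of 100 names are hypothetical.

```python
import random

def systematic_sample(population, k, rng):
    """Take every k-th element after a random start in the first k positions."""
    start = rng.randrange(k)
    return population[start::k]

rng = random.Random(0)
directory = [f"name_{i:03d}" for i in range(100)]  # stand-in for a telephone book

sample = systematic_sample(directory, k=5, rng=rng)
print(len(sample), sample[:3])
```

If the list has a periodic pattern that lines up with k, the sample can be non-representative, which is the caveat noted in the slide.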
  • 41. Sampling Techniques ■ Stratified Sampling ■ Used to obtain a representative sample ■ Population is divided into (demographic) strata • Focus also on variables that are related to other variables of interest in your study (e.g., relationship between age and computer literacy) ■ A random sample of a fixed size is drawn from each stratum ■ May still lead to over- or underrepresentation of certain segments of the population ■ Proportionate Sampling ■ Same as stratified sampling except that the proportions of different groups in the population are reflected in the samples from the strata
  • 42. Sampling Example: ■ You want to conduct a survey of job satisfaction of all employees but can only afford to contact 100 of them. ■ Personnel breakdown: ■ 50% Engineering ■ 25% Sales & Marketing ■ 15% Admin ■ 10% Management ■ Examples of ■ Stratified sampling? ■ Proportionate sampling?
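One possible reading of the example, sketched in Python. It assumes a hypothetical company of 1000 employees with the personnel breakdown given above, takes "stratified" to mean the same fixed number (25) from each department, and takes "proportionate" to mean department sample sizes that follow the 50/25/15/10 split. The function names are illustrative.

```python
import random

rng = random.Random(0)

# Hypothetical head counts matching the 50/25/15/10 percent breakdown.
headcount = {"Engineering": 500, "Sales & Marketing": 250,
             "Admin": 150, "Management": 100}
strata = {dept: [f"{dept} #{i}" for i in range(n)]
          for dept, n in headcount.items()}

def stratified_sample(strata, per_stratum, rng):
    """Draw the same fixed number of people from every stratum."""
    return {dept: rng.sample(members, per_stratum)
            for dept, members in strata.items()}

def proportionate_sample(strata, total, rng):
    """Draw from each stratum in proportion to its share of the population.
    Rounding may make the sizes miss `total` by one or two in general."""
    population_size = sum(len(members) for members in strata.values())
    return {dept: rng.sample(members, round(total * len(members) / population_size))
            for dept, members in strata.items()}

stratified = stratified_sample(strata, per_stratum=25, rng=rng)
proportionate = proportionate_sample(strata, total=100, rng=rng)

print({dept: len(s) for dept, s in stratified.items()})
print({dept: len(s) for dept, s in proportionate.items()})
```

Both designs contact 100 employees. The fixed-size version gives equal precision per department but overrepresents Management relative to its share, while the proportionate version mirrors the company's makeup.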
  • 43. Sampling Techniques ■ Cluster Sampling ■ Used when populations are very large ■ The unit of sampling is a group rather than individuals ■ Groups are randomly sampled from the population (e.g., ten universities selected randomly, then students are sampled at those schools)
  • 44. Sampling Techniques ■ Multistage Sampling ■ Variant of cluster sampling ■ First, identify large clusters (e.g., all US universities) and randomly sample from that population ■ Second, sample individuals from randomly selected clusters ■ Can be used along with stratified sampling to ensure a representative sample (e.g. small vs. large, liberal arts college vs. research university)
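The two stages can be sketched as follows, assuming a made-up frame of 50 universities with 200 students each. Stage one samples clusters (universities); stage two samples individuals within the chosen clusters. All names and sizes are placeholders.

```python
import random

rng = random.Random(1)

# Hypothetical frame: 50 universities, 200 students each (assumption).
universities = {
    f"univ_{u:02d}": [f"univ_{u:02d}_student_{s:03d}" for s in range(200)]
    for u in range(50)
}

# Stage 1: randomly select ten clusters (universities).
chosen = rng.sample(sorted(universities), k=10)

# Stage 2: randomly select 20 students within each chosen university.
sample = [student
          for univ in chosen
          for student in rng.sample(universities[univ], k=20)]

print(len(chosen), len(sample))
```

Only the ten selected universities ever need a full student list, which is what makes this approach practical for very large populations.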
  • 45. Sampling and Statistics ■ If you select a random sample, the mean of that sample will (in general) not be exactly the same as the population mean. However, it represents an estimate of the population mean ■ If you take two samples, one of males and one of females, and compute the two sample means (let’s say, of hourly pay), the difference between the two sample means is an estimate of the difference between the population means. ■ This is the basis of inferential statistics based on samples
  • 46. Sampling and Statistics (cont.) ■ The larger the sample, the better the estimate (the more likely it is to be close to the population mean) ■ The variance/SD of the sample means is related to the variance/SD of the population. However, it is likely to be LESS (!) than the population variance.
  • 47. June 9, 2008 Inference with a Single Observation • Each observation $X_i$ in a random sample is a representative of unobserved variables in population • How different would this observation be if we took a different random sample? [Diagram: Population (parameter: $\mu$), sampling, observation $X_i$, inference back to the population]
  • 48. Normal Distribution • The normal distribution is a model for our overall population • Can calculate the probability of getting observations greater than or less than any value • Usually don’t have a single observation, but instead the mean of a set of observations
  • 49. Inference with Sample Mean • Sample mean is our estimate of population mean • How much would the sample mean change if we took a different sample? • Key to this question: Sampling Distribution of $\bar{x}$ [Diagram: Population (parameter: $\mu$), sampling, sample (statistic: $\bar{x}$), inference/estimation back to the population]
  • 50. Sampling Distribution of Sample Mean • Distribution of values taken by statistic in all possible samples of size n from the same population • Model assumption: our observations $x_i$ are sampled from a population with mean $\mu$ and variance $\sigma^2$ [Diagram: population with unknown parameter $\mu$; samples 1 through 8 of size n, and so on, each yielding its own sample mean $\bar{x}$. Distribution of these values?]
  • 51. Mean of Sample Mean • First, we examine the center of the sampling distribution of the sample mean. • Center of the sampling distribution of the sample mean is the unknown population mean: $\mathrm{mean}(\bar{X}) = \mu$ • Over repeated samples, the sample mean will, on average, be equal to the population mean – no guarantees for any one sample!
  • 52. Variance of Sample Mean • Next, we examine the spread of the sampling distribution of the sample mean • The variance of the sampling distribution of the sample mean is $\mathrm{variance}(\bar{X}) = \sigma^2/n$ • As sample size increases, variance of the sample mean decreases! • Averaging over many observations is more accurate than just looking at one or two observations
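The claim that the variance of the sample mean equals the population variance divided by n can be checked with a small simulation. This sketch uses a synthetic, right-skewed population generated on the spot (it is not data from the slides), samples with replacement, and compares the observed variance of 3000 sample means with the population variance divided by n. The helper name and sizes are illustrative.

```python
import random
import statistics

rng = random.Random(2012)  # fixed seed so the run is repeatable

# Synthetic right-skewed population (assumption: illustration only).
population = [rng.expovariate(1.0) for _ in range(10_000)]
sigma2 = statistics.pvariance(population)

def variance_of_sample_means(n, n_samples=3000):
    """Draw n_samples samples of size n (with replacement) and
    return the variance of their means."""
    means = [statistics.fmean(rng.choices(population, k=n))
             for _ in range(n_samples)]
    return statistics.pvariance(means)

results = {n: variance_of_sample_means(n) for n in (1, 10, 100)}
for n, observed in results.items():
    print(f"n={n:>3}  observed={observed:.4f}  sigma^2/n={sigma2 / n:.4f}")
```

The observed values should track the theoretical ones closely and shrink by roughly a factor of ten each time n grows tenfold.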
  • 53. Comparing the sampling distribution of the sample mean when n = 1 vs. n = 10
  • 54. Law of Large Numbers • Remember the Law of Large Numbers: • If one draws independent samples from a population with mean $\mu$, then as the sample size (n) increases, the sample mean $\bar{x}$ gets closer and closer to the population mean $\mu$ • This is easier to see now since we know that $\mathrm{mean}(\bar{x}) = \mu$ and $\mathrm{variance}(\bar{x}) = \sigma^2/n \to 0$ as n gets large
  • 55. Example • Population: seasonal home-run totals for 7032 baseball players from 1901 to 1996 • Take different samples from this population and compare the sample mean we get each time • In real life, we can’t do this because we don’t usually have the entire population!
      Sample Size                     Mean    Variance
      100 samples of size n = 1       3.69    46.8
      100 samples of size n = 10      4.43    4.43
      100 samples of size n = 100     4.42    0.43
      100 samples of size n = 1000    4.42    0.06
      Population parameter: $\mu$ = 4.42
  • 56. Distribution of Sample Mean • We now know the center and spread of the sampling distribution for the sample mean. • What about the shape of the distribution? • If our data $x_1, x_2, \ldots, x_n$ follow a Normal distribution, then the sample mean $\bar{x}$ will also follow a Normal distribution!
  • 57. Example • Mortality in US cities (deaths/100,000 people) • This variable seems to approximately follow a Normal distribution, so the sample mean will also approximately follow a Normal distribution
  • 58. Central Limit Theorem • What if the original data doesn’t follow a Normal distribution? • HR/Season for sample of baseball players • If the sample is large enough, it doesn’t matter!
  • 59. Central Limit Theorem • If the sample size is large enough, then the sample mean $\bar{x}$ has an approximately Normal distribution • This is true no matter what the shape of the distribution of the original data!
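A rough way to see the Central Limit Theorem at work is to track the skewness of the sample means as n grows. The sketch below draws observations from a strongly right-skewed exponential distribution (a stand-in chosen for illustration, not the home-run data) and uses skewness near zero as a simple proxy for "approximately Normal". The helper names are hypothetical.

```python
import random
import statistics

rng = random.Random(7)  # fixed seed so the run is repeatable

def skewness(xs):
    """Average cubed z-score; roughly 0 for a symmetric distribution."""
    m = statistics.fmean(xs)
    sd = statistics.pstdev(xs)
    return statistics.fmean([((x - m) / sd) ** 3 for x in xs])

def sample_means(n, n_samples=3000):
    """Means of n_samples samples of size n from a right-skewed distribution."""
    return [statistics.fmean([rng.expovariate(1.0) for _ in range(n)])
            for _ in range(n_samples)]

skew_by_n = {n: skewness(sample_means(n)) for n in (1, 10, 100)}
for n, skew in skew_by_n.items():
    print(f"n={n:>3}  skewness of sample means = {skew:.2f}")
```

With n = 1 the sample means are just the raw, skewed observations; as n increases their distribution becomes more symmetric, in line with the theorem.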
  • 60. Example: Home Runs per Season • Take many different samples from the seasonal HR totals for a population of 7032 players • Calculate sample mean for each sample [Histograms of the sample means for n = 1, n = 10, n = 100]