SlideShare a Scribd company logo
1
UNIVERSITY OF PRETORIA
Session Four:
Techniques for Data Gathering and
Analysis
Professor David Walwyn
Data-Gathering Techniques
• Perusal (Library research)
– data mining (Existing published data, databases,
statistics, records, etc.)
• Measurement (Experimental and field research)
• Observation (Field research)
– case studies
• Questioning (Field research)
– interviews, surveys, questionnaires.
2
Desirable Attributes of Techniques
• Epistemic
– Reliability, validity, objectivity
• Constituent
– Feasibility, sensitivity
• Teleological
– Purposeful, coherent, Ethics and Integrity
• Axiological
– Ethics, integrity, non-maleficence,
beneficence
3
Data Gathering Spectrum
• The closer the research is to the human subject, the
more likely that the research will be subjective
• QUANT/positivists are concerned with the
description of phenomena and causality
– a priori research design
– internet survey
• QUAL/phenomenologists are interested in the
participants experience of these phenomena
– emergent design is possible (adaptive process in response
to the progress and results of the research)
– interview, focus groups and direct observation
4
Nature of Measurement
• Measurement involves the construction of variables
• Very often, what we are attempting to measure
cannot be measured directly
– as a consequence we have to use indicators, proxy or
surrogate variables (same meaning)
• Different levels of measurement
– nominal, ordinal, interval, absolute, etc.
– important for statistical analysis
5
Numerical Representation
• Representation of variables in coded form
• Nominal Scale
– Numbers used for representation without implication of
importance/order/scale (1 = female; 2 = male)
• Ordinal Scale
– Numbers are used for classification and does convey order
but not exact values (low = 1, medium = 2, high = 3)
• Interval Scale
– Numbers are equidistant but do not necessarily have a true
zero point (deg C)
• Ratio Scale
– Numbers are equidistant and have a true zero point (deg K)
value).
6
Summary of Numerical
Representations
7
Ratio
Interval
Even
Proportions
Property
Ordinal
Equal
Differences
Equal
Differences
Nominal Hierarchy Hierarchy Hierarchy
Distinction Distinction Distinction Distinction
Warm or Cold
Cold < Mild < Warm
< Hot
Centigrade Kelvin
Descriptor
Measurement 2
• Variables must first be defined conceptually (the concept
the variable is attempting to capture) and operationally
(how the variable will be measured in practise)
• All measuring and data collection procedures must be
based on systematic observation replicable by
independent observers (unless phenomenological)
Variable Conceptual definition Operational definition
Industrial
innovativeness
Introduction of new and
improved products,
processes and services to
the market
Number of
innovations/annum
Percentage of sales from
innovated products or
services
Questionnaire Design
• The first step in designing a questionnaire is to create a
conceptual model
– your framework of analysis which has been derived from your
synthesis of the literature and your understanding of the
important variables/the relationships between these
• The second step is to produce the questionnaire
– the introduction, the statement of informed consent, the
questions and responses, and the ‘look and feel’ format
– you could use SurveyMonkey (www.surveymonkey.com), Google
Sheets or http://kwiksurveys.com/ or
http://freeonlinesurveys.com/)
• The third step is to pre-test the questionnaire.
Process for Questionnaire Design
Research Problem
Research Questions
and Hypotheses
Research Variables
Questionnaire
questions
Welman’s Considerations
• Maintain neutrality
• Choose judiciously between open-ended and closed-
ended (multiple choice variety) questions
– closed ended are easier to analyse and have generally
higher repeatability
• Be brief and focused
• Take the respondent’s literacy level into
consideration
– if necessary define the key terms in the introduction to the
questionnaire
11
Welman 2
• Use a logical and justified sequence
• Use branching to ensure that respondents do not
have to read through irrelevant material
• Layout is important; strive for a clear and
aesthetically pleasing layout
• Be careful not to offend
• PRESENT A WELL-CRAFTED COVER LETTER
12
Interviews
• Structured Interviews
– Standardised fixed-format questionnaire
– Can be conducted by trained interviewers
• Semi-Structured Interviews
• Unstructured Interviews
– Collection of qualitative information
• Focus Group Interviews
– Interview of a small (3-12 person) group
– Open-ended questions
• Preparing for the interview
– analyse the research problem and questions
– understand what information must be obtained
– identify those who can provide the information
13
Types of Surveys
• Postal mail surveys
• Telephone surveys
• Electronic surveys
– E-mail surveys
– Internet surveys
• Direct personal surveys
Improving the Confidence Level of Low
Response Rate Surveys
• The problem with a low response rate (<30%) is that the
survey becomes a self-selected volunteer (non-probabilistic)
survey
• Non-response survey
– Select a random sample of non-respondents
– Measure only a few selected key variables
– Ensure high response rate (>70%)
– Comparison of results and statistical testing can confirm validity of
main survey.
– alternatively compare the immediate and the late responses; should
be the same in terms of content if there is no bias
Coding the Data
• Data analysis is most commonly done with statistical
tests
• Statistical testing can only be done with numerical
data
• Textual data has to be converted to numerical data
to enable statistical testing
• This is done by means of a process called Data
Coding
Steps in Data Management
• Prepare the data collection instrument and collect the
data
• Prepare the data codebook
• Prepare the data matrix worksheets
• Prepare instructions for data entry and data analysis
• For each variable provide the following information in
the codebook:
– Variable name (e.g. “PARTICIP”)
– Variable definition (e.g., “Participant name”)
– Position (Column number in worksheet, e.g., “1” )
– Measurement Level (e.g., “Nominal”)
– Code for missing values (e.g., “-1”)
– Coding (e.g., “Value – Label” Table)
The Data Codebook
Steps in Data Entry
• Prepare the data matrix worksheets
• Provide one column for each variable and one row for
each record
• Label the columns and rows
• Enter data using instructions for data entry and code book
Class Exercise
• Listen to the podcast from This American Life
– (18.00 to 28.00)
• Analyse the interview based on this framework
– was informed consent considered?
– was the interview structured or unstructured?
– did the interviewer let the informant lead?
– was the researcher ethical?
– did the researcher explain the purpose of the interview?
– other comments?
• Another interview:
– https://www.youtube.com/watch?v=6PhcglOGFg8
21
Data Analysis and Hypothesis Testing
• Qualitative data analysis
• Quantitative data analysis
– Overview of statistical analysis
– Descriptive statistical analysis
– Inferential statistical analysis
• includes hypothesis testing
– Statistical modelling
• Parametric vs. Non-Parametric
Qualitative Data Analysis
• Qualitative data analysis is defined as “the non-
numerical examination and interpretation of
observations, for the purpose of discovering
underlying meanings and patterns of relationships”.
– more common in humanities and social sciences
– based on language and observation
• Some techniques
– data ‘coding’ and display (ATLAS.ti)
• network display
– theme identification
• discourse and content analysis
22
Defining the Codes (CAQDAS)
• Creating the codes and sub-codes, including the
hierarchy and networks, is the difficult part (coding
itself is quite mechanical)
• Must start with your conceptual model and the
questionnaire
– example of behavioural change
• Categories of codes can be
– organisational
– substantive or descriptive
– theoretical or abstract
23
Determinants of Behavioural Change
24
Stage 1
Unaware of Issue
Stage 5
Decided to Act
Stage 6
Acting
Stage 4
Decided Not to Act
Stage 2
Aware of, but Unengaged, by Issue
Stage 3
Undecided about Acting
Stage 7
Maintenance
ATLAS.ti Codes
• 1st step ; creation of a ‘hermeneutic unit’
• 2nd step; importing text files
• 3rd step; defining the codes
• 4th step; coding the data
• 5th step; collating the coded material
• 6th step; constructing the results section based on
the actual material
• 7th step; discussion and interpretation of the results
25
ATLAS.ti Network
26
Adoption of
Specific
Measures
Risk and
Vulnerability
Main Disease
Threats
Introduce New
Measures
Barriers to
Response
Camp Facilities
Mitigation and
Response
Policy Fit
Challenge to
Specific
Measures
Example of Coding
27
P 7: 2.1.docx - 7:10 [It’s very good, we’ve got pre-..] (107:107)
(Super)
Codes: [Camp Facilities]
It’s very good, we’ve got pre-fab accommodation. Concrete floors,
porcelain toilets and hand basins, running hot and cold water, glass-
enclosed showers, air-conditioning everywhere, windows with fly
netting on, lockable doors, just like a little bungalow at a university
hostel. Double bed, file cabinet, lamp.
Quantitative Data Analysis
• Quantitative data analysis is defined as the
“numerical representation and manipulation of
observations for the purpose of describing and
explaining the phenomena that those observations
reflect”
28
Content Analysis of Inaugural
Address
• See Times analysis of Trump’s speech
– what type of analysis has been used?
• secondary or primary?
• content or syntax?
• Is the speech plagiarism?
• Do you agree that “protectionism will make America
great again?”
29
Initial or Exploratory Data Analysis
• Code the data for each variable (previous lecture)
• Enter the data into a spreadsheet
• Define the measurement scale for each variable
(normalised, adjusted, etc.)
• Check the data and correct errors
• ‘Illustrate’ the data (descriptive statistics) if possible
• Produce range of graphs to check for correlations
30
Descriptive Statistical Analysis
• Objectives: observational decision making
• Outcome: condenses large volumes of data into a few
summary measures
– such as mean, variance, frequency
• Interpreted by an inspection of the summary measures
– leads to decisions by observations
– guide to inferential statistics
31
Inferential Statistics
• Objective: proof of causality
• For a causal relationship to exist between two
variables, four conditions must exist:
– time order: the cause must exist before the effect
– co-variation: a change in the cause produces a change in
the effect
– rationale: reasonable explanation of why they are related
– non-spuriousness: no other (rival) cause for the effect
found
• Causality can never be proven absolutely, we can
only show to what degree it is probable!
– e.g. global warming debate and IPCC
32
Correlation and Causation
33
Statistical Modelling
• Objective: extrapolate and predict
• Method: builds models showing relationships
between variables
• Many different types of models
– continuous
– discreet
– stochastic
34
Statistics and Rules
• Rutherford: “if you need statistics, you did the
wrong experiment”
• “Always consult a statistician; but know enough
about statistics to view the advice critically”
• My objective …
– give an overview with reference to detailed information
– more details available from “Fundamentals of Biostatistics”
(Rosner) and other voluminous texts
35
Summary Table of Statistical Tests
(Plonskey, 2001)
Type of
Variables
Sample Characteristics (Bivariate)
Correlation
(Multivariate)
1 Sample
Population
2 Samples K Samples
Independent Dependent Independent Dependent
Categorical or
Nominal
chi2 or
binomial
chi2 Macnarmar’s
chi2 chi2 Cochran’s Q
Rank or
Ordinal
Mann Whitney
U
Wilcoxin
Matched Pairs
Signed Ranks
Kruskal Wallis
H
Friendman’s
ANOVA
Spearman’s
rho
Interval &
Ratio
Z test or t test
t test between
groups
t test within
groups
1 Way ANOVA
between
groups
1 Way ANOVA Pearson’s r
Factorial (2 Way) ANOVA
Numerical Representation 2
• Response scales (types of ordinal scales):
– Rating scales
– Attitudinal scales
– Ranking scales
– Likert scale (!)
• Strongly disagree
• Disagree somewhat
• Neutral
• Agree somewhat
• Strongly agree
37
Types of Data Analysis
• Univariate
– an expression, equation, function or polynomial of only one
variable (e.g. age distribution)
– mostly descriptive (frequency, standard deviation, median)
• Multivariate
– objects of any of these types but involving more than one
variable is called multivariate (e.g. x/y/z)
– mostly explanatory
• Very familiar to natural scientists/engineers!
38
Wikipedia, 2013
Stats Software
• For example, Moonstats
• Load Health.Mon and Study.Mon
– Variables
• Sex, matric mark, hours of study per week and attitude
– Univariate analysis
• Frequency distribution
• Descriptives (mean, std dev, min/max, median)
– Bivariate analysis
• Correlation (Pearson and Spearman)
• Chi Squared
• T Test
– Are you less likely to fail if you have a positive attitude?
39
Uses for Inferential Statistics
• Determining differences between experimental and
control groups in experimental research
• In descriptive research when comparisons are made
between different groups
• Enable the researcher to evaluate the effects of an
independent variable on a dependent variable
• Differences between a sample statistic and a
population parameter because the sample is not
perfectly representative of the population
Chi Square and t Tests
• Chi Square is a nonparametric test used with
nominally scaled data which are common with
survey research
– statistic is used when the researcher is interested in the
number of responses, objects, or people that fall in two or
more categories
• Similarly t test is used to determine if two sets of
data are significantly different from each other
Summary
• Qualitative vs quantitative
• In quantitative analysis, it is:
– important to categorise variables
– undertake an initial exploration using descriptive statistics
– use the appropriate technique for inferential statistics
– consult a statistician if possible, especially for complex
datasets containing variables of different types
42
Class Work:
• Listen to the following YouTube clip on QUAL
methods:
– https://www.youtube.com/watch?v=opp5tH4uD-w
• Answer the following questions:
– what does it mean “code structure evolves inductively”?
– why is coding an iterative process?
43

More Related Content

Similar to Lecture_4_Data_Gathering_and_Analysis.pdf

Finalreview
FinalreviewFinalreview
Finalreview
Muhammad Ameen
 
Final spss hands on training (descriptive analysis) may 24th 2013
Final spss  hands on training (descriptive analysis) may 24th 2013Final spss  hands on training (descriptive analysis) may 24th 2013
Final spss hands on training (descriptive analysis) may 24th 2013
Tin Myo Han
 
Advanced Research Methodology Session-4.pptx
Advanced Research Methodology Session-4.pptxAdvanced Research Methodology Session-4.pptx
Advanced Research Methodology Session-4.pptx
HarariMki1
 
Systematic Literature Review
Systematic Literature ReviewSystematic Literature Review
Systematic Literature Review
Tiago Silva da Silva
 
zero.pptx
zero.pptxzero.pptx
steps in geographical research.pptx
steps in geographical research.pptxsteps in geographical research.pptx
steps in geographical research.pptx
Asim Pt
 
Introduction research(1).pptx
Introduction research(1).pptxIntroduction research(1).pptx
Introduction research(1).pptx
AlmaAtakeluargaAA
 
Ch04 business research process
Ch04 business research processCh04 business research process
Ch04 business research process
Syed Osama Rizvi
 
e3_chapter__5_evaluation_technics_HCeVpPLCvE.ppt
e3_chapter__5_evaluation_technics_HCeVpPLCvE.ppte3_chapter__5_evaluation_technics_HCeVpPLCvE.ppt
e3_chapter__5_evaluation_technics_HCeVpPLCvE.ppt
appstore15
 
Research methodology
Research methodologyResearch methodology
Research methodology
Nyirenda Junior
 
Research methods
Research methodsResearch methods
Research methods
agribusinessclass
 
Research methodology
Research methodologyResearch methodology
Research methodology
Mohit Chauhan
 
Research methodology for behavioral research
Research methodology for behavioral researchResearch methodology for behavioral research
Research methodology for behavioral research
rip1971
 
Qualitative data analysis_ Software_ Quality issues _ 2023.pptx
Qualitative data analysis_ Software_ Quality issues _ 2023.pptxQualitative data analysis_ Software_ Quality issues _ 2023.pptx
Qualitative data analysis_ Software_ Quality issues _ 2023.pptx
TayeDosane
 
Lecture_1_-_Research_Methods_-_Introduction.pdf
Lecture_1_-_Research_Methods_-_Introduction.pdfLecture_1_-_Research_Methods_-_Introduction.pdf
Lecture_1_-_Research_Methods_-_Introduction.pdf
Jeffreys Togelang
 
Marketing research
Marketing researchMarketing research
Marketing research
Dr. C. V. Raman University
 
Research design
Research designResearch design
G2.suntasig.guallichico.maicol.alexander.english.project.design.docx
G2.suntasig.guallichico.maicol.alexander.english.project.design.docxG2.suntasig.guallichico.maicol.alexander.english.project.design.docx
G2.suntasig.guallichico.maicol.alexander.english.project.design.docx
Maicol Suntasig
 
1 complete research
1 complete research1 complete research
1 complete research
Hemant Kumar
 
Evaluation of Health IT Implementation (February 17, 2021)
Evaluation of Health IT Implementation (February 17, 2021)Evaluation of Health IT Implementation (February 17, 2021)
Evaluation of Health IT Implementation (February 17, 2021)
Nawanan Theera-Ampornpunt
 

Similar to Lecture_4_Data_Gathering_and_Analysis.pdf (20)

Finalreview
FinalreviewFinalreview
Finalreview
 
Final spss hands on training (descriptive analysis) may 24th 2013
Final spss  hands on training (descriptive analysis) may 24th 2013Final spss  hands on training (descriptive analysis) may 24th 2013
Final spss hands on training (descriptive analysis) may 24th 2013
 
Advanced Research Methodology Session-4.pptx
Advanced Research Methodology Session-4.pptxAdvanced Research Methodology Session-4.pptx
Advanced Research Methodology Session-4.pptx
 
Systematic Literature Review
Systematic Literature ReviewSystematic Literature Review
Systematic Literature Review
 
zero.pptx
zero.pptxzero.pptx
zero.pptx
 
steps in geographical research.pptx
steps in geographical research.pptxsteps in geographical research.pptx
steps in geographical research.pptx
 
Introduction research(1).pptx
Introduction research(1).pptxIntroduction research(1).pptx
Introduction research(1).pptx
 
Ch04 business research process
Ch04 business research processCh04 business research process
Ch04 business research process
 
e3_chapter__5_evaluation_technics_HCeVpPLCvE.ppt
e3_chapter__5_evaluation_technics_HCeVpPLCvE.ppte3_chapter__5_evaluation_technics_HCeVpPLCvE.ppt
e3_chapter__5_evaluation_technics_HCeVpPLCvE.ppt
 
Research methodology
Research methodologyResearch methodology
Research methodology
 
Research methods
Research methodsResearch methods
Research methods
 
Research methodology
Research methodologyResearch methodology
Research methodology
 
Research methodology for behavioral research
Research methodology for behavioral researchResearch methodology for behavioral research
Research methodology for behavioral research
 
Qualitative data analysis_ Software_ Quality issues _ 2023.pptx
Qualitative data analysis_ Software_ Quality issues _ 2023.pptxQualitative data analysis_ Software_ Quality issues _ 2023.pptx
Qualitative data analysis_ Software_ Quality issues _ 2023.pptx
 
Lecture_1_-_Research_Methods_-_Introduction.pdf
Lecture_1_-_Research_Methods_-_Introduction.pdfLecture_1_-_Research_Methods_-_Introduction.pdf
Lecture_1_-_Research_Methods_-_Introduction.pdf
 
Marketing research
Marketing researchMarketing research
Marketing research
 
Research design
Research designResearch design
Research design
 
G2.suntasig.guallichico.maicol.alexander.english.project.design.docx
G2.suntasig.guallichico.maicol.alexander.english.project.design.docxG2.suntasig.guallichico.maicol.alexander.english.project.design.docx
G2.suntasig.guallichico.maicol.alexander.english.project.design.docx
 
1 complete research
1 complete research1 complete research
1 complete research
 
Evaluation of Health IT Implementation (February 17, 2021)
Evaluation of Health IT Implementation (February 17, 2021)Evaluation of Health IT Implementation (February 17, 2021)
Evaluation of Health IT Implementation (February 17, 2021)
 

More from AbdullahOmar64

Digital communcation for engineerings.pdf
Digital communcation for engineerings.pdfDigital communcation for engineerings.pdf
Digital communcation for engineerings.pdf
AbdullahOmar64
 
digital anlage c converter for digital .ppt
digital anlage c converter for digital .pptdigital anlage c converter for digital .ppt
digital anlage c converter for digital .ppt
AbdullahOmar64
 
LineCoding_ADCS.ppt
LineCoding_ADCS.pptLineCoding_ADCS.ppt
LineCoding_ADCS.ppt
AbdullahOmar64
 
companding.ppt
companding.pptcompanding.ppt
companding.ppt
AbdullahOmar64
 
The_Role_of_Raised_Cosine.pdf
The_Role_of_Raised_Cosine.pdfThe_Role_of_Raised_Cosine.pdf
The_Role_of_Raised_Cosine.pdf
AbdullahOmar64
 
09-SpreadSpectrum.ppt
09-SpreadSpectrum.ppt09-SpreadSpectrum.ppt
09-SpreadSpectrum.ppt
AbdullahOmar64
 
3-Matched-Filter.ppt
3-Matched-Filter.ppt3-Matched-Filter.ppt
3-Matched-Filter.ppt
AbdullahOmar64
 
mod.pptx
mod.pptxmod.pptx
mod.pptx
AbdullahOmar64
 
waveforms.ppt
waveforms.pptwaveforms.ppt
waveforms.ppt
AbdullahOmar64
 
aliasing.ppt
aliasing.pptaliasing.ppt
aliasing.ppt
AbdullahOmar64
 
INI_800_Lecture_1_Introduction.pdf
INI_800_Lecture_1_Introduction.pdfINI_800_Lecture_1_Introduction.pdf
INI_800_Lecture_1_Introduction.pdf
AbdullahOmar64
 
books-library.online-02011910Vr3Z0.pdf
books-library.online-02011910Vr3Z0.pdfbooks-library.online-02011910Vr3Z0.pdf
books-library.online-02011910Vr3Z0.pdf
AbdullahOmar64
 
نظم-العد.pdf
نظم-العد.pdfنظم-العد.pdf
نظم-العد.pdf
AbdullahOmar64
 
convert number.pdf
convert number.pdfconvert number.pdf
convert number.pdf
AbdullahOmar64
 
Guide latex.
Guide latex.Guide latex.
Guide latex.
AbdullahOmar64
 

More from AbdullahOmar64 (15)

Digital communcation for engineerings.pdf
Digital communcation for engineerings.pdfDigital communcation for engineerings.pdf
Digital communcation for engineerings.pdf
 
digital anlage c converter for digital .ppt
digital anlage c converter for digital .pptdigital anlage c converter for digital .ppt
digital anlage c converter for digital .ppt
 
LineCoding_ADCS.ppt
LineCoding_ADCS.pptLineCoding_ADCS.ppt
LineCoding_ADCS.ppt
 
companding.ppt
companding.pptcompanding.ppt
companding.ppt
 
The_Role_of_Raised_Cosine.pdf
The_Role_of_Raised_Cosine.pdfThe_Role_of_Raised_Cosine.pdf
The_Role_of_Raised_Cosine.pdf
 
09-SpreadSpectrum.ppt
09-SpreadSpectrum.ppt09-SpreadSpectrum.ppt
09-SpreadSpectrum.ppt
 
3-Matched-Filter.ppt
3-Matched-Filter.ppt3-Matched-Filter.ppt
3-Matched-Filter.ppt
 
mod.pptx
mod.pptxmod.pptx
mod.pptx
 
waveforms.ppt
waveforms.pptwaveforms.ppt
waveforms.ppt
 
aliasing.ppt
aliasing.pptaliasing.ppt
aliasing.ppt
 
INI_800_Lecture_1_Introduction.pdf
INI_800_Lecture_1_Introduction.pdfINI_800_Lecture_1_Introduction.pdf
INI_800_Lecture_1_Introduction.pdf
 
books-library.online-02011910Vr3Z0.pdf
books-library.online-02011910Vr3Z0.pdfbooks-library.online-02011910Vr3Z0.pdf
books-library.online-02011910Vr3Z0.pdf
 
نظم-العد.pdf
نظم-العد.pdfنظم-العد.pdf
نظم-العد.pdf
 
convert number.pdf
convert number.pdfconvert number.pdf
convert number.pdf
 
Guide latex.
Guide latex.Guide latex.
Guide latex.
 

Recently uploaded

carpentry-11-module-1.docx 1 identifying tools
carpentry-11-module-1.docx 1 identifying toolscarpentry-11-module-1.docx 1 identifying tools
carpentry-11-module-1.docx 1 identifying tools
ChristopherAltizen2
 
Santa Barbara City College degree offer diploma Transcript
Santa Barbara City College degree offer diploma TranscriptSanta Barbara City College degree offer diploma Transcript
Santa Barbara City College degree offer diploma Transcript
owbetho
 
杨洋李一桐做爱视频流出【网芷:ht28.co】国产国产午夜精华>>>[网趾:ht28.co】]<<<
杨洋李一桐做爱视频流出【网芷:ht28.co】国产国产午夜精华>>>[网趾:ht28.co】]<<<杨洋李一桐做爱视频流出【网芷:ht28.co】国产国产午夜精华>>>[网趾:ht28.co】]<<<
杨洋李一桐做爱视频流出【网芷:ht28.co】国产国产午夜精华>>>[网趾:ht28.co】]<<<
amzhoxvzidbke
 
Tutorial on MySQl and its basic concepts
Tutorial on MySQl and its basic conceptsTutorial on MySQl and its basic concepts
Tutorial on MySQl and its basic concepts
anishacotta2
 
Technical Seminar of Mca computer vision .ppt
Technical Seminar of Mca computer vision .pptTechnical Seminar of Mca computer vision .ppt
Technical Seminar of Mca computer vision .ppt
AnkitaVerma776806
 
Red Hat Enterprise Linux Administration 9.0 RH124 pdf
Red Hat Enterprise Linux Administration 9.0 RH124 pdfRed Hat Enterprise Linux Administration 9.0 RH124 pdf
Red Hat Enterprise Linux Administration 9.0 RH124 pdf
mdfkobir
 
Generative-AI-a-boost-for-operations-Presentation.pdf
Generative-AI-a-boost-for-operations-Presentation.pdfGenerative-AI-a-boost-for-operations-Presentation.pdf
Generative-AI-a-boost-for-operations-Presentation.pdf
Aries716858
 
OME754 – INDUSTRIAL SAFETY - unit notes.pptx
OME754 – INDUSTRIAL SAFETY - unit notes.pptxOME754 – INDUSTRIAL SAFETY - unit notes.pptx
OME754 – INDUSTRIAL SAFETY - unit notes.pptx
shanmugamram247
 
Presentation python programming vtu 6th sem
Presentation python programming vtu 6th semPresentation python programming vtu 6th sem
Presentation python programming vtu 6th sem
ssuser8f6b1d1
 
readers writers Problem in operating system
readers writers Problem in operating systemreaders writers Problem in operating system
readers writers Problem in operating system
VADAPALLYPRAVEENKUMA1
 
Distillation-1.vapour liquid equilibrium
Distillation-1.vapour liquid equilibriumDistillation-1.vapour liquid equilibrium
Distillation-1.vapour liquid equilibrium
RjKing12
 
Sustainable construction is the use of renewable and recyclable materials in ...
Sustainable construction is the use of renewable and recyclable materials in ...Sustainable construction is the use of renewable and recyclable materials in ...
Sustainable construction is the use of renewable and recyclable materials in ...
RohitGhulanavar2
 
API-1150WB-Cooling Towers.pdf with details
API-1150WB-Cooling Towers.pdf with detailsAPI-1150WB-Cooling Towers.pdf with details
API-1150WB-Cooling Towers.pdf with details
MuhammadUsmanAsghar4
 
Adv. Digital Signal Processing LAB MANUAL.pdf
Adv. Digital Signal Processing LAB MANUAL.pdfAdv. Digital Signal Processing LAB MANUAL.pdf
Adv. Digital Signal Processing LAB MANUAL.pdf
T.D. Shashikala
 
Probability and Statistics by sheldon ross (8th edition).pdf
Probability and Statistics by sheldon ross (8th edition).pdfProbability and Statistics by sheldon ross (8th edition).pdf
Probability and Statistics by sheldon ross (8th edition).pdf
utkarshakusnake
 
The Control of Relative Humidity & Moisture Content in The Air
The Control of Relative Humidity & Moisture Content in The AirThe Control of Relative Humidity & Moisture Content in The Air
The Control of Relative Humidity & Moisture Content in The Air
Ashraf Ismail
 
Security Attacks and Solutions in Vehicular Ad Hoc Networks: A Survey
Security Attacks and Solutions in Vehicular Ad Hoc Networks: A SurveySecurity Attacks and Solutions in Vehicular Ad Hoc Networks: A Survey
Security Attacks and Solutions in Vehicular Ad Hoc Networks: A Survey
pijans
 
ISO 9001 - 2015 Quality Management Awareness.pdf
ISO 9001 - 2015 Quality Management Awareness.pdfISO 9001 - 2015 Quality Management Awareness.pdf
ISO 9001 - 2015 Quality Management Awareness.pdf
InfoDqms
 
charting the development of the autonomous train
charting the development of the autonomous traincharting the development of the autonomous train
charting the development of the autonomous train
huseindihon
 
the potential for the development of autonomous aircraft
the potential for the development of autonomous aircraftthe potential for the development of autonomous aircraft
the potential for the development of autonomous aircraft
huseindihon
 

Recently uploaded (20)

carpentry-11-module-1.docx 1 identifying tools
carpentry-11-module-1.docx 1 identifying toolscarpentry-11-module-1.docx 1 identifying tools
carpentry-11-module-1.docx 1 identifying tools
 
Santa Barbara City College degree offer diploma Transcript
Santa Barbara City College degree offer diploma TranscriptSanta Barbara City College degree offer diploma Transcript
Santa Barbara City College degree offer diploma Transcript
 
杨洋李一桐做爱视频流出【网芷:ht28.co】国产国产午夜精华>>>[网趾:ht28.co】]<<<
杨洋李一桐做爱视频流出【网芷:ht28.co】国产国产午夜精华>>>[网趾:ht28.co】]<<<杨洋李一桐做爱视频流出【网芷:ht28.co】国产国产午夜精华>>>[网趾:ht28.co】]<<<
杨洋李一桐做爱视频流出【网芷:ht28.co】国产国产午夜精华>>>[网趾:ht28.co】]<<<
 
Tutorial on MySQl and its basic concepts
Tutorial on MySQl and its basic conceptsTutorial on MySQl and its basic concepts
Tutorial on MySQl and its basic concepts
 
Technical Seminar of Mca computer vision .ppt
Technical Seminar of Mca computer vision .pptTechnical Seminar of Mca computer vision .ppt
Technical Seminar of Mca computer vision .ppt
 
Red Hat Enterprise Linux Administration 9.0 RH124 pdf
Red Hat Enterprise Linux Administration 9.0 RH124 pdfRed Hat Enterprise Linux Administration 9.0 RH124 pdf
Red Hat Enterprise Linux Administration 9.0 RH124 pdf
 
Generative-AI-a-boost-for-operations-Presentation.pdf
Generative-AI-a-boost-for-operations-Presentation.pdfGenerative-AI-a-boost-for-operations-Presentation.pdf
Generative-AI-a-boost-for-operations-Presentation.pdf
 
OME754 – INDUSTRIAL SAFETY - unit notes.pptx
OME754 – INDUSTRIAL SAFETY - unit notes.pptxOME754 – INDUSTRIAL SAFETY - unit notes.pptx
OME754 – INDUSTRIAL SAFETY - unit notes.pptx
 
Presentation python programming vtu 6th sem
Presentation python programming vtu 6th semPresentation python programming vtu 6th sem
Presentation python programming vtu 6th sem
 
readers writers Problem in operating system
readers writers Problem in operating systemreaders writers Problem in operating system
readers writers Problem in operating system
 
Distillation-1.vapour liquid equilibrium
Distillation-1.vapour liquid equilibriumDistillation-1.vapour liquid equilibrium
Distillation-1.vapour liquid equilibrium
 
Sustainable construction is the use of renewable and recyclable materials in ...
Sustainable construction is the use of renewable and recyclable materials in ...Sustainable construction is the use of renewable and recyclable materials in ...
Sustainable construction is the use of renewable and recyclable materials in ...
 
API-1150WB-Cooling Towers.pdf with details
API-1150WB-Cooling Towers.pdf with detailsAPI-1150WB-Cooling Towers.pdf with details
API-1150WB-Cooling Towers.pdf with details
 
Adv. Digital Signal Processing LAB MANUAL.pdf
Adv. Digital Signal Processing LAB MANUAL.pdfAdv. Digital Signal Processing LAB MANUAL.pdf
Adv. Digital Signal Processing LAB MANUAL.pdf
 
Probability and Statistics by sheldon ross (8th edition).pdf
Probability and Statistics by sheldon ross (8th edition).pdfProbability and Statistics by sheldon ross (8th edition).pdf
Probability and Statistics by sheldon ross (8th edition).pdf
 
The Control of Relative Humidity & Moisture Content in The Air
The Control of Relative Humidity & Moisture Content in The AirThe Control of Relative Humidity & Moisture Content in The Air
The Control of Relative Humidity & Moisture Content in The Air
 
Security Attacks and Solutions in Vehicular Ad Hoc Networks: A Survey
Security Attacks and Solutions in Vehicular Ad Hoc Networks: A SurveySecurity Attacks and Solutions in Vehicular Ad Hoc Networks: A Survey
Security Attacks and Solutions in Vehicular Ad Hoc Networks: A Survey
 
ISO 9001 - 2015 Quality Management Awareness.pdf
ISO 9001 - 2015 Quality Management Awareness.pdfISO 9001 - 2015 Quality Management Awareness.pdf
ISO 9001 - 2015 Quality Management Awareness.pdf
 
charting the development of the autonomous train
charting the development of the autonomous traincharting the development of the autonomous train
charting the development of the autonomous train
 
the potential for the development of autonomous aircraft
the potential for the development of autonomous aircraftthe potential for the development of autonomous aircraft
the potential for the development of autonomous aircraft
 

Lecture_4_Data_Gathering_and_Analysis.pdf

  • 1. 1 UNIVERSITY OF PRETORIA Session Four: Techniques for Data Gathering and Analysis Professor David Walwyn
  • 2. Data-Gathering Techniques • Perusal (Library research) – data mining (Existing published data, databases, statistics, records, etc.) • Measurement (Experimental and field research) • Observation (Field research) – case studies • Questioning (Field research) – interviews, surveys, questionnaires. 2
  • 3. Desirable Attributes of Techniques • Epistemic – Reliability, validity, objectivity • Constituent – Feasibility, sensitivity • Teleological – Purposeful, coherent, Ethics and Integrity • Axiological – Ethics, integrity, non-maleficence, beneficence 3
  • 4. Data Gathering Spectrum • The closer the research is to the human subject, the more likely that the research will be subjective • QUANT/positivists are concerned with the description of phenomena and causality – a priori research design – internet survey • QUAL/phenomenologists are interested in the participants experience of these phenomena – emergent design is possible (adaptive process in response to the progress and results of the research) – interview, focus groups and direct observation 4
  • 5. Nature of Measurement • Measurement involves the construction of variables • Very often, what we are attempting to measure cannot be measured directly – as a consequence we have to use indicators, proxy or surrogate variables (same meaning) • Different levels of measurement – nominal, ordinal, interval, absolute, etc. – important for statistical analysis 5
  • 6. Numerical Representation • Representation of variables in coded form • Nominal Scale – Numbers used for representation without implication of importance/order/scale (1 = female; 2 = male) • Ordinal Scale – Numbers are used for classification and does convey order but not exact values (low = 1, medium = 2, high = 3) • Interval Scale – Numbers are equidistant but do not necessarily have a true zero point (deg C) • Ratio Scale – Numbers are equidistant and have a true zero point (deg K) value). 6
  • 7. Summary of Numerical Representations 7 Ratio Interval Even Proportions Property Ordinal Equal Differences Equal Differences Nominal Hierarchy Hierarchy Hierarchy Distinction Distinction Distinction Distinction Warm or Cold Cold < Mild < Warm < Hot Centigrade Kelvin Descriptor
  • 8. Measurement 2 • Variables must first be defined conceptually (the concept the variable is attempting to capture) and operationally (how the variable will be measured in practise) • All measuring and data collection procedures must be based on systematic observation replicable by independent observers (unless phenomenological) Variable Conceptual definition Operational definition Industrial innovativeness Introduction of new and improved products, processes and services to the market Number of innovations/annum Percentage of sales from innovated products or services
  • 9. Questionnaire Design • The first step in designing a questionnaire is to create a conceptual model – your framework of analysis which has been derived from your synthesis of the literature and your understanding of the important variables/the relationships between these • The second step is to produce the questionnaire – the introduction, the statement of informed consent, the questions and responses, and the ‘look and feel’ format – you could use SurveyMonkey (www.surveymonkey.com), Google Sheets or http://kwiksurveys.com/ or http://freeonlinesurveys.com/) • The third step is to pre-test the questionnaire.
  • 10. Process for Questionnaire Design Research Problem Research Questions and Hypotheses Research Variables Questionnaire questions
  • 11. Welman’s Considerations • Maintain neutrality • Choose judiciously between open-ended and closed- ended (multiple choice variety) questions – closed ended are easier to analyse and have generally higher repeatability • Be brief and focused • Take the respondent’s literacy level into consideration – if necessary define the key terms in the introduction to the questionnaire 11
  • 12. Welman 2 • Use a logical and justified sequence • Use branching to ensure that respondents do not have to read through irrelevant material • Layout is important; strive for a clear and aesthetically pleasing layout • Be careful not to offend • PRESENT A WELL-CRAFTED COVER LETTER 12
  • 13. Interviews • Structured Interviews – Standardised fixed-format questionnaire – Can be conducted by trained interviewers • Semi-Structured Interviews • Unstructured Interviews – Collection of qualitative information • Focus Group Interviews – Interview of a small (3-12 person) group – Open-ended questions • Preparing for the interview – analyse the research problem and questions – understand what information must be obtained – identify those who can provide the information 13
  • 14. Types of Surveys • Postal mail surveys • Telephone surveys • Electronic surveys – E-mail surveys – Internet surveys • Direct personal surveys
  • 15. Improving the Confidence Level of Low Response Rate Surveys • The problem with a low response rate (<30%) is that the survey becomes a self-selected volunteer (non-probabilistic) survey • Non-response survey – Select a random sample of non-respondents – Measure only a few selected key variables – Ensure high response rate (>70%) – Comparison of results and statistical testing can confirm validity of main survey. – alternatively compare the immediate and the late responses; should be the same in terms of content if there is no bias
  • 16. Coding the Data • Data analysis is most commonly done with statistical tests • Statistical testing can only be done with numerical data • Textual data has to be converted to numerical data to enable statistical testing • This is done by means of a process called Data Coding
  • 17. Steps in Data Management • Prepare the data collection instrument and collect the data • Prepare the data codebook • Prepare the data matrix worksheets • Prepare instructions for data entry and data analysis
  • 18. • For each variable provide the following information in the codebook: – Variable name (e.g. “PARTICIP”) – Variable definition (e.g., “Participant name”) – Position (Column number in worksheet, e.g., “1” ) – Measurement Level (e.g., “Nominal”) – Code for missing values (e.g., “-1”) – Coding (e.g., “Value – Label” Table) The Data Codebook
  • 19. Steps in Data Entry • Prepare the data matrix worksheets • Provide one column for each variable and one row for each record • Label the columns and rows • Enter data using instructions for data entry and code book
  • 20. Class Exercise • Listen to the podcast from This American Life – (18.00 to 28.00) • Analyse the interview based on this framework – was informed consent considered? – was the interview structured or unstructured? – did the interviewer let the informant lead? – was the researcher ethical? – did the researcher explain the purpose of the interview? – other comments? • Another interview: – https://www.youtube.com/watch?v=6PhcglOGFg8
  • 21. 21 Data Analysis and Hypothesis Testing • Qualitative data analysis • Quantitative data analysis – Overview of statistical analysis – Descriptive statistical analysis – Inferential statistical analysis • includes hypothesis testing – Statistical modelling • Parametric vs. Non-Parametric
  • 22. Qualitative Data Analysis • Qualitative data analysis is defined as “the non- numerical examination and interpretation of observations, for the purpose of discovering underlying meanings and patterns of relationships”. – more common in humanities and social sciences – based on language and observation • Some techniques – data ‘coding’ and display (ATLAS.ti) • network display – theme identification • discourse and content analysis 22
  • 23. Defining the Codes (CAQDAS) • Creating the codes and sub-codes, including the hierarchy and networks, is the difficult part (coding itself is quite mechanical) • Must start with your conceptual model and the questionnaire – example of behavioural change • Categories of codes can be – organisational – substantive or descriptive – theoretical or abstract 23
  • 24. Determinants of Behavioural Change 24 Stage 1 Unaware of Issue Stage 5 Decided to Act Stage 6 Acting Stage 4 Decided Not to Act Stage 2 Aware of, but Unengaged, by Issue Stage 3 Undecided about Acting Stage 7 Maintenance
  • 25. ATLAS.ti Codes • 1st step ; creation of a ‘hermeneutic unit’ • 2nd step; importing text files • 3rd step; defining the codes • 4th step; coding the data • 5th step; collating the coded material • 6th step; constructing the results section based on the actual material • 7th step; discussion and interpretation of the results 25
  • 26. ATLAS.ti Network 26 Adoption of Specific Measures Risk and Vulnerability Main Disease Threats Introduce New Measures Barriers to Response Camp Facilities Mitigation and Response Policy Fit Challenge to Specific Measures
  • 27. Example of Coding 27 P 7: 2.1.docx - 7:10 [It’s very good, we’ve got pre-..] (107:107) (Super) Codes: [Camp Facilities] It’s very good, we’ve got pre-fab accommodation. Concrete floors, porcelain toilets and hand basins, running hot and cold water, glass- enclosed showers, air-conditioning everywhere, windows with fly netting on, lockable doors, just like a little bungalow at a university hostel. Double bed, file cabinet, lamp.
  • 28. Quantitative Data Analysis • Quantitative data analysis is defined as the “numerical representation and manipulation of observations for the purpose of describing and explaining the phenomena that those observations reflect” 28
  • 29. Content Analysis of Inaugural Address • See Times analysis of Trump’s speech – what type of analysis has been used? • secondary or primary? • content or syntax? • Is the speech plagiarism? • Do you agree that “protectionism will make America great again?” 29
  • 30. Initial or Exploratory Data Analysis • Code the data for each variable (previous lecture) • Enter the data into a spreadsheet • Define the measurement scale for each variable (normalised, adjusted, etc.) • Check the data and correct errors • ‘Illustrate’ the data (descriptive statistics) if possible • Produce range of graphs to check for correlations 30
  • 31. Descriptive Statistical Analysis • Objectives: observational decision making • Outcome: condenses large volumes of data into a few summary measures – such as mean, variance, frequency • Interpreted by an inspection of the summary measures – leads to decisions by observations – guide to inferential statistics 31
  • 32. Inferential Statistics • Objective: proof of causality • For a causal relationship to exist between two variables, four conditions must exist: – time order: the cause must exist before the effect – co-variation: a change in the cause produces a change in the effect – rationale: reasonable explanation of why they are related – non-spuriousness: no other (rival) cause for the effect found • Causality can never be proven absolutely, we can only show to what degree it is probable! – e.g. global warming debate and IPCC 32
  • 34. Statistical Modelling • Objective: extrapolate and predict • Method: builds models showing relationships between variables • Many different types of models – continuous – discreet – stochastic 34
  • 35. Statistics and Rules • Rutherford: “if you need statistics, you did the wrong experiment” • “Always consult a statistician; but know enough about statistics to view the advice critically” • My objective … – give an overview with reference to detailed information – more details available from “Fundamentals of Biostatistics” (Rosner) and other voluminous texts 35
  • 36. Summary Table of Statistical Tests (Plonskey, 2001) Type of Variables Sample Characteristics (Bivariate) Correlation (Multivariate) 1 Sample Population 2 Samples K Samples Independent Dependent Independent Dependent Categorical or Nominal chi2 or binomial chi2 Macnarmar’s chi2 chi2 Cochran’s Q Rank or Ordinal Mann Whitney U Wilcoxin Matched Pairs Signed Ranks Kruskal Wallis H Friendman’s ANOVA Spearman’s rho Interval & Ratio Z test or t test t test between groups t test within groups 1 Way ANOVA between groups 1 Way ANOVA Pearson’s r Factorial (2 Way) ANOVA
  • 37. Numerical Representation 2 • Response scales (types of ordinal scales): – Rating scales – Attitudinal scales – Ranking scales – Likert scale (!) • Strongly disagree • Disagree somewhat • Neutral • Agree somewhat • Strongly agree 37
  • 38. Types of Data Analysis • Univariate – an expression, equation, function or polynomial of only one variable (e.g. age distribution) – mostly descriptive (frequency, standard deviation, median) • Multivariate – objects of any of these types but involving more than one variable is called multivariate (e.g. x/y/z) – mostly explanatory • Very familiar to natural scientists/engineers! 38 Wikipedia, 2013
  • 39. Stats Software • For example, Moonstats • Load Health.Mon and Study.Mon – Variables • Sex, matric mark, hours of study per week and attitude – Univariate analysis • Frequency distribution • Descriptives (mean, std dev, min/max, median) – Bivariate analysis • Correlation (Pearson and Spearman) • Chi Squared • T Test – Are you less likely to fail if you have a positive attitude? 39
  • 40. Uses for Inferential Statistics • Determining differences between experimental and control groups in experimental research • In descriptive research when comparisons are made between different groups • Enable the researcher to evaluate the effects of an independent variable on a dependent variable • Differences between a sample statistic and a population parameter because the sample is not perfectly representative of the population
  • 41. Chi Square and t Tests • Chi Square is a nonparametric test used with nominally scaled data which are common with survey research – statistic is used when the researcher is interested in the number of responses, objects, or people that fall in two or more categories • Similarly t test is used to determine if two sets of data are significantly different from each other
  • 42. Summary • Qualitative vs quantitative • In quantitative analysis, it is: – important to categorise variables – undertake an initial exploration using descriptive statistics – use the appropriate technique for inferential statistics – consult a statistician if possible, especially for complex datasets containing variables of different types 42
  • 43. Class Work: • Listen to the following YouTube clip on QUAL methods: – https://www.youtube.com/watch?v=opp5tH4uD-w • Answer the following questions: – what does it mean “code structure evolves inductively”? – why is coding an iterative process? 43