1) Qualitative analysis turns unstructured data such as texts and artifacts into a detailed description of the important aspects of a situation or problem, through three stages: collecting relevant data, identifying descriptive properties and dimensions, and using that knowledge to better understand the original substance.
2) Content analysis is an in-depth process of developing a representative description of unstructured input such as text or media content; it uses both quantitative and qualitative techniques and can generate new knowledge.
3) Coding schemes assign categories and descriptors to blocks of text, which then serve as "categories for data"; codes can be emergent (derived from the data) or a priori (based on existing theory).
2. Qualitative analysis
• Turn unstructured data
– texts, other artifacts…
• into a detailed description
– about the important aspects of the situation / problem
3. Objective
• Three stages of Qualitative analysis
1. Start with a data set containing information about our problem
2. Find relevant descriptive properties and dimensions
3. Use the knowledge gained to better understand the original substance
4. Content analysis
• Process of developing a representative description
– of text or other unstructured input.
• In-depth analysis that searches for theoretical interpretations
– might generate new knowledge
• Both Quantitative and Qualitative techniques can be used
5. Content
• Media Content
– Printed Publications
– Broadcast programs
– Websites
– Other recordings… (photo, film, music…)
• Audience Content
– Feedback collected from an audience group
– surveys, questionnaires, interviews…
6. For content analysis…
• Have a clear definition for the data set
• Define the population from which the data set is drawn
• Need to understand the specific context
– e.g., studying attitudes toward security procedures
– government employees vs. entertainment staff will differ
– Quite obvious…
9. Coding schemes
• What is “Coding”??
– assigning categories and descriptors to blocks of text
– Then, is it just paraphrasing and counting key words?
– Actually, no.
• Definition by Corbin & Strauss
– interacting with the data, comparing data, and so on, and in doing so,
– deriving concepts
• These concepts later serve as "categories for data"
10. Emergent & A Priori Coding
• Emergent Coding
– conducted without any theory or model to guide
– start by noting interesting concepts or ideas
– continually refine them
– until they form a coherent model (one that captures the important details)
• A Priori Coding
– uses an established theory or hypothesis
11. Emergent Coding
• based on Grounded Theory
• theory development from a continuous interplay between data collection and data analysis
12. Grounded Theory
• Consists of four stages
1. Open coding
2. Development of concepts
3. Grouping concepts into categories
4. Formation of a theory
13. Step 1 : Open Coding
• Analyze the text & identify any interesting phenomena
• How??
• Borrow the term from the data directly
– In vivo coding
• Or find an appropriate term to describe the instance
– Extract underlying meaning
14. Steps 2 ~ 4
Codes → [group (axial coding)] → Concepts → [group] → Categories → Theory
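The progression above can be sketched as nested groupings. All of the example codes, concepts, and the sample "theory" below are invented for illustration; they are not from the lecture.

```python
# Open coding produced these (invented) codes from interview text.
codes = ["reuses passwords", "writes passwords down",
         "skips software updates", "delays patches"]

# Axial coding: group related codes into concepts.
concepts = {
    "weak credential habits": ["reuses passwords", "writes passwords down"],
    "update avoidance": ["skips software updates", "delays patches"],
}

# Group concepts into a broader category.
categories = {"risky security behaviour": list(concepts)}

# A theory then explains the category, e.g.:
theory = "Users trade security for convenience when procedures add effort."

# Sanity check: every code is accounted for by some concept.
assert sorted(c for group in concepts.values() for c in group) == sorted(codes)
```

The point of the sketch is that each stage is a grouping of the previous stage's output, so the evidence trail from theory back to raw codes stays intact.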
15. Grounded Theory
• Pros
– Provides a systematic approach to analyzing qualitative data
– Theory can be backed up by ample evidence (made by “coding”)
– Procedure is quite intuitive to follow
• Cons
– Coders can be overwhelmed during the coding stage
– Hard to evaluate
– vulnerable to biases
16. A Priori Coding
• Uses Theoretical frameworks
– helps you identify the major categories,
– and items that need to be coded
• Important to study prior research to find a related theoretical framework
18. How can we code the “text”??
1. Look for specific items
2. Ask questions constantly about the data.
3. Make comparisons constantly
I. Compare instances under different coding categories
II. Compare the results b/w different participant groups
III. Compare with previously reported literature
• Additionally…
– record the code assignments
– iterate the process!!
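A minimal sketch of what "looking for specific items" can look like as an a priori coding pass. The coding scheme, its categories, and the keywords are all invented for illustration; a real scheme would come from a theoretical framework and explicit coding instructions.

```python
import re

# Hypothetical a priori coding scheme: category -> indicator keywords.
CODING_SCHEME = {
    "security_concern": {"password", "phishing", "breach"},
    "usability": {"easy", "convenient", "slow"},
}

def code_segment(segment, scheme):
    """Return every category whose indicator keywords appear in a segment."""
    words = set(re.findall(r"[a-z]+", segment.lower()))
    return sorted(cat for cat, kws in scheme.items() if words & kws)

segments = [
    "Changing my password every month is slow and annoying.",
    "The new badge reader is easy and convenient.",
]
codes = [code_segment(s, CODING_SCHEME) for s in segments]
# codes == [['security_concern', 'usability'], ['usability']]
```

Keyword matching alone is the "just counting key words" caricature the slides warn against; in practice the coder reads each hit in context, compares it with instances under other codes, and refines the scheme iteratively.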
20. Ensuring high-quality analysis
• Validity
– means that we used well-established and well-documented procedures to increase the accuracy of findings
• Reliability
– Consistency of results
– if (common data -> similar conclusions)
then reliable
21. Validity
• Face validity (a.k.a. content validity)
– a subjective validity criterion
– a measure has face validity if it "looks" like it measures what it is supposed to measure
• Criterion validity
– assesses how accurately a measure predicts a previously validated concept or criterion
• Construct validity
– applied if no valid criterion is available
– “What constructs account for variance in test performance??”
22. Reliability
• Intracoder reliability: the same coder produces consistent codes over time
• Intercoder reliability: different coders assign the same codes to the same data
• Important to develop a set of explicit coding instructions!!
• Measuring reliability
– percentage of agreement
– Cohen’s Kappa
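Both measures are straightforward to compute for two coders. A minimal sketch, assuming each coder's output is a list of code labels, one per text segment (the coder data below is invented):

```python
from collections import Counter

def percent_agreement(coder_a, coder_b):
    """Fraction of segments on which the two coders assigned the same code."""
    assert len(coder_a) == len(coder_b)
    matches = sum(a == b for a, b in zip(coder_a, coder_b))
    return matches / len(coder_a)

def cohens_kappa(coder_a, coder_b):
    """Cohen's kappa: observed agreement corrected for chance agreement."""
    n = len(coder_a)
    p_o = percent_agreement(coder_a, coder_b)
    # Expected chance agreement from each coder's marginal code frequencies.
    freq_a = Counter(coder_a)
    freq_b = Counter(coder_b)
    p_e = sum(freq_a[c] * freq_b.get(c, 0) for c in freq_a) / (n * n)
    return (p_o - p_e) / (1 - p_e)

a = ["pos", "pos", "neg", "neg"]
b = ["pos", "neg", "neg", "neg"]
# percent_agreement(a, b) == 0.75, cohens_kappa(a, b) == 0.5
```

Kappa is lower than raw agreement here because two coders using the same few codes will agree on some segments purely by chance; kappa discounts that chance component.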
24. Multimedia information
• Similar to analyzing text content, but…
• Image, audio, and video data:
– need to be coded for specific instances (a specific event or sound)
– should the researcher go through hours of audio or video??
– Extremely time-consuming!!
• Needs Automated tools
25. Multimedia analysis
• Multimedia data should be annotated or labeled with text
• Why?? To enable text-based information retrieval (IR).
• Manual Annotation
– labor intensive
• Completely Automated Annotation
– semantic gap between low-level features and high-level concepts
– highly error-prone
• Partially Automated Annotation
– machine learning (ML) assists a human annotator
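One way to read "partially automated": a program suggests a label for each new clip and the human annotator confirms or corrects it. The sketch below is purely illustrative; the token-overlap similarity over transcripts is a stand-in for real ML features, and every name and example string is invented.

```python
def suggest_label(transcript, labeled_examples):
    """Suggest a label for a new clip's transcript by token overlap
    with previously human-labeled transcripts (a toy similarity measure)."""
    tokens = set(transcript.lower().split())

    def overlap(example):
        text, _label = example
        return len(tokens & set(text.lower().split()))

    _best_text, best_label = max(labeled_examples, key=overlap)
    return best_label

# Previously human-annotated clips (invented examples).
labeled = [
    ("crowd cheering loudly", "applause"),
    ("speaker explaining the slides", "lecture"),
]

suggestion = suggest_label("the speaker explaining results", labeled)
# suggestion == "lecture"; the annotator then confirms or corrects it.
```

Keeping the human in the loop sidesteps the semantic gap: the machine only proposes, so an error costs a correction rather than a mislabeled dataset.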
26. Multimedia analysis
• Other tools for multimedia content analysis
– ex) PhotoSpread System (Kandel et al.)
– allows users to organize and analyze photos via a spreadsheet