Some Methodological Thoughts on Using Text Mining for Frame Analysis of Media Content
Computer Support for Frame Analysis of Media Content:
Some Methodological Thoughts
Yuwei Lin, 15 Jan 2009, MeCCSA 2009, Bradford, UK
Yuwei Lin
ESRC National Centre for e-Social Science,
University of Manchester
http://www.ncess.ac.uk
Acknowledgement
JISC-funded 18-month TMFA Project:
Using Text Mining for Frame Analysis of Media
Content
Key members: Sophia Ananiadou, June Finch,
Peter Golding, Peter Halfpenny, Thomas Koenig,
Yuwei Lin, Elisa Pieri, Rob Procter, Brian Rea,
Farida Vis, Davy Weissenbacher (in alphabetical
order)
Outline
An STS-informed
paper on the impact
of computerisation on
doing social research
Challenges of frame
analysis
Text mining
technologies
Some methodological
issues
Concluding remarks
Challenges of Frame Analysis
Labour-intensive manual coding
Error-prone, biased, subjective/interpretative
Non-scalable (small corpora): difficult to deal
with increasingly large amounts of data
Solutions: More analysts (but low inter-coder
reliability) or Computerising the analysis
Trend of bridging the long-standing tension
between quantitative (statistics) and
qualitative (meanings) methods
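The "low inter-coder reliability" point above can be made concrete with an agreement statistic. The following is a minimal sketch, assuming two coders have each assigned one frame label per article and that Cohen's kappa is an acceptable measure; the frame labels and the function name cohen_kappa are hypothetical illustrations, not data or tooling from the TMFA project.

```python
from collections import Counter


def cohen_kappa(coder_a, coder_b):
    """Cohen's kappa: agreement between two coders, corrected for chance."""
    if len(coder_a) != len(coder_b) or not coder_a:
        raise ValueError("both coders must label the same non-empty set of items")
    n = len(coder_a)
    # Observed agreement: share of items given the same label by both coders.
    observed = sum(a == b for a, b in zip(coder_a, coder_b)) / n
    # Chance agreement: product of each coder's marginal label proportions.
    freq_a = Counter(coder_a)
    freq_b = Counter(coder_b)
    expected = sum(freq_a[label] * freq_b[label] for label in freq_a) / (n * n)
    if expected == 1.0:
        # Both coders used a single identical label throughout.
        return 1.0
    return (observed - expected) / (1 - expected)


# Hypothetical frame labels for six articles (illustrative only).
coder_a = ["conflict", "economic", "conflict", "human interest", "economic", "conflict"]
coder_b = ["conflict", "economic", "economic", "human interest", "economic", "human interest"]

kappa = cohen_kappa(coder_a, coder_b)
print(f"kappa = {kappa:.2f}")
```

Here the coders agree on four of six articles, yet kappa is noticeably lower than the raw agreement rate because part of that agreement would be expected by chance.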
Text-mining
the opportunity of processing large amounts of
textual data systematically, reducing human
errors, and saving time
the potential to at least partly automate the
generation of frames
add-on feature to Computer-Assisted Qualitative
Data Analysis Software (CAQDAS) packages
Some Methodological Thoughts
Does corpus size matter?
Conceptual validity and generalisability
Whose interpretations / what assumptions?
Conceptual validity and generalisability
Levels of meanings
Conceptual validity and generalisability
Standardisation of units of measurement
Clarity and transparency in doing analysis
Does corpus size matter?
Corpus building: small but focused, or
large but noisy and indiscriminate
What to include in a corpus?
How to reduce noise in raw data?
Where does human interpretation end?
Does corpus size have any impact on the
conceptual validity and generalisability of
frames? (quantitative or qualitative)
Whose interpretations and
what assumptions?
Manual coding results may be subjective,
interpretative, biased.
In the case of computer-supported
analysis:
which text mining algorithms/techniques to
adopt?
based on which techniques (statistical ones?)
and on which training datasets?
These techniques and corpora reflect certain
interpretations, assumptions and world views.
Levels of meanings in frames
Diversity in doing frame analysis and
different definitions of frames
Various types of frames: multiple frames,
overlapping frames, frames that shape
people's actions and their involvement in
everyday activities (Goffman)
How applicable are the lexical
frames extracted by text mining
techniques?
Standardisation of Units of
Measurement
Frames exist at different levels: words,
sentences, paragraphs, articles (units of
measurement)
Interpretative flexibility in manual coding
Debate on clarity and transparency
Text mining is systematising and
standardising the units of measurement.
More objective? More reliable? More biased?
More transparent or more black-boxed?
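To illustrate why the choice of unit matters, the sketch below counts how often a frame appears when the same text is segmented into words, sentences, paragraphs, or the whole article. It assumes a frame can be approximated by a small set of lexical markers (FRAME_TERMS is a hypothetical list) and uses naive regular-expression splitting. It is not the text mining pipeline used in the project.

```python
import re

# Hypothetical lexical markers standing in for a "crisis" frame.
FRAME_TERMS = {"crisis", "threat"}


def tokens(text):
    """Lower-cased word tokens (naive; assumes plain English text)."""
    return re.findall(r"[a-z']+", text.lower())


def split_units(article):
    """Segment one article at four candidate units of measurement."""
    paragraphs = [p.strip() for p in re.split(r"\n\s*\n", article) if p.strip()]
    sentences = [
        s.strip()
        for p in paragraphs
        for s in re.split(r"(?<=[.!?])\s+", p)
        if s.strip()
    ]
    return {
        "word": tokens(article),
        "sentence": sentences,
        "paragraph": paragraphs,
        "article": [article],
    }


def frame_share(article, terms=FRAME_TERMS):
    """Share of units, at each level, containing at least one frame term."""
    shares = {}
    for level, units in split_units(article).items():
        hits = sum(1 for unit in units if terms & set(tokens(unit)))
        shares[level] = hits / len(units) if units else 0.0
    return shares


article = (
    "The banking crisis deepened today. Ministers met in London.\n\n"
    "Shoppers were calm. Prices stayed flat."
)
shares = frame_share(article)
for level, share in shares.items():
    print(f"{level:9s} {share:.2f}")
```

In this toy text the same single marker yields a frame share of about 0.07 at word level, 0.25 at sentence level, 0.50 at paragraph level and 1.00 at article level, so a fixed unit makes results comparable but also builds in an analytical decision.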
Concluding remarks
Labour-intensive, error-prone manual coding
CAQDAS (particularly those with text mining)
Methodological issues posed by
computerisation:
Does corpus size matter?
Whose interpretations and what assumptions?
Levels of meanings in frames
Standardisation of units of measurement
Depending on research questions & contexts