Tool criticism

  1. Tool Criticism
     Marijn Koolen (Huygens Institute for the History of the Netherlands)
     Tools & Methods guest lecture, 2021-03-02, Groningen
  2. Your Interests?
     ● Python/Jupyter to do:
       ○ GIS, plotting locations on Google Maps
       ○ Machine learning, (un)supervised learning, visualisation techniques
       ○ Mining social media data
       ○ TF*IDF, information processing, word embedding models
       ○ Statistics / JASP
     ● Questions:
       ○ Why interested?
       ○ What would you like to do with this?
  3. Online Resources
     ● Online (note)books for DH and Python
     ● Generic Jupyter:
       ○ https://jupyter4edu.github.io/jupyter-edu-book/
       ○ https://programminghistorian.org/en/lessons/jupyter-notebooks
     ● Specific methods and techniques:
       ○ Cultural Analytics: https://melaniewalsh.github.io/Intro-Cultural-Analytics/welcome.html
         ■ Includes TF*IDF, tweet mining and analysis, geocoding
       ○ Named Entity Recognition: http://ner.pythonhumanities.com/intro.html
       ○ Deep Learning: https://course.fast.ai
       ○ NLP, traditional and deep learning: https://www.fast.ai/2019/07/08/fastai-nlp/
         ■ Includes word embeddings, sentiment analysis, topic modelling, classification, …
       ○ GLAM Workbench: https://glam-workbench.github.io
         ■ Retrieving and analysing data from Galleries, Libraries, Archives, Museums
  4. Guiding Questions
     ● Starting point: (digital) source criticism
       ○ Method / approach in the humanities and specifically in historical research (cf. Fickers, 2012)
       ○ Internal source criticism: content of the document
       ○ External source criticism: metadata of the document (context)
         ■ Who created the document?
         ■ What kind of document is it?
         ■ Where was it made and distributed?
         ■ When was it made?
         ■ Why was it made?
     ● Digital tool criticism
       ○ What makes digital tool criticism different from digital source criticism?
       ○ Tool hermeneutics: what was its intended use? Does that align with my intended use? How does it affect the digital sources/data it operates on?
  5. Recommendations
     For researchers:
     - Incorporate digital source, data and tool criticism in the research process
     - Explicitly ask and answer questions about assumptions, choices, limitations
     - Document and share workarounds
     - Look for “About” pages and documentation on:
       - Functionalities, configurations, parameter choices
       - Selection criteria and transformations of data sets
     - Develop a method of experimentation with a tool to test its functioning
     - Look under the hood to develop better intuitions and grow your conceptual toolbox
       - E.g. how can you test if a search engine filters stopwords or does linguistic normalization?
  6. Model: Reflection as Integrative Practice
     Koolen, M., van Gorp, J., & van Ossenbruggen, J. (2018). Toward a model for digital tool criticism: Reflection as integrative practice. Digital Scholarship in the Humanities. https://academic.oup.com/dsh/advance-article/doi/10.1093/llc/fqy048/5127711
  7. Role of Reflection
     ● Reflection in action
       ○ The process is often unpredictable and uncertain (Schön 1983, p. 40)
       ○ Some actions, recognitions and judgements we carry out spontaneously, without thinking about them (p. 54)
       ○ Use reflection to criticize tacit understanding grown from repetitive experiences (p. 61)
     ● This fits certain aspects of scholarly practice
       ○ E.g. searching, browsing, selecting using various information systems (digital archives and libraries, catalogs and other databases)
       ○ But information systems already apply pre-selection, which is rarely well documented (digital source criticism!)
  8. Research Design as Wicked Problem
     ● Wicked problem
       ○ Design-theory concept: a problem that is inherently ill-defined (Rittel in Churchman 1967)
       ○ Working towards a solution changes the nature of the problem
     ● Humanities research is designed iteratively (Bron et al. 2016)
       ○ Impossible to plan where the investigation takes you
       ○ Engagement with research materials shifts the goal posts
       ○ This affects the appropriateness of the design for the RQ
     ● User-friendliness of digital tools exacerbates the problem
       ○ Graphical user interfaces (GUIs) often hide relevant data transformations and manipulations
       ○ Difficult to look under the hood
       ○ Requires an active reflective attitude
  9. Entanglement of Data and Tools
  10. Entanglement of Data and Tools
     Each step changes the underlying data!
  11. Tools or Methods?
     ● How to address tool criticism questions
       ○ Focus on research methods
     ● E.g. Social Network Analysis (SNA)
       ○ Understand concepts, techniques and applications of SNA before assessing SNA tools
       ○ How many of you have used SNA tools? How many of you want to use them?
       ○ Gephi or NetworkX (Python library)
     ● Before you ask…
       ○ Which layout algorithm should I use?
       ○ Which community detection algorithm should I use? What parameters are good?
     ● … understand core concepts:
       ○ Nodes, edges, link degrees, paths, connected components
       ○ Modularity, bridges, weak ties
       ○ Completeness, impact of missing data
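Before reaching for Gephi, the core concepts can be rehearsed in a few lines of plain Python. A minimal sketch with a hypothetical toy network (the node names and edges are invented for illustration; libraries like NetworkX compute the same quantities for you):

```python
from collections import deque

# Toy undirected network as an adjacency dict (hypothetical example data).
edges = [("ann", "bob"), ("bob", "cai"), ("ann", "cai"), ("dia", "eli")]
adjacency = {}
for a, b in edges:
    adjacency.setdefault(a, set()).add(b)
    adjacency.setdefault(b, set()).add(a)

def degree(node):
    """Number of edges touching a node (its link degree)."""
    return len(adjacency[node])

def connected_component(start):
    """All nodes reachable from `start`, found by breadth-first search."""
    seen, queue = {start}, deque([start])
    while queue:
        node = queue.popleft()
        for neighbour in adjacency[node]:
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append(neighbour)
    return seen

print(degree("ann"))                       # 2
print(sorted(connected_component("ann")))  # ['ann', 'bob', 'cai']
```

Note how the network splits into two components: {ann, bob, cai} and {dia, eli}. If the dia–eli edge were missing from your data (completeness!), eli would vanish from the analysis entirely.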
  12. Source: https://towardsdatascience.com/generating-twitter-ego-networks-detecting-ego-communities-93897883d255
  13. TF*IDF
     ● Term Frequency * Inverse Document Frequency
       ○ Used in many methods and tools
       ○ What was TF*IDF originally intended for?
     ● Again, start from the method
       ○ Natural language processing, information theory
       ○ Concepts: Zipf’s law, tokenisation, stopwords, stemming, lemmatisation, part-of-speech, mutual information
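The raw formula is small enough to write out by hand, which makes it easier to see what a tool does with it. A minimal sketch over a hypothetical three-document toy corpus, using the common tf × log(N/df) weighting (real tools differ in their exact normalisation and smoothing):

```python
import math

# Hypothetical toy corpus; each "document" is a list of tokens.
docs = [
    "the cat sat on the mat".split(),
    "the dog sat on the log".split(),
    "cats and dogs".split(),
]

def tf_idf(term, doc, corpus):
    """tf * log(N / df); only call for terms that occur somewhere in the corpus."""
    tf = doc.count(term) / len(doc)                  # term frequency in this doc
    df = sum(1 for d in corpus if term in d)         # document frequency
    idf = math.log(len(corpus) / df)                 # inverse document frequency
    return tf * idf

print(round(tf_idf("the", docs[0], docs), 2))  # ≈ 0.14: frequent but widespread
print(round(tf_idf("cat", docs[0], docs), 2))  # ≈ 0.18: rare, so IDF boosts it
```

Even though "the" occurs twice in the first document and "cat" only once, "cat" gets the higher score, because "the" appears in two of the three documents and is discounted by its IDF.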
  14. Hands On with TF*IDF
     ● I’ve prepared a Jupyter notebook that demonstrates the workings of TF*IDF
       ○ Using social media data (tweets and online reviews)
       ○ With 7 questions to reflect on its details
     ● Break-out groups
       ○ Open the notebook and discuss the questions (take 20 mins.)
       ○ Afterwards we discuss your observations and your own questions
       ○ Also look at the Wikipedia page on TF*IDF: https://en.wikipedia.org/wiki/Tf-idf
  15. Text Mining in Tweets
     ● Text mining of tweets (and other short records)
       ○ Tweets are peculiar textual representations
         ■ Minimal amount of text, low redundancy
         ■ The majority of terms occur only once
       ○ Which part of TF*IDF contributes more to the TF*IDF score of a tweet?
       ○ Consequences for ranking/clustering/mining?
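The point about single-occurrence terms can be checked directly. In the hypothetical tweets below, every term occurs exactly once, so TF is the same constant (1/length) for all terms in a tweet, and any differences in TF*IDF scores must come from the IDF side:

```python
from collections import Counter

# Hypothetical tweets; in texts this short almost every term occurs once.
tweets = [
    "great coffee at the new cafe".split(),
    "train delayed again this morning ugh".split(),
]

for tweet in tweets:
    counts = Counter(tweet)
    # Every term occurs exactly once, so TF = 1/len(tweet) across the board:
    # ranking terms within a tweet by TF*IDF reduces to ranking them by IDF.
    print(max(counts.values()))  # 1
```

For ranking, clustering or mining tweets this means the collection statistics (IDF) do nearly all the work, while in longer documents such as reviews, repeated terms let TF discriminate as well.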
  16. Sentiment Analysis and Emotion Lexicons
     ● Resources
       ○ NRC EmoLex: 8 basic emotions (https://saifmohammad.com/WebPages/NRC-Emotion-Lexicon.htm)
       ○ LIWC: over 70 categories, incl. emotions (https://liwc.wpengine.com)
       ○ VADER: Valence Aware Dictionary and sEntiment Reasoner (https://github.com/cjhutto/vaderSentiment)
     ● Critical questions
       ○ How do they work? What are they intended to measure? For what text genres?
       ○ How reliable are they? What do they capture well? What are typical mistakes they make?
     ● Lessons from 20+ years of NLP research:
       ○ Sentiment is domain-specific and nowadays aspect-based (reviews of hotels, restaurants and smartphones have their own vocabularies)
     ● ALWAYS combine quantitative with qualitative analysis!
       ○ They contextualise each other
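The basic mechanism behind lexicon-based tools is simple enough to sketch, which also exposes one of their typical mistakes. The four-word lexicon below is a deliberately tiny, hypothetical stand-in; real resources such as NRC EmoLex or VADER map thousands of entries to emotions or valence scores and add heuristics on top:

```python
# Deliberately tiny, hypothetical valence lexicon (illustration only).
lexicon = {"love": 1.0, "great": 0.8, "terrible": -0.9, "delayed": -0.4}

def sentiment(text):
    """Mean valence of the lexicon words found in `text` (0.0 if none match)."""
    tokens = text.lower().split()
    scores = [lexicon[t] for t in tokens if t in lexicon]
    return sum(scores) / len(scores) if scores else 0.0

print(sentiment("Great coffee, love it"))  # 0.9
# A typical lexicon mistake: negation is invisible to a pure word lookup.
print(sentiment("not great"))              # 0.8, despite the negative meaning
```

This is exactly the kind of behaviour the critical questions above are meant to surface: a naive lookup scores "not great" as positive, and domain vocabulary ("delayed" is bad for trains, neutral elsewhere) is baked into the lexicon's construction.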
  17. Questions About Social Media Sentiment Mining
     ● Another Jupyter notebook, this one dissecting sentiment analysis
       ○ Using social media data (tweets and online reviews)
       ○ With 9 questions to reflect on its details and output
     ● Break-out groups
       ○ Open the notebook and discuss the questions (take 20 mins.)
       ○ Afterwards we discuss your observations and your own questions
  18. Word Embedding Models
     ● Concepts:
       ○ N-grams, skipgrams, distributional semantics
       ○ Semantic vs. syntactic similarity (related to the size of the context window)
       ○ Generic vs. domain-specific models and text corpora
       ○ Pre-trained models, transfer learning
       ○ Corpus size
     ● See also (shameless self-promotion):
       ○ Wevers, M., & Koolen, M. (2020). Digital begriffsgeschichte: Tracing semantic change using word embeddings. Historical Methods: A Journal of Quantitative and Interdisciplinary History, 53(4), 226-243.
       ○ https://www.tandfonline.com/doi/pdf/10.1080/01615440.2020.1760157
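Whatever model produces them, embeddings are just vectors compared by cosine similarity. A minimal sketch with hypothetical, hand-made 3-dimensional vectors (real models such as word2vec or fastText learn hundreds of dimensions from a corpus, which is where corpus size and domain specificity come in):

```python
import math

# Hypothetical toy word vectors; real embeddings are learned, not hand-set.
vectors = {
    "king":  [0.9, 0.8, 0.1],
    "queen": [0.8, 0.9, 0.1],
    "apple": [0.1, 0.2, 0.9],
}

def cosine(u, v):
    """Cosine similarity: 1.0 for identical directions, 0.0 for orthogonal ones."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

print(cosine(vectors["king"], vectors["queen"]))  # close to 1: similar words
print(cosine(vectors["king"], vectors["apple"]))  # much lower: unrelated words
```

Tracing semantic change, as in the Wevers & Koolen article, amounts to comparing such similarities for the same word in models trained on corpora from different periods.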
  19. Machine Learning
     ● Finding patterns in data
       ○ But are they meaningful patterns?
       ○ Main point: separating the regular features (signal) from the ‘accidental features’ (noise) of a dataset
         ■ If I throw a 6-sided die 10 times, the average is probably close to 3.5 (regular/signal), but the particular sequence of faces is accidental (irregular/noise)
         ■ Many ‘regularities’ are artefacts introduced through selection (tweets from the last 24 hours may cover Sunday evening for one part of the world and Monday morning for another)
     ● Which regularities are relevant depends on your research question
       ○ But ML methods are oblivious to your research question and context
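The die example is easy to simulate. The sketch below uses 10,000 throws rather than 10 so the signal is unmistakable (the seed value is arbitrary, chosen only to make the run reproducible):

```python
import random

random.seed(42)  # arbitrary fixed seed so the sketch is reproducible

# Regularity (signal): the mean of many fair-die throws approaches 3.5.
# Accident (noise): the particular sequence of faces carries no pattern.
rolls = [random.randint(1, 6) for _ in range(10_000)]
mean = sum(rolls) / len(rolls)
print(round(mean, 2))  # close to 3.5
```

Run it with a different seed and the sequence of faces changes completely while the mean barely moves: that difference in stability is exactly the signal/noise distinction, and an ML method cannot tell you which of the two your research question cares about.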
  20. Tweet Corpora
     ● Existing corpora
       ○ Kaggle sentiment140: https://www.kaggle.com/kazanova/sentiment140
       ○ GESIS TweetsCOV19: https://data.gesis.org/tweetscov19/
       ○ GateNLP BTC: https://github.com/GateNLP/broad_twitter_corpus
       ○ Disaster Tweet Corpus 2020: https://zenodo.org/record/3713920
     ● How were they constructed?
       ○ Multiple layers of selection:
         ■ Twitter API
         ■ Collection methods and period, queries and cleaning/filtering
     ● For what purpose were they collected?
       ○ How has that shaped their construction?
  21. Tool Criticism Recommendations (From Journal Article)
     ● Analyze and discuss tools at the level of data transformations.
       ○ How do inputs and outputs differ?
       ○ What does this mean for interpreting the transformed data?
     ● Questions to ask about digital data:
       ○ Where do the data come from? Who made the data? Who made the data available? What selection criteria were used? How is it organized? What preprocessing steps were used to make the data available? If digitized from analogue sources, how does the digitized data differ from the analogue sources? Are all sources digitized or only selected materials? What are known omissions/gaps in the data?
     ● Questions about digital tools:
       ○ Which tools are available and relevant for your research? Which tool best fits the method you want to use? How does the tool fit the method you want to use? For which phase of your research is this tool suitable? What kind of tool is it? Who made the tool, when, why, and what for? How does the tool transform the data that it works upon? What are the potential consequences of this?
     ● Questions about digital search tools:
       ○ What search strategies does the tool allow? What feedback about matching and non-matching documents does the tool provide? What ways does the tool offer for sense-making and getting an overview of the data it gives access to?
     ● Questions about digital analysis tools:
       ○ What elements of the data does the tool allow you to analyze qualitatively or quantitatively? What ways of analyzing does the tool offer, and what ways to contextualize your analysis?
  22. References
     Bron, M., Van Gorp, J., & De Rijke, M. (2016). Media studies research in the data-driven age: How research questions evolve. Journal of the Association for Information Science and Technology, 67(7), 1535-1554.
     Churchman, C. W. (1967). Wicked problems. Management Science, 14(4), B141-B142.
     Fickers, A. (2012). Towards a new digital historicism? Doing history in the age of abundance. VIEW Journal of European Television History and Culture, 1(1), 19-26.
     Hoekstra, R., & Koolen, M. (2019). Data scopes for digital history research. Historical Methods: A Journal of Quantitative and Interdisciplinary History, 52(2), 79-94.
     Koolen, M., van Gorp, J., & van Ossenbruggen, J. (2018). Toward a model for digital tool criticism: Reflection as integrative practice. Digital Scholarship in the Humanities.
     Schön, D. (1983). The reflective practitioner: How professionals think in action. New York: Basic Books.
     Wevers, M., & Koolen, M. (2020). Digital begriffsgeschichte: Tracing semantic change using word embeddings. Historical Methods: A Journal of Quantitative and Interdisciplinary History, 53(4), 226-243.
