The document discusses interactive text mining and visualization tools for digital humanities research. It introduces the Interactive Text Mining Suite (ITMS), a web application created with R and Shiny that allows users to import texts, extract metadata, perform preprocessing, and generate dynamic visualizations. The document also presents a case study using ITMS to analyze a medieval Occitan corpus, demonstrating visualizations for part-of-speech comparisons, keyword analysis, stylistic similarities, document clustering, and topic modeling. The goal of ITMS is to provide a user-friendly interface for both close and distant reading through interactive statistical analysis and visualization of literary texts.
5. Introduction
Visualization
Methods
ITMS
Medieval
Corpus
Conclusion
Digital Humanities Manifesto 2.0 (2009) and Berry (2011)
1st Wave: “The first wave of digital humanities work was quantitative, mobilizing the search and retrieval powers of the database, automating corpus linguistics, stacking hypercards into critical arrays”
2nd Wave: “The second wave is qualitative, interpretive”, concentrating on new tools for creating and curating digital repositories (Berry, 2011)
3rd Wave: A focus on computationality: search, retrieval, and analysis originating in humanities-based work
Case Study - Medieval Occitan
Occitan (Provençal) constitutes an important element of the literary, linguistic, and cultural heritage in the history of Romance languages
Interactive online database and linguistically annotated corpus (Scrivner et al., 2014)
http://www.oldoccitancorpus.org
Conclusion
1 There is a need for text mining tools designed for linguists and literary scholars
2 Interactive, user-friendly applications bridge the gap between data mining and digital humanities
3 The Shiny framework can be incorporated into any digital corpus to exhibit, search, or visualize written collections