Visualising Dante: The data behind the Divine Comedy

  1. Data Stories, “Visualising Dante - The data behind the Divine Comedy”, Ginestra Ferraro, King’s Digital Lab, London, 27 November 2020. Visualising Dante: The data behind the Divine Comedy. Data Stories Symposium, 26/27 November 2020. Ginestra Ferraro, Senior Research Software UI/UX Designer, @ginez_17, ginestra.ferraro@kcl.ac.uk
  2. King’s Digital Lab (KDL) ● Research Software Engineering Lab ● Sits in the Faculty of Arts & Humanities ● 13+ permanent staff members ● Established in 2015 ● Solid SDLC and Agile processes
  3. Context (and disclaimers). This work is the result of a final MSc Computer Science project, built* using the 10% personal development time offered by KDL, plus evenings and weekends, over two months in the summer of 2018. The project receives irregular updates. It is a proof of concept, with a vision to create a reusable tool that generates semi-automated data visualisations based on text analysis. The code is available in a GitHub repository for anyone to play with (under the MIT license): https://github.com/ginestra/dante-visualised The project is published at https://ginestra.github.io/dante-visualised/
     * Built, but by no means completed.
  4. Who is Dante Alighieri and why the Divine Comedy? Dante Alighieri was a 13th-century Italian poet, credited with clearing the path for using and innovating vernacular language instead of Latin in Italian poetry, making it accessible to a larger audience. The Divine Comedy is a long allegorical narrative poem written circa 1308–20. It is widely considered to be one of the greatest works of world literature. [1] The narrative traces Dante’s journey from darkness and error to the revelation of the divine light, culminating in the Beatific Vision of God. [2] The poem is divided into three sections: Inferno, Purgatorio, Paradiso. Dante’s Divine Comedy makes an interesting case study because its structural (spatial and temporal) textual components lend themselves to graphical representation and offer insights into its original linguistic content.
     [1] https://en.wikipedia.org/wiki/Divine_Comedy
     [2] https://www.britannica.com/topic/The-Divine-Comedy
  5. Objectives. Visually represent the sentiment analysis and allow comparisons between the three sections. Specific to the Italian version (Petrocchi 1966-67), render a schematic representation of the poem’s structure and rhythm: the work is written in terza rima, a sequence of tercets with alternating rhymes in which every line is formed by a fixed number of syllables (11), and the three sections contain 34, 33 and 33 Cantos respectively (a Canto is a form of division in medieval long poetry). Show a particularly interesting and uneven distribution of keywords.
  6. The story, the data, the sentiment. Scores are in the range (-1, +1).
     Inferno: Bad, very bad. “Violent death and painful wounds may be” [3] Score: -0.89
     Purgatorio: It was bad, it will be okay. “You will not reach the peak before you see” [4] Score: +0.02
     Paradiso: Good. Actually, great! “That he loves well and hopes well and has faith” [5] Score: +0.91
     [3] Inferno, Canto XI, line 34
     [4] Purgatorio, Canto VI, line 55
     [5] Paradiso, Canto XXIV, line 40
  7. The story, the data, the sentiment. Sentiment analysis visualisation of the three cantiche: red is negative, blue is positive, and the opacity indicates how close the sentiment is to either pole (-1 or +1). One square per line. ginestra.github.io/dante-visualised/sentiment-pattern/
  8. The story, the data, the keywords. Distribution of keywords. Fun fact: Dante closes each Cantica with the word stelle (stars), and never uses the word Cristo (Christ) in the Inferno, whilst it is often present in the Paradiso. ginestra.github.io/dante-visualised/repetitions-pattern/
  9. The story, the data, the textual structure. Rhyme prediction *
     Nel mezzo del cammin di nostra vita
     mi ritrovai per una selva oscura
     ché la diritta via era smarrita.
     Ahi quanto a dir qual era è cosa dura
     esta selva selvaggia e aspra e forte
     che nel pensier rinova la paura!
     Tant’ è amara che poco è più morte;
     ma per trattar del ben ch’i’ vi trovai,
     dirò de l’altre cose ch’i’ v’ho scorte.
     (1st, 2nd and 3rd tercets, Inferno, Canto I)
     * The accuracy of the rhyme prediction depends on the textual structure analysed. In the case of terza rima, it works except for the first and last lines of each Canto.
  10. The story, the data, the textual structure. Rhyme rhythm **
     Nel mezzo del cammin di nostra vita
     mi ritrovai per una selva oscura
     ché la diritta via era smarrita.
     Ahi quanto a dir qual era è cosa dura
     esta selva selvaggia e aspra e forte
     che nel pensier rinova la paura!
     Tant’ è amara che poco è più morte;
     ma per trattar del ben ch’i’ vi trovai,
     dirò de l’altre cose ch’i’ v’ho scorte.
     (1st, 2nd and 3rd tercets, Inferno, Canto I)
     ** Line lengths, rhyme lengths and matching rhymes are all counted in characters.
  11. The story, the data, the textual structure. Rhyme rhythm. ginestra.github.io/dante-visualised/rhymes/inferno/
  12. Data model and future plans. The project’s main success lies in its modular development, which makes it amenable to further development. More languages and different text structures will be integrated and a wider range of output visualisations offered, while making use of the same core functionalities for ingesting and processing data. Figure: the data model of the application, illustrating the separation of concerns and the potential for extensibility.
  13. Thank you. Ginestra Ferraro, Senior Research Software UI/UX Designer, @ginez_17, ginestra.ferraro@kcl.ac.uk
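
The colour encoding described on slide 7 (red for negative, blue for positive, opacity growing towards -1 or +1) could be sketched roughly as follows. This is an illustration, not the project's code: the function names `sentiment_to_rgba` and `mean_by_cantica`, the pure red and pure blue RGB values, and the choice to use the absolute score directly as opacity are all assumptions. The scores are assumed to have been computed already, for example as per-line sentiment values in the range -1 to +1. The second function sketches the per-Cantica mean that the speaker notes list as a possible improvement.

```python
from collections import defaultdict
from statistics import mean


def sentiment_to_rgba(score: float) -> str:
    """Map a sentiment score in [-1, +1] to a CSS rgba() colour string.

    Negative scores are red, non-negative scores are blue, and the opacity
    is the absolute value of the score. The RGB values are assumptions.
    """
    score = max(-1.0, min(1.0, score))  # clamp out-of-range input
    red, green, blue = (255, 0, 0) if score < 0 else (0, 0, 255)
    return f"rgba({red}, {green}, {blue}, {abs(score):.2f})"


def mean_by_cantica(scored_lines):
    """Average the per-line scores of each cantica.

    `scored_lines` is an iterable of (cantica_name, score) pairs.
    """
    grouped = defaultdict(list)
    for cantica, score in scored_lines:
        grouped[cantica].append(score)
    return {cantica: mean(scores) for cantica, scores in grouped.items()}


# The three example scores quoted on slide 6, one line per cantica.
examples = [("Inferno", -0.89), ("Purgatorio", 0.02), ("Paradiso", 0.91)]
for cantica, score in examples:
    print(cantica, sentiment_to_rgba(score))
```

With this mapping a score of exactly 0 is fully transparent, so the choice of hue for neutral lines has no visible effect.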

Editor's Notes

  • Who I am and why I’m here
    UI/UX Designer, working at King’s since 2013, previously in the Department of Digital Humanities, at KDL since 2015.
    My main interests are data visualizations, immersive experience and accessibility. My work in KDL ranges from user research to applied user interface design and development.
  • KDL was established in 2015 by staff previously embedded within the Department of Digital Humanities. It moved to an RSE Lab model, with solid research processes aligned with those of Research & Development units in both academia and the commercial world.
    Its collaborative projects range from History and Classics to Augmented Reality and Immersive Experience.
  • hendecasyllable
  • Example verse for each Cantica.
    The sentiment analysis was performed with VADER in NLTK. Returned values range from -1 to +1.
    Although the analysis is by line and not by tercet, the meanings of the individual words add up to the expected sentiment for each Cantica.
    One obvious improvement would be to output the mean value per Cantica, rather than relying on colour alone. This is on the list of new features.
  • The interaction reveals contextual information such as the Cantica, the Canto number, the line number and the text of the line.
    It can be improved by displaying each occurrence in context, making its location clearer and showing the uneven distribution.
  • The accuracy of the rhyme prediction depends on the textual structure analysed. In the case of terza rima, it works except for the first and last lines of each Canto.
    The hendecasyllable gives a particular rhythm to the reading.
  • Each line is represented by counting its characters, and the length of the rhyme is the number of matching letters counted from the end of the line. The number of syllables is the same for every line: 11. In Italian, there are exceptions to how syllables are counted, depending on how they relate to the following word.
  • The main success lies in its modular development, making it amenable to further development (algorithm refinements, visualisation workflows, stylometric analysis).
    More languages and different text structures will be integrated and a wider range of output visualisations offered, while making use of the same core functionalities for ingesting and processing data.
  • Questions?
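
The rhyme notes above describe two ideas: predicting which lines rhyme from the terza rima structure, and measuring a rhyme by the number of matching letters counted from the end. A minimal sketch of both follows. It is not taken from the project's repository: the helper names `last_word`, `rhyme_length` and `terza_rima_labels` are hypothetical, and the labelling rule simply assumes the standard interlocking ABA BCB CDC pattern.

```python
import re


def last_word(line: str) -> str:
    """Return the final word of a line, lower-cased, without punctuation."""
    words = re.findall(r"[^\W\d_]+", line.lower())
    return words[-1] if words else ""


def rhyme_length(a: str, b: str) -> int:
    """Count the matching letters of two words, starting from the end."""
    count = 0
    for x, y in zip(reversed(a), reversed(b)):
        if x != y:
            break
        count += 1
    return count


def terza_rima_labels(n_lines: int) -> list:
    """Predict a rhyme-group number per line, assuming ABA BCB CDC ...

    The middle line of each tercet opens a new group; the outer lines
    share the group opened by the middle line of the previous tercet.
    """
    return [i // 3 + 1 if i % 3 == 1 else i // 3 for i in range(n_lines)]


# First three tercets of Inferno, Canto I, as quoted on slides 9 and 10.
tercets = [
    "Nel mezzo del cammin di nostra vita",
    "mi ritrovai per una selva oscura",
    "ché la diritta via era smarrita.",
    "Ahi quanto a dir qual era è cosa dura",
    "esta selva selvaggia e aspra e forte",
    "che nel pensier rinova la paura!",
    "Tant’ è amara che poco è più morte;",
    "ma per trattar del ben ch’i’ vi trovai,",
    "dirò de l’altre cose ch’i’ v’ho scorte.",
]

labels = terza_rima_labels(len(tercets))
endings = [last_word(line) for line in tercets]
for label, word in zip(labels, endings):
    print(label, word)
```

Lines that receive the same label are predicted to rhyme, and `rhyme_length` can then be applied to their final words to measure the rhyme in characters. Counting shared trailing letters is a purely orthographic approximation and does not take stress or pronunciation into account.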