
Bibliotheca Digitalis Summer school: Visualisation in Digital Humanities for Understanding, Cleaning, and Explaining - Jean-Daniel Fekete

Bibliotheca Digitalis. Reconstitution of Early Modern Cultural Networks. From Primary Source to Data.
DARIAH / Biblissima Summer School, 4-8 July 2017, Le Mans, France.
5th and last day, July 8th – Digital representation and data accuracy for Humanities.

Visualisation in Digital Humanities for Understanding, Cleaning, and Explaining.
Jean-Daniel Fekete – Research Scientist, INRIA.
Abstract: https://bvh.hypotheses.org/3330#conf-JDFekete

Published in: Data & Analytics

  1. Bibliotheca Digitalis: Reconstitution of Early Modern Cultural Networks. From Primary Source to Data.
     DARIAH / Biblissima Summer School, Le Mans, 4-8 July 2017
     Visualisation in Digital Humanities for Understanding, Cleaning, and Explaining
     5th and last day, July 8th – Digital representation and data accuracy for Humanities
     Jean-Daniel Fekete, Research Scientist, INRIA
  2. Visualisation in Digital Humanities for Understanding, Cleaning, and Explaining
     Jean-Daniel Fekete, INRIA, http://www.aviz.fr/~fekete
     Visualization?
     • Visualization is any technique for creating images, diagrams, or animations to communicate a message [Wikipedia, Visualization, May 2016]
     • Information visualization is the study of (interactive) visual representations of abstract data to reinforce human cognition [Card, S., Mackinlay, J., and Shneiderman, B., Readings in Information Visualization, 1999]
     July 8th 2017 Summer School Le Mans
  3. Visualization and Visual Perception
     • Visualization is grounded in the visual and cognitive capabilities of humans
       – Inferring from visual forms
     • Relies on visual capabilities of the human eye and brain
       – Preattentive processing
       – Ready… is there a red circle in the next slide?
     Preattentive Processing [image-only slide]
  4. Preattentive Processing [image-only slide]
     Preattentive Processing
     • Preattentive processing
       – 200 ms response time (at a glance)
       – Effortless
       – Reliable estimates
     • Many visual features can be perceived preattentively:
       – Orientation of line/block, length, width, size, curvature, cardinality, etc.
     • Problems:
       – Preattentive features interfere with each other
         • Except one
       – Preattentive features have limitations
         • 7 colors max (Healey, 96)
         • 2 or 3 shapes
  5. Preattentive Processing [image-only slide]
     Where Does Visualization Stand?
     [Diagram] Nodes: Theory / Law; Model; Descriptive statistics; Facts / Measurements
     [Diagram] Edge labels: Support xor Contradict; Induces?; Fits; Describes
  6. Example: Raw Data from Anscombe's Quartet [Source: Anscombe's quartet, Wikipedia]

            I             II            III            IV
         x      y      x      y      x      y      x      y
       10.0   8.04   10.0   9.14   10.0   7.46    8.0   6.58
        8.0   6.95    8.0   8.14    8.0   6.77    8.0   5.76
       13.0   7.58   13.0   8.74   13.0  12.74    8.0   7.71
        9.0   8.81    9.0   8.77    9.0   7.11    8.0   8.84
       11.0   8.33   11.0   9.26   11.0   7.81    8.0   8.47
       14.0   9.96   14.0   8.10   14.0   8.84    8.0   7.04
        6.0   7.24    6.0   6.13    6.0   6.08    8.0   5.25
        4.0   4.26    4.0   3.10    4.0   5.39   19.0  12.50
       12.0  10.84   12.0   9.13   12.0   8.15    8.0   5.56
        7.0   4.82    7.0   7.26    7.0   6.42    8.0   7.91
        5.0   5.68    5.0   4.74    5.0   5.73    8.0   6.89

     Statistical Analysis (same data as above)
       Mean of x                     9.0
       Variance of x                 11.0
       Mean of y                     7.5
       Variance of y                 4.12
       Correlation between x and y   0.816
       Linear regression line        y = 3 + 0.5x
     For all four datasets (I to IV), the main descriptive statistics are identical [Source: Anscombe's quartet, Wikipedia]
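The claim that all four datasets share the same descriptive statistics can be checked directly. The following is a minimal sketch using only the Python standard library; the names `QUARTET` and `summarize` are illustrative and not part of the talk. Small differences in the third decimal are expected, since the slide reports rounded values.

```python
from math import sqrt

# Anscombe's quartet, copied from the table on the slide.
X_SHARED = [10.0, 8.0, 13.0, 9.0, 11.0, 14.0, 6.0, 4.0, 12.0, 7.0, 5.0]
QUARTET = {
    "I": (X_SHARED,
          [8.04, 6.95, 7.58, 8.81, 8.33, 9.96, 7.24, 4.26, 10.84, 4.82, 5.68]),
    "II": (X_SHARED,
           [9.14, 8.14, 8.74, 8.77, 9.26, 8.10, 6.13, 3.10, 9.13, 7.26, 4.74]),
    "III": (X_SHARED,
            [7.46, 6.77, 12.74, 7.11, 7.81, 8.84, 6.08, 5.39, 8.15, 6.42, 5.73]),
    "IV": ([8.0] * 7 + [19.0] + [8.0] * 3,
           [6.58, 5.76, 7.71, 8.84, 8.47, 7.04, 5.25, 12.50, 5.56, 7.91, 6.89]),
}


def summarize(xs, ys):
    """Return the descriptive statistics listed on the slide for one dataset."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx  # least-squares slope of y on x
    return {
        "mean_x": mean_x,
        "var_x": sxx / (n - 1),  # sample variance
        "mean_y": mean_y,
        "var_y": syy / (n - 1),
        "corr": sxy / sqrt(sxx * syy),  # Pearson correlation
        "slope": slope,
        "intercept": mean_y - slope * mean_x,
    }


for name, (xs, ys) in QUARTET.items():
    stats = summarize(xs, ys)
    print(name, {key: round(value, 3) for key, value in stats.items()})
```

The printed summaries agree to about two decimal places, which is exactly why the numbers alone cannot distinguish the four datasets and a plot is needed.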
  7. Visual Representation of the Data
     Visual representation reveals a different story [Source: Anscombe's quartet, Wikipedia]
     [Figure: plots of the four datasets I, II, III, IV]
     Same Stats, Different Graphs: Generating Datasets with Varied Appearance and Identical Statistics through Simulated Annealing [CHI17]
     https://www.autodeskresearch.com/publications/samestats
  8. Where Does Visualization Stand?
     [Diagram] Nodes: Theory / Law; Model; Visualization; Descriptive Statistics; Facts / Measurements
     [Diagram] Edge labels: Support xor Contradict; Induces?; Fits; Describes
     Four Scales
     • Most DH projects rely on the concept of collections of documents or artifacts
     • Visualization can be effective to make sense of these collections
       – But there is no “one size fits all”
     • I will present visualizations to manage the four scales
     • With queries, smaller scales can be extracted from larger scales
  9. Scale Matters!
     • 10^0 – 10^3: Small corpus (Master’s thesis / PhD)
     • 10^3 – 10^6: Collaborative project
     • 10^6 – 10^9: Institutional project (BnF, LoC) or portal
     • > 10^9: Large scale
       – Europeana, Google
     Powers of Ten™ (1977): https://www.youtube.com/watch?v=0fKBhvDjuy0
     10^0 – 10^3: Small Corpus
     • A myriad of visualizations are available for small corpora
       – Text, network, genealogy, manuscripts, maps, etc.
     • Using these visualizations to explore small corpora ALWAYS reveals interesting, unexpected information
     • On Web sites dedicated to small corpora, visualization will help navigate and understand the scope of the corpus
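As a small illustration of the four scale bands, the sketch below maps a collection size to the band named on the "Scale Matters!" slide. The function name `scale_band` and the treatment of the boundary values (an upper bound belongs to the smaller band) are assumptions; the slide does not say where exactly 10^3, 10^6, or 10^9 items fall.

```python
def scale_band(n_items):
    """Map a collection size to one of the four scales of the talk.

    Assumption: each upper bound is inclusive, e.g. exactly 1000 items
    still counts as a small corpus.
    """
    if n_items < 1:
        raise ValueError("a collection needs at least one item")
    if n_items <= 10**3:
        return "small corpus"
    if n_items <= 10**6:
        return "collaborative project"
    if n_items <= 10**9:
        return "institutional project"
    return "large scale"


# Hypothetical collection sizes, for illustration only.
for size in (1, 1_000, 250_000, 53_000_000, 5_000_000_000):
    print(f"{size:>13,d} items -> {scale_band(size)}")
```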
  10. 10^0: One document
     • N. McCurdy, J. Lein, K. Coles, M. Meyer. Poemage: Visualizing the Sonic Topology of a Poem. IEEE Transactions on Visualization and Computer Graphics (Proceedings of InfoVis 2015), pages 439-448, January 2016
     http://www.sci.utah.edu/~nmccurdy/Poemage/
     https://vimeo.com/136205958
     http://xkcd.com/657/
     10^0: One document
     http://vis.cs.ucdavis.edu/~tanahashi/storylines/
  11. 10^0 – 10^3: Small(ish) Networks
     http://vistorian.net/
     10^0 – 10^3: Small Corpus
     N. Dufournaud Thesis, ~1000 documents
     http://nicole.dufournaud.org/
  12. Genealogical trees [image-only slide]
     Transfer of the Land « La Fruglaye » [image-only slide]
  13. Migration Map [image-only slide]
     Space & Time: GeoTime [link]
  14. 10^0 – 10^3: Archeological Collection
     Create a spreadsheet
     • 1 line per object found
     • 1 column per feature
     • 1 black dot at the intersection when an object has a feature
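The spreadsheet recipe above amounts to building a binary object-by-feature matrix. A minimal sketch follows; the find names and features in `FINDS` are invented for illustration, and `incidence_matrix` and `render` are hypothetical helper names, not functions of any tool mentioned in the talk.

```python
def incidence_matrix(finds):
    """Return (features, rows): one row per object, one boolean per feature."""
    features = sorted({feature for feats in finds.values() for feature in feats})
    rows = [(name, [feature in feats for feature in features])
            for name, feats in finds.items()]
    return features, rows


def render(rows):
    """Draw one text line per object, with a dot where the feature is present."""
    width = max(len(name) for name, _ in rows)
    lines = []
    for name, cells in rows:
        dots = " ".join("●" if present else "·" for present in cells)
        lines.append(f"{name.ljust(width)}  {dots}")
    return "\n".join(lines)


# Hypothetical excavation finds: object name -> set of observed features.
FINDS = {
    "urn-01": {"ceramic", "decorated"},
    "blade-07": {"metal"},
    "bowl-12": {"ceramic"},
}

features, rows = incidence_matrix(FINDS)
print("columns:", ", ".join(features))
print(render(rows))
```

Once the data is in this form, rows and columns can be reordered by hand or by a tool so that similar objects end up next to each other.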
  15. 10^0 – 10^3: Bertifier
     • Play with our tool online
     http://www.aviz.fr/bertifier
     https://www.youtube.com/watch?v=tJxAF_a_yBQ
     Visualizing an XML Corpus: Compus
     • Transform the following XML document:
         0         1         2         3         4
         012345678901234567890123456789012345678901234567
         <A>abcd<B>efgh</B><C>ijkl<D>mnop</D></C>qrst</A>
     • into a set of intervals: A=[0,48[, B=[7,18[, C=[18,40[, D=[25,36[
     • One color is given to each element
     • Only XML elements are visualized
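The interval transformation on the Compus slide can be reproduced with a tag scanner and a stack. This is a sketch of the idea only, not Compus's actual implementation: it assumes a well-formed document, measures offsets in characters, and ignores comments, CDATA sections, and processing instructions. The name `element_intervals` is hypothetical.

```python
import re

# Matches an opening, closing, or self-closing tag and captures its name.
TAG = re.compile(r"<(/?)([A-Za-z][\w.-]*)[^<>]*?(/?)>")


def element_intervals(xml_text):
    """Return (name, start, end) per element, as half-open intervals [start, end[.

    Each interval runs from the '<' of the opening tag to just past the '>'
    of the matching closing tag, as in the slide's example.
    """
    stack = []      # (name, start offset) of currently open elements
    intervals = []
    for match in TAG.finditer(xml_text):
        closing, name, self_closing = match.groups()
        if closing:
            open_name, start = stack.pop()
            if open_name != name:
                raise ValueError(f"mismatched tag: <{open_name}> closed by </{name}>")
            intervals.append((name, start, match.end()))
        elif self_closing:
            intervals.append((name, match.start(), match.end()))
        else:
            stack.append((name, match.start()))
    if stack:
        raise ValueError("unclosed element(s): " + ", ".join(n for n, _ in stack))
    return sorted(intervals, key=lambda item: item[1])


DOCUMENT = "<A>abcd<B>efgh</B><C>ijkl<D>mnop</D></C>qrst</A>"
for name, start, end in element_intervals(DOCUMENT):
    print(f"{name}=[{start},{end}[")
```

On the slide's document this yields the same four intervals as listed above; each interval can then be drawn as a colored span along the text.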
  16. 10^0 – 10^3: Diffamation
     (Chevalier et al. CHI 2010, http://www.aviz.fr/diffamation/)
  17. 10^0 – 10^3: Multidimensional Data [image-only slides]
  18. 10^0 – 10^3: Small Corpus
     http://multiviz.gforge.inria.fr/scatterdice/oscars/
     10^0 – 10^3: Small Corpus
     • A myriad of visualizations are available for small corpora
       – Text, network, genealogy, manuscripts, maps, etc.
     • Using these visualizations to explore small corpora ALWAYS reveals interesting, unexpected information
     • On Web sites dedicated to small corpora, visualization will help navigate and understand the scope of the corpus
  19. 10^3 – 10^6: Library/Coll. Project
     • Too many items to show each of them in detail
     • Still need to provide guidance to users
     • Many tools exist, but entering data becomes technical
     10^3 – 10^6: Jigsaw [image-only slide]
  20. 10^3 – 10^6: Parallel Tag Clouds
     Parallel Tag Clouds to Explore Faceted Text Corpora (Collins et al., VAST 2009)
     http://vialab.science.uoit.ca/portfolio/parallel-tag-clouds-to-explore-faceted-text-corpora
  21. De-duplication
     D-Dupe: An Interactive Tool for Entity Resolution in Social Networks (Mustafa Bilgic, Louis Licamele, Lise Getoor, Ben Shneiderman), in Visual Analytics Science and Technology (VAST), 2006.
     • Resolving named entities using the relation network
     10^3 – 10^6: Genealogies [image-only slide]
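To make the idea of resolving entities through their relation network concrete, here is a deliberately simple sketch. It is not D-Dupe's algorithm: it just averages a string similarity (Python's `difflib.SequenceMatcher`) with the Jaccard overlap of each node's neighbors. The names in `EDGES`, the `threshold` and `weight` values, and all function names are assumptions made for illustration.

```python
from difflib import SequenceMatcher
from itertools import combinations


def build_graph(edges):
    """Build an undirected adjacency map {node: set of neighbors}."""
    graph = {}
    for a, b in edges:
        graph.setdefault(a, set()).add(b)
        graph.setdefault(b, set()).add(a)
    return graph


def name_similarity(a, b):
    """Case-insensitive string similarity in [0, 1]."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def neighbor_similarity(graph, a, b):
    """Jaccard overlap of the two nodes' neighbors, ignoring each other."""
    neighbors_a = graph[a] - {b}
    neighbors_b = graph[b] - {a}
    union = neighbors_a | neighbors_b
    return len(neighbors_a & neighbors_b) / len(union) if union else 0.0


def candidate_duplicates(graph, threshold=0.6, weight=0.5):
    """Rank node pairs by a weighted mix of name and neighborhood similarity."""
    candidates = []
    for a, b in combinations(sorted(graph), 2):
        score = (weight * name_similarity(a, b)
                 + (1 - weight) * neighbor_similarity(graph, a, b))
        if score >= threshold:
            candidates.append((a, b, score))
    return sorted(candidates, key=lambda item: item[2], reverse=True)


# Hypothetical co-occurrence edges, e.g. people named in the same document.
EDGES = [
    ("J. Smith", "A. Jones"), ("J. Smith", "B. Lee"),
    ("John Smith", "A. Jones"), ("John Smith", "B. Lee"),
    ("A. Jones", "C. Wu"),
]

for a, b, score in candidate_duplicates(build_graph(EDGES)):
    print(f"{a!r} ~ {b!r}: {score:.2f}")
```

A ranked list like this is only a starting point: the point of an interactive tool is that a person inspects each candidate pair in its network context before merging.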
  22. [Two image-only slides]
  23. [Image-only slide]
     10^6 – 10^9: Institutional project
     • Only aggregated information can be presented
     • Faceted browsing / search very useful!
       – Use it!
     • e.g. Europeana: 53 × 10^6 items
  24. 10^6 – 10^9: Institutional project (HAL)
     http://traces1.saclay.inria.fr/inria/
     10^6 – 10^9: EU Project Cendari [image-only slide]
  25. 10^6 – 10^9: EU Project Cendari
     https://notes.cendari.dariah.eu/
     10^6 – 10^9: Institutional project
     • Only aggregated information can be presented
     • Faceted browsing / search very useful!
       – Use it!
     • e.g. Europeana: 53 × 10^6 items
     • Problem: metadata quality and semantics
     • What is the date of a book?
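Faceted browsing, as recommended above, boils down to two operations: counting how many records fall under each value of a facet, and narrowing the collection by selecting a value. The sketch below shows both on a handful of invented records; the field names, the `"unknown"` bucket, and the function names `facet_counts` and `refine` are assumptions, not the API of Europeana or any other portal.

```python
from collections import Counter


def facet_counts(records, facets):
    """Count records per facet value; missing metadata goes to 'unknown'."""
    counts = {facet: Counter() for facet in facets}
    for record in records:
        for facet in facets:
            counts[facet][record.get(facet, "unknown")] += 1
    return counts


def refine(records, **selection):
    """Keep only the records matching every selected facet value."""
    return [record for record in records
            if all(record.get(facet) == value for facet, value in selection.items())]


# Hypothetical catalogue records; the third one has no date at all.
RECORDS = [
    {"language": "Latin", "place": "Paris", "date": "1532"},
    {"language": "French", "place": "Lyon", "date": "1549"},
    {"language": "Latin", "place": "Lyon"},
    {"language": "French", "place": "Paris", "date": "1532"},
]

for facet, counter in facet_counts(RECORDS, ["language", "place", "date"]).items():
    print(facet, dict(counter))
print(refine(RECORDS, place="Lyon", language="Latin"))
```

The `"unknown"` bucket makes the metadata-quality problem visible: a record without a usable date still has to be counted somewhere, and a single `date` string cannot say whether it refers to printing, writing, or acquisition.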
  26. > 10^9: World Scale
     • Few providers
       – Google
       – Photo collections (Flickr)
       – Astronomical databases
     • The cost of computing facets is too high for interactive time responses
     • No good general solution
     > 10^9: Internet Backbone
     • Where are you?
     • Who cares?
  27. > 10^9: Query Previews
     • Query over very large data about the Earth
     http://www.cs.umd.edu/hcil/eosdis/
     Conclusion
     • Larger collections are harder to manage
       – Big data problem
     • A large collection can always be queried to extract a smaller collection
       – This scales down the results and increases the number of usable techniques
     • Still, current technologies are limited for DH
       – No management of uncertainty
       – No reasonable model of old geographical concepts
       – No good model of time and date
     • Still, use the tools and ask for improvements!
  28. References
     • Jacques Bertin. Semiology of Graphics: Diagrams, Networks, Maps. ESRI Press, Nov. 2010. ISBN 9781589482616
     • Edward Tufte. The Visual Display of Quantitative Information. Cheshire, CT: Graphics Press, 2010. ISBN 0-9613921-4-2
     • Tamara Munzner. Visualization Analysis and Design. A K Peters Visualization Series, CRC Press, 2014. ISBN 9781466508910
     • Alberto Cairo. The Truthful Art: Data, Charts, and Maps for Communication. New Riders, 2016. ISBN 0321934075
     • Tableau for Students: https://www.tableau.com/academic/students
     • Stefan Jänicke, Greta Franzini, Muhammad Faisal Cheema, Gerik Scheuermann. On Close and Distant Reading in Digital Humanities: A Survey and Future Challenges. Eurographics Conference on Visualization (EuroVis) – STARs, 2015. http://dx.doi.org/10.2312/eurovisstar.20151113
