Bibliotheca Digitalis. Reconstitution of Early Modern Cultural Networks. From Primary Source to Data.
DARIAH / Biblissima Summer School, 4-8 July 2017, Le Mans, France.
5th and last day, July 8th – Digital representation and data accuracy for Humanities.
Visualisation in Digital Humanities for Understanding, Cleaning, and Explaining.
Jean-Daniel Fekete – Research Scientist, INRIA.
Abstract: https://bvh.hypotheses.org/3330#conf-JDFekete
Visualisation in Digital Humanities for
Understanding, Cleaning, and Explaining
Jean-Daniel Fekete
INRIA
http://www.aviz.fr/~fekete
Visualization?
Visualization is any technique for creating
images, diagrams, or animations to
communicate a message
[Wikipedia, Visualization, May 2016]
Information visualization is the study of
(interactive) visual representations of abstract
data to reinforce human cognition
[Card, S. and Mackinlay, J. and Shneiderman B., Readings in Information Visualization, 1999]
July 8th 2017 Summer School Le Mans
Visualization and Visual Perception
• Visualization is grounded in the visual and
cognitive capabilities of humans
– Inferring from visual forms
• Relies on visual capabilities of the human eye
and brain
– Preattentive processing
– Ready…is there a red circle in the next slide?
Preattentive Processing
• Preattentive processing
– 200 ms response time (at a glance)
– Effortless
– Reliable estimates
• Many visual features can be perceived preattentively:
– Orientation of line/block, length, width, size, curvature, cardinality, etc.
• Problems:
– Preattentive features interfere with each other
• Except one
– Preattentive features have limitations
• 7 colors max (Healey, 96)
• 2 or 3 shapes
Where Does Visualization Stand?
[Diagram: Theory / Law, Model, Descriptive statistics, and Facts / Measurements, linked by the relations "Support xor Contradict", "Induces?", "Fits", and "Describes"]
Example
     I             II            III            IV
   x      y      x      y      x      y      x      y
10.0   8.04   10.0   9.14   10.0   7.46    8.0   6.58
 8.0   6.95    8.0   8.14    8.0   6.77    8.0   5.76
13.0   7.58   13.0   8.74   13.0  12.74    8.0   7.71
 9.0   8.81    9.0   8.77    9.0   7.11    8.0   8.84
11.0   8.33   11.0   9.26   11.0   7.81    8.0   8.47
14.0   9.96   14.0   8.10   14.0   8.84    8.0   7.04
 6.0   7.24    6.0   6.13    6.0   6.08    8.0   5.25
 4.0   4.26    4.0   3.10    4.0   5.39   19.0  12.50
12.0  10.84   12.0   9.13   12.0   8.15    8.0   5.56
 7.0   4.82    7.0   7.26    7.0   6.42    8.0   7.91
 5.0   5.68    5.0   4.74    5.0   5.73    8.0   6.89
Raw Data from Anscombe’s Quartet
[Source: Anscombe's quartet, Wikipedia]
Statistical Analysis
Mean of x: 9.0
Variance of x: 11.0
Mean of y: 7.5
Variance of y: 4.12
Correlation between x and y: 0.816
Linear regression line: y = 3 + 0.5x
For all four datasets, the main descriptive statistics are identical
[Source: Anscombe's quartet, Wikipedia]
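The claim that the four datasets share the same descriptive statistics can be checked directly. The following is a minimal sketch in plain Python, using the values transcribed from the table; the helper name `summary` and the dictionary layout are illustrative choices, not part of the talk. Variances are sample variances (divided by n - 1).

```python
from math import sqrt

# Anscombe's quartet, transcribed from the raw data table.
X123 = [10.0, 8.0, 13.0, 9.0, 11.0, 14.0, 6.0, 4.0, 12.0, 7.0, 5.0]
QUARTET = {
    "I": (X123, [8.04, 6.95, 7.58, 8.81, 8.33, 9.96, 7.24, 4.26, 10.84, 4.82, 5.68]),
    "II": (X123, [9.14, 8.14, 8.74, 8.77, 9.26, 8.10, 6.13, 3.10, 9.13, 7.26, 4.74]),
    "III": (X123, [7.46, 6.77, 12.74, 7.11, 7.81, 8.84, 6.08, 5.39, 8.15, 6.42, 5.73]),
    "IV": ([8.0] * 7 + [19.0] + [8.0] * 3,
           [6.58, 5.76, 7.71, 8.84, 8.47, 7.04, 5.25, 12.50, 5.56, 7.91, 6.89]),
}


def summary(xs, ys):
    """Return the descriptive statistics listed on the slide for one dataset."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx  # least-squares regression of y on x
    return {
        "mean_x": mean_x,
        "var_x": sxx / (n - 1),
        "mean_y": mean_y,
        "var_y": syy / (n - 1),
        "corr": sxy / sqrt(sxx * syy),
        "slope": slope,
        "intercept": mean_y - slope * mean_x,
    }


STATS = {name: summary(xs, ys) for name, (xs, ys) in QUARTET.items()}

for name, s in STATS.items():
    print(name, {key: round(value, 3) for key, value in s.items()})
```

The four printed rows agree to two or three significant digits, which is exactly why the numbers alone cannot distinguish the datasets and a plot is needed.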
Visual Representation of the Data
Visual representation reveals a different story
[Source: Anscombe's quartet, Wikipedia]
Same Stats, Different Graphs: Generating Datasets with Varied Appearance
and Identical Statistics through Simulated Annealing [CHI17]
https://www.autodeskresearch.com/publications/samestats
Where Does Visualization Stand?
[Diagram: Theory / Law, Model, Visualization, Descriptive Statistics, and Facts / Measurements, linked by the relations "Support xor Contradict", "Induces?", "Fits", and "Describes"]
Four Scales
• Most DH projects rely on the concept of
collections of documents or artifacts
• Visualization can be effective to make sense of
these collections
– But there is no “one size fits all”
• I will present visualizations to manage the
four scales
• With queries, smaller scales can be extracted
from larger scales
Scale Matters!
• 10^0 – 10^3: Small corpus (Master’s thesis / PhD)
• 10^3 – 10^6: Collaborative project
• 10^6 – 10^9: Institutional project (BnF, LoC) or portal
• > 10^9: Large scale
– Europeana, Google
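As a small illustration of the four scales, the sketch below maps a collection size to the scale it falls into. The function name `scale_of` and the handling of the boundary values (10^3, 10^6 and 10^9 are assigned to the smaller scale) are assumptions; the slide does not say where exactly the boundaries lie.

```python
def scale_of(n_items: int) -> str:
    """Map a collection size to one of the four scales listed above.

    Boundary values are assigned to the smaller scale; that choice is an
    assumption, since the slide only gives the ranges.
    """
    if n_items < 1:
        raise ValueError("a collection needs at least one item")
    if n_items <= 10**3:
        return "small corpus"
    if n_items <= 10**6:
        return "collaborative project"
    if n_items <= 10**9:
        return "institutional project"
    return "large scale"


print(scale_of(1_000))       # a thesis corpus of about 1000 documents
print(scale_of(53 * 10**6))  # the Europeana figure quoted later in the talk
```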
Powers of Ten™ (1977)
https://www.youtube.com/watch?v=0fKBhvDjuy0
10^0 – 10^3: Small Corpus
• A myriad of visualizations are available for small corpora
– Text, network, genealogy, manuscripts, maps, etc.
• Exploring small corpora with these visualizations ALWAYS reveals interesting, unexpected information
• On Web sites dedicated to small corpora, visualization helps users navigate and understand the scope of the corpus
10^0: One document
• N. McCurdy, J. Lein, K. Coles, M. Meyer. Poemage: Visualizing the Sonic Topology of
a Poem. IEEE Transactions on Visualization and Computer Graphics (Proceedings of
InfoVis 2015), pages 439-448, January 2016
http://www.sci.utah.edu/~nmccurdy/Poemage/
https://vimeo.com/136205958
http://xkcd.com/657/
10^0: One document
http://vis.cs.ucdavis.edu/~tanahashi/storylines/
10^0 – 10^3: Small(ish) Networks
http://vistorian.net/
10^0 – 10^3: Small Corpus
N. Dufournaud
Thesis
~1000 documents
http://nicole.dufournaud.org/
Migration Map
Space&Time: GeoTime
10^0 – 10^3: Archeological Collection
Create a spreadsheet:
• 1 row per object found
• 1 column per feature
• 1 black dot at the intersection when an object has a feature
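The spreadsheet recipe above amounts to building a binary object-by-feature matrix. Here is a minimal sketch; the finds, the feature names, and the helper names `build_matrix` and `render` are all hypothetical, and the sketch only builds and prints the matrix. It does not attempt the row and column reordering that makes such matrices readable.

```python
# Hypothetical finds: object name -> set of features observed on it.
FINDS = {
    "grave 1": {"sword", "shield"},
    "grave 2": {"beads", "brooch"},
    "grave 3": {"sword", "shield", "spur"},
    "grave 4": {"beads"},
}


def build_matrix(finds):
    """Return (features, rows): one row per object, one column per feature."""
    features = sorted({feature for fs in finds.values() for feature in fs})
    rows = {obj: [feature in fs for feature in features] for obj, fs in finds.items()}
    return features, rows


def render(features, rows):
    """Draw a black dot at each intersection where an object has a feature."""
    width = max(len(obj) for obj in rows)
    lines = [" " * width + "  " + " ".join(f[0] for f in features)]  # first letters as column heads
    for obj, cells in rows.items():
        lines.append(obj.ljust(width) + "  " + " ".join("●" if c else "·" for c in cells))
    return "\n".join(lines)


FEATURES, ROWS = build_matrix(FINDS)
print(render(FEATURES, ROWS))
```

Once the matrix exists, permuting rows and columns so that similar objects and features sit next to each other is what reveals groups in the collection.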
10^0 – 10^3: Bertifier
• Play with our tool online
http://www.aviz.fr/bertifier
https://www.youtube.com/watch?v=tJxAF_a_yBQ
Visualizing an XML Corpus: Compus
• Transform the following XML document:
0         1         2         3         4
012345678901234567890123456789012345678901234567
<A>abcd<B>efgh</B><C>ijkl<D>mnop</D></C>qrst</A>
• into a set of half-open intervals (start included, end excluded):
A=[0,48), B=[7,18), C=[18,40), D=[25,36)
• One color is given to each element
• Only XML elements are visualized
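The transformation from markup to intervals can be sketched with a stack of open tags. This is an illustrative reconstruction, not the Compus implementation: the function name `element_intervals` is an assumption, the regular expression only handles simple open and close tags (no self-closing tags, comments, or CDATA), and each interval runs from the first character of the opening tag to one past the last character of the closing tag.

```python
import re

# Matches simple opening and closing tags only; an assumption for this sketch.
TAG = re.compile(r"<(/?)([A-Za-z][\w.-]*)[^<>]*>")


def element_intervals(xml: str):
    """Return (name, start, end) for each element, with end excluded."""
    stack = []      # open elements waiting for their closing tag
    intervals = []
    for match in TAG.finditer(xml):
        closing, name = match.group(1), match.group(2)
        if not closing:
            stack.append((name, match.start()))
            continue
        if not stack or stack[-1][0] != name:
            raise ValueError(f"unexpected closing tag </{name}>")
        _, start = stack.pop()
        intervals.append((name, start, match.end()))
    if stack:
        raise ValueError(f"unclosed element <{stack[-1][0]}>")
    return sorted(intervals, key=lambda item: item[1])


DOC = "<A>abcd<B>efgh</B><C>ijkl<D>mnop</D></C>qrst</A>"
INTERVALS = element_intervals(DOC)
print(INTERVALS)  # [('A', 0, 48), ('B', 7, 18), ('C', 18, 40), ('D', 25, 36)]
```

Each interval can then be drawn as a colored segment, one color per element name, which gives a compact overview of how markup is distributed in a document.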
10^0 – 10^3: Diffamation
(Chevalier et al. CHI 2010, http://www.aviz.fr/diffamation/)
10^0 – 10^3: Multidimensional Data
10^0 – 10^3: Small Corpus
http://multiviz.gforge.inria.fr/scatterdice/oscars/
10^3 – 10^6: Library/Coll. Project
• Too many items to show each of them in detail
• Still need to provide guidance to users
• Many tools exist, but entering data becomes technical
10^3 – 10^6: Jigsaw
10^3 – 10^6: Parallel Tag Clouds
Parallel Tag Clouds to Explore Faceted Text Corpora (Collins et al., VAST 2009)
http://vialab.science.uoit.ca/portfolio/parallel-tag-clouds-to-explore-faceted-text-corpora
De-duplication
D-Dupe: An Interactive Tool for Entity Resolution in Social Networks (Mustafa Bilgic, Louis Licamele,
Lise Getoor, Ben Shneiderman), In Visual Analytics Science and Technology (VAST), 2006.
• Resolving named entities using the relation network
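The idea of using the relation network for entity resolution can be sketched as follows: two names are more likely to denote the same person when the strings look alike and when they share neighbours in the network. This is not the D-Dupe algorithm; the network, the equal weighting of the two scores, and the function names are all assumptions made for illustration.

```python
from difflib import SequenceMatcher
from itertools import combinations

# Hypothetical co-authorship network: name -> set of collaborators.
NETWORK = {
    "J. Smith": {"A. Martin", "B. Leroy"},
    "John Smith": {"A. Martin", "B. Leroy", "C. Petit"},
    "J. Smyth": {"D. Morel"},
}


def name_similarity(a: str, b: str) -> float:
    """String similarity in [0, 1], ignoring case."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def neighbour_similarity(a: str, b: str) -> float:
    """Jaccard overlap of the two nodes' neighbours in the relation network."""
    na, nb = NETWORK[a], NETWORK[b]
    union = na | nb
    return len(na & nb) / len(union) if union else 0.0


def duplicate_candidates():
    """Rank all pairs by the mean of both scores (equal weights are an assumption)."""
    scored = [
        ((name_similarity(a, b) + neighbour_similarity(a, b)) / 2, a, b)
        for a, b in combinations(sorted(NETWORK), 2)
    ]
    return sorted(scored, reverse=True)


RANKED = duplicate_candidates()
for score, a, b in RANKED:
    print(f"{score:.2f}  {a}  <->  {b}")
```

In this toy network "J. Smith" and "J. Smyth" have the most similar spellings, but "J. Smith" and "John Smith" rank first because they share collaborators. A ranked list like this is meant to be reviewed by a person rather than merged automatically.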
10^3 – 10^6: Genealogies
10^6 – 10^9: Institutional project
• Only aggregated information can be presented
• Faceted browsing / search very useful!
– Use it!
• e.g. Europeana: 53 × 10^6 items
10^6 – 10^9: Institutional project (HAL)
http://traces1.saclay.inria.fr/inria/
10^6 – 10^9: EU Project Cendari
https://notes.cendari.dariah.eu/
10^6 – 10^9: Institutional project
• Only aggregated information can be presented
• Faceted browsing / search very useful!
– Use it!
• e.g. Europeana: 53 × 10^6 items
• Problem: metadata quality and semantics
• What is the date of a book?
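Faceted browsing boils down to counting values per metadata field and narrowing the collection by selected values. The sketch below uses a handful of hypothetical records; the field names and the helpers `facet_counts` and `refine` are illustrative assumptions. Records with a missing field are counted under "unknown", which makes the metadata quality problem visible instead of hiding it.

```python
from collections import Counter

# Hypothetical catalogue records; one of them has no date information.
RECORDS = [
    {"type": "book", "language": "fr", "century": "16th"},
    {"type": "book", "language": "la", "century": "16th"},
    {"type": "manuscript", "language": "la", "century": "15th"},
    {"type": "book", "language": "fr"},
    {"type": "map", "language": "fr", "century": "17th"},
]


def facet_counts(records, field):
    """Aggregate a collection into value counts for one facet."""
    return Counter(record.get(field, "unknown") for record in records)


def refine(records, **selected):
    """Keep only the records matching every selected facet value."""
    return [
        record for record in records
        if all(record.get(field) == value for field, value in selected.items())
    ]


print(facet_counts(RECORDS, "type"))
FRENCH_BOOKS = refine(RECORDS, type="book", language="fr")
print(facet_counts(FRENCH_BOOKS, "century"))
```

Each refinement is a query that extracts a smaller collection from a larger one, so the result may become small enough for the item-level visualizations shown for the smaller scales.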
> 10^9: World Scale
• Few providers
– Google
– Photo collections (Flickr)
– Astronomical databases
• The cost of computing facets is too high for interactive response times
• No good general solution
> 10^9: Internet Backbone
• Where are you?
• Who cares?
> 10^9: Query Previews
• Query over very large data about the Earth
http://www.cs.umd.edu/hcil/eosdis/
Conclusion
• Larger collections are harder to manage
– Big data problem
• A large collection can always be queried to
extract a smaller collection
– This scales down the results and increases the number of usable techniques
• Still, current technologies are limited for DH
– No management of uncertainty
– No reasonable model of old geographical concepts
– No good model of time and date
• Still, use the tools and ask for improvements!
References
• Jacques Bertin, Semiology of Graphics: Diagrams, Networks, Maps.
ESRI Press; Nov. 2010. ISBN: 9781589482616
• Edward Tufte. The Visual Display of Quantitative Information.
Cheshire, CT: Graphics Press, 2010 ISBN 0-9613921-4-2
• Tamara Munzner. Visualization Analysis and Design. A K Peters
Visualization Series, CRC Press, 2014. ISBN 9781466508910
• Alberto Cairo. The Truthful Art: Data, Charts, and Maps for
Communication. New Riders, 2016. ISBN 0321934075
• Tableau for Students: https://www.tableau.com/academic/students
• Jänicke, Stefan; Franzini, Greta; Cheema, Muhammad Faisal;
Scheuermann, Gerik. On Close and Distant Reading in Digital
Humanities: A Survey and Future Challenges. Eurographics
Conference on Visualization (EuroVis) – STARs. 2015.
http://dx.doi.org/10.2312/eurovisstar.20151113