Search & Data Mining 
SKILLS SEMINAR 
Master of European History, University of Luxembourg, 11 December 2014 
Gerben Zaagsma 
Lichtenberg-Kolleg,
Overview 
1. Introduction: search & data mining 
2. Tools 
3. Practical exercises 
1. Introduction search & data mining
Code yourself… …or use existing tools
Why historians should be interested: 
CHANGE: Old → New 
Analogue resources → Digital resources 
SCALE 
Small data → Big data 
Close reading → Distant reading 
TECHNOLOGY
the Big Data revolution? 
Big data and claims about a paradigm change in the 
humanities
culturomics and Google ngrams
Data driven history
Patterns and structures: a new essentialism?
Based upon changes of scale & method: humanities 
supposedly becoming more ‘scientific’ > results can be 
checked and replicated, but can they? Interpretation.
Politics: funding & valorisation
“One of the problems confronting data enthusiasts in 
the humanities is that we feel a need to convince our 
more old-fashioned colleagues about what can be done. 
But our role as advocates of data shouldn't mean that 
we lose our critical sense as scholars. 
[....] there is a risk that we look more carefully at the 
technical components of the datasets than the 
historical context of the information that they represent.” 
Andrew Prescott, ‘The Deceptions of Data’, Digital Riffs (13 
January 2013).
Frédéric Clavert, ‘Lecture des sources historiennes à l’ère 
numérique’ (14 November 2012) 
Integrate approaches & methods / hybridity
1. SEARCH
Google/ Bing/ Yahoo 
there is much more ...
Searching the Internet in general: 
Google 
there is much more than Google 
filter bubble? have a look at: http://dontbubble.us 
http://www.langreiter.com/exec/yahoo-vs-google.html
http://yometa.com
filter bubble? 
http://www.thefilterbubble.com
Web search round-up 
differences between search engines 
filter bubble 
deep web versus visible web
Searching digital libraries & archives…
composition of resources, selection…
example of Compactmemory: a great resource on 
German-Jewish history
The collection comprises the 110 most important Jewish 
newspapers and periodicals of the German-speaking world 
from the years 1806-1938. The periodicals represent the 
entire religious, political, social, literary or 
scholarly range of the Jewish community. 
but be aware of selection: focus on elites and organisations that 
highlight German Jewry’s process of emancipation: 
• classical vision in historiography on German Jewry? 
• reinforcement of existing master narratives?
mind the context…
Processing and searching data on your own 
computer…
1. DATA MINING
data? 
data = computer-processable information
Example of structured data
Many digital libraries/archives: 
un-/semi-structured data
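A minimal sketch of the difference, assuming a made-up table and a made-up sentence (neither comes from a real archive; all names and values are hypothetical). With structured data the field names tell the computer where to look; with unstructured text we can only guess with a pattern, and anything written differently is missed.

```python
import csv
import io
import re

# Hypothetical structured data: every value sits in a named field.
STRUCTURED = "name,occupation,year\nAnna Meyer,teacher,1902\nJakob Levi,printer,1911\n"

# Hypothetical unstructured data: the same facts as running text.
UNSTRUCTURED = "Anna Meyer worked as a teacher from 1902; the printer Jakob Levi followed in 1911."


def years_from_table(text):
    """Read the 'year' column directly; the structure says where the years are."""
    return [int(row["year"]) for row in csv.DictReader(io.StringIO(text))]


def years_from_prose(text):
    """Guess at years with a pattern for 18xx/19xx; other spellings are missed."""
    return [int(match) for match in re.findall(r"\b1[89]\d{2}\b", text)]


if __name__ == "__main__":
    print(years_from_table(STRUCTURED))
    print(years_from_prose(UNSTRUCTURED))
```

Note that the pattern-based version finds nothing in a phrase such as "in the year nineteen hundred", which is one reason why searching unstructured collections needs extra care.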
Digital editions: bridging the gap with XML
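How XML markup can bridge the gap may be illustrated with a small sketch. The fragment below is hypothetical and only loosely TEI-like (the element names `date`, `persName` and `placeName` and the sentence itself are assumptions for illustration, not taken from an actual edition). The running text stays readable, while the tags make dates, persons and places computer-processable.

```python
import xml.etree.ElementTree as ET

# Hypothetical, loosely TEI-like fragment: prose with marked-up entities.
EDITION = """
<letter>
  <p>On <date when="1921-03-03">3 March 1921</date>
  <persName>Anna Meyer</persName> wrote from
  <placeName>Berlin</placeName>.</p>
</letter>
"""


def extract_entities(xml_text):
    """Collect the marked-up dates, persons and places as structured data."""
    root = ET.fromstring(xml_text.strip())
    return {
        "dates": [el.get("when") for el in root.iter("date")],
        "persons": [el.text for el in root.iter("persName")],
        "places": [el.text for el in root.iter("placeName")],
    }


def plain_text(xml_text):
    """Recover the readable text by dropping the tags and tidying whitespace."""
    root = ET.fromstring(xml_text.strip())
    return " ".join("".join(root.itertext()).split())


if __name__ == "__main__":
    print(extract_entities(EDITION))
    print(plain_text(EDITION))
```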
http://eculture.cs.vu.nl/europeana/session/search 
• Google/ Bing/ Yahoo 
• there is much more ... 
• results differ per search engine 
• and there is a filter bubble 
• --> in short: knowing what you are looking for and having a search strategy are crucial 
Semantic web and linking data
Some definitions of data mining:
At its simplest, data mining is the process of extracting 
new knowledge (usually in terms of previously unknown 
patterns) from sets of data already in existence. 
Jonathan Hagood
Data mining (the analysis step of the "Knowledge Discovery in 
Databases" process, or KDD), an interdisciplinary subfield of 
computer science, is the computational process of discovering 
patterns in large data sets involving methods at the intersection 
of artificial intelligence, machine learning, statistics, and 
database systems. 
The overall goal of the data mining process is to extract 
information from a data set and transform it into an 
understandable structure for further use. 
Wikipedia
Examples of projects and techniques
an n-gram is a contiguous sequence of n 
items from a given sequence of text or speech
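The definition above can be sketched in a few lines. This is a minimal illustration, assuming that the "items" are words split on whitespace and lower-cased; real projects usually need more careful tokenisation. The function names are chosen here for illustration only.

```python
from collections import Counter


def ngrams(items, n):
    """Return every contiguous run of n items, in order of appearance."""
    if n < 1:
        raise ValueError("n must be at least 1")
    return [tuple(items[i:i + n]) for i in range(len(items) - n + 1)]


def count_word_ngrams(text, n):
    """Count word n-grams in a text (naive whitespace tokenisation)."""
    words = text.lower().split()
    return Counter(ngrams(words, n))


if __name__ == "__main__":
    counts = count_word_ngrams("to be or not to be", 2)
    for gram, freq in counts.most_common():
        print(" ".join(gram), freq)
```

Counting such n-grams per year across a large corpus is, in essence, what n-gram viewers plot; the choice of tokenisation and corpus already shapes the result.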
Topic Modeling Martha Ballard’s Diary
data? 
data & data mining ≠ neutral
“What is too often forgotten, though, is that our 
digital helpers are full of ‘theory’ and ‘judgement’ 
already. As with any methodology, they rely on sets 
of assumptions, models, and strategies. Theory is 
already at work on the most basic level when it 
comes to defining units of analysis, algorithms, and 
visualisation procedures.” 
Bernhard Rieder and Theo Röhle, ‘Digital Methods: Five 
Challenges’ in: David M Berry ed., Understanding Digital 
Humanities (Houndmills: Palgrave Macmillan, 2012) 67-85, 
70.
2. TOOLS
3. Practical exercises
Overview of exercises 
http://goo.gl/72fCn7
Tools & workflows 
Voyant Tools 
Voyant Tools Documentation 
Programming Historian 
DIRT: Digital Research Tools 
Turkel, William J., Kevin Kee, and Spencer Roberts, ‘A 
Method for Navigating the Infinite Archive’ in: Toni 
Weller ed., History in the Digital Age (London; New 
York: Routledge, 2013). 
William J. Turkel: How To
Further reading 
Special issue on Digital History, BMGN - Low Countries Historical Review, 128/4 (2013). 
Haber, Peter, Digital Past: Geschichtswissenschaft im digitalen Zeitalter (München: 
Oldenbourg Verlag, 2011). 
Boonstra, Onno, Leen Breure, and Peter Doorn, Past, Present and Future of Historical 
Information Science (Amsterdam: NIWI-KNAW, 2004). 
Ciravegna, Fabio, Mark Greengrass, Tim Hitchcock, Sam Chapman, Jamie McLaughlin, 
and Ravish Bhagdev, ‘Finding Needles in Haystacks: Data-Mining in Distributed 
Historical Datasets’ in: Mark Greengrass and Lorna M Hughes eds., The Virtual 
Representation of the Past (Ashgate, 2008). 
Cohen, D, F Gibbs, T Hitchcock, G Rockwell, J Sander, R Shoemaker, S Sinclair, S Takats, 
W J Turkel, and C Briquet. "Data Mining with Criminal Intent." Final white paper (2011). 
Hagood, Jonathan, "A Brief Introduction to Data Mining Projects in the Humanities." 
Bulletin of the American Society for Information Science and Technology 38/4 (2012). 
Hitchcock, Tim, "Big Data for Dead People: Digital Readings and the Conundrums of 
Positivism." (9 December 2013). 
Leonard, Peter, "Mining Large Datasets for the Humanities", IFLA WLIC 2014.
Dr. Gerben Zaagsma 
http://gerbenzaagsma.org 
de.linkedin.com/in/gerbenzaagsma/ 
https://twitter.com/gerbenzaagsma 
https://uni-goettingen.academia.edu/GerbenZaagsma 
https://www.researchgate.net/profile/Gerben_Zaagsma 
https://www.slideshare.net/gerbenzaagsma
Image credits 
The Field Museum Library, Hall 37 Geology overview. URL: https://www.flickr.com/photos/field_museum_library/3333920156/in/set-72157614881700424. 
The U.S. National Archives, Photograph of Card Catalog in Central Search Room, 1942. URL: http://www.flickr.com/photos/usnationalarchives/3873932255/. 
Witch computer 1951: Wolverhampton and Staffordshire College of Technology in 1961, The National 
Computing Museum and Computer Conservation Society/UKAEA/Wolverhampton Express and Star, via: 
http://www.wired.com/2009/09/britan-oldest-computer/. 
Code: https://www.flickr.com/photos/lord_james/4696338852/. 
Tools: Flickr Commons 
The droids we're googling for: https://www.flickr.com/photos/st3f4n/3951143570/. 
Jaws (Steven Spielberg) original movie poster: https://en.wikipedia.org/wiki/File:JAWS_Movie_poster.jpg 
Structured/unstructured data: http://www.emc.com/collateral/demos/microsites/emc-digital-universe-2011/index.htm 
Macbook Data Mining: http://www.flickr.com/photos/17208993@N00/442531562/. 
Topic Modeling Martha Ballard’s Diary: http://www.cameronblevins.org/posts/topic-modeling-martha-ballards-diary/. 
Boolean operators: http://uksourcers.co.uk/2012/capital-letters-the-key-to-boolean-success/ 
Miami University students in laboratory classroom 1908: https://www.flickr.com/photos/muohio_digital_collections/3199691495/

Gerben Zaagsma
 

More from Gerben Zaagsma (7)

20130315 - Cursus Digitaal Historisch Onderzoek 2013: College 3 - Bronnenkri...
20130315 - Cursus Digitaal Historisch Onderzoek 2013: College 3  - Bronnenkri...20130315 - Cursus Digitaal Historisch Onderzoek 2013: College 3  - Bronnenkri...
20130315 - Cursus Digitaal Historisch Onderzoek 2013: College 3 - Bronnenkri...
 
20130314 - Historical sources and data in the digital age
20130314 - Historical sources and data in the digital age20130314 - Historical sources and data in the digital age
20130314 - Historical sources and data in the digital age
 
20130301 - Cursus Digitaal Historisch Onderzoek 2013: College 2 - Historische...
20130301 - Cursus Digitaal Historisch Onderzoek 2013: College 2 - Historische...20130301 - Cursus Digitaal Historisch Onderzoek 2013: College 2 - Historische...
20130301 - Cursus Digitaal Historisch Onderzoek 2013: College 2 - Historische...
 
20130215 - Cursus Digitaal Historisch Onderzoek 2013: College 1 - Inleiding
20130215 - Cursus Digitaal Historisch Onderzoek 2013: College 1 - Inleiding20130215 - Cursus Digitaal Historisch Onderzoek 2013: College 1 - Inleiding
20130215 - Cursus Digitaal Historisch Onderzoek 2013: College 1 - Inleiding
 
20130107 - Introduction: On Digital History
20130107 -  Introduction: On Digital History20130107 -  Introduction: On Digital History
20130107 - Introduction: On Digital History
 
20110517 - Presenting the Yiddish past in contemporary Europe
20110517 - Presenting the Yiddish past in contemporary Europe20110517 - Presenting the Yiddish past in contemporary Europe
20110517 - Presenting the Yiddish past in contemporary Europe
 
20111031 - Online Jewish content in a broader context
20111031 - Online Jewish content in a broader context20111031 - Online Jewish content in a broader context
20111031 - Online Jewish content in a broader context
 

Recently uploaded

The Roman Empire A Historical Colossus.pdf
The Roman Empire A Historical Colossus.pdfThe Roman Empire A Historical Colossus.pdf
The Roman Empire A Historical Colossus.pdf
kaushalkr1407
 
CLASS 11 CBSE B.St Project AIDS TO TRADE - INSURANCE
CLASS 11 CBSE B.St Project AIDS TO TRADE - INSURANCECLASS 11 CBSE B.St Project AIDS TO TRADE - INSURANCE
CLASS 11 CBSE B.St Project AIDS TO TRADE - INSURANCE
BhavyaRajput3
 
Additional Benefits for Employee Website.pdf
Additional Benefits for Employee Website.pdfAdditional Benefits for Employee Website.pdf
Additional Benefits for Employee Website.pdf
joachimlavalley1
 
Guidance_and_Counselling.pdf B.Ed. 4th Semester
Guidance_and_Counselling.pdf B.Ed. 4th SemesterGuidance_and_Counselling.pdf B.Ed. 4th Semester
Guidance_and_Counselling.pdf B.Ed. 4th Semester
Atul Kumar Singh
 
Chapter 3 - Islamic Banking Products and Services.pptx
Chapter 3 - Islamic Banking Products and Services.pptxChapter 3 - Islamic Banking Products and Services.pptx
Chapter 3 - Islamic Banking Products and Services.pptx
Mohd Adib Abd Muin, Senior Lecturer at Universiti Utara Malaysia
 
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
siemaillard
 
CACJapan - GROUP Presentation 1- Wk 4.pdf
CACJapan - GROUP Presentation 1- Wk 4.pdfCACJapan - GROUP Presentation 1- Wk 4.pdf
CACJapan - GROUP Presentation 1- Wk 4.pdf
camakaiclarkmusic
 
Introduction to AI for Nonprofits with Tapp Network
Introduction to AI for Nonprofits with Tapp NetworkIntroduction to AI for Nonprofits with Tapp Network
Introduction to AI for Nonprofits with Tapp Network
TechSoup
 
Phrasal Verbs.XXXXXXXXXXXXXXXXXXXXXXXXXX
Phrasal Verbs.XXXXXXXXXXXXXXXXXXXXXXXXXXPhrasal Verbs.XXXXXXXXXXXXXXXXXXXXXXXXXX
Phrasal Verbs.XXXXXXXXXXXXXXXXXXXXXXXXXX
MIRIAMSALINAS13
 
Polish students' mobility in the Czech Republic
Polish students' mobility in the Czech RepublicPolish students' mobility in the Czech Republic
Polish students' mobility in the Czech Republic
Anna Sz.
 
Unit 2- Research Aptitude (UGC NET Paper I).pdf
Unit 2- Research Aptitude (UGC NET Paper I).pdfUnit 2- Research Aptitude (UGC NET Paper I).pdf
Unit 2- Research Aptitude (UGC NET Paper I).pdf
Thiyagu K
 
Lapbook sobre os Regimes Totalitários.pdf
Lapbook sobre os Regimes Totalitários.pdfLapbook sobre os Regimes Totalitários.pdf
Lapbook sobre os Regimes Totalitários.pdf
Jean Carlos Nunes Paixão
 
Sha'Carri Richardson Presentation 202345
Sha'Carri Richardson Presentation 202345Sha'Carri Richardson Presentation 202345
Sha'Carri Richardson Presentation 202345
beazzy04
 
Language Across the Curriculm LAC B.Ed.
Language Across the  Curriculm LAC B.Ed.Language Across the  Curriculm LAC B.Ed.
Language Across the Curriculm LAC B.Ed.
Atul Kumar Singh
 
How to Make a Field invisible in Odoo 17
How to Make a Field invisible in Odoo 17How to Make a Field invisible in Odoo 17
How to Make a Field invisible in Odoo 17
Celine George
 
BÀI TẬP BỔ TRỢ TIẾNG ANH GLOBAL SUCCESS LỚP 3 - CẢ NĂM (CÓ FILE NGHE VÀ ĐÁP Á...
BÀI TẬP BỔ TRỢ TIẾNG ANH GLOBAL SUCCESS LỚP 3 - CẢ NĂM (CÓ FILE NGHE VÀ ĐÁP Á...BÀI TẬP BỔ TRỢ TIẾNG ANH GLOBAL SUCCESS LỚP 3 - CẢ NĂM (CÓ FILE NGHE VÀ ĐÁP Á...
BÀI TẬP BỔ TRỢ TIẾNG ANH GLOBAL SUCCESS LỚP 3 - CẢ NĂM (CÓ FILE NGHE VÀ ĐÁP Á...
Nguyen Thanh Tu Collection
 
Welcome to TechSoup New Member Orientation and Q&A (May 2024).pdf
Welcome to TechSoup   New Member Orientation and Q&A (May 2024).pdfWelcome to TechSoup   New Member Orientation and Q&A (May 2024).pdf
Welcome to TechSoup New Member Orientation and Q&A (May 2024).pdf
TechSoup
 
The Challenger.pdf DNHS Official Publication
The Challenger.pdf DNHS Official PublicationThe Challenger.pdf DNHS Official Publication
The Challenger.pdf DNHS Official Publication
Delapenabediema
 
Supporting (UKRI) OA monographs at Salford.pptx
Supporting (UKRI) OA monographs at Salford.pptxSupporting (UKRI) OA monographs at Salford.pptx
Supporting (UKRI) OA monographs at Salford.pptx
Jisc
 
A Strategic Approach: GenAI in Education
A Strategic Approach: GenAI in EducationA Strategic Approach: GenAI in Education
A Strategic Approach: GenAI in Education
Peter Windle
 

Recently uploaded (20)

The Roman Empire A Historical Colossus.pdf
The Roman Empire A Historical Colossus.pdfThe Roman Empire A Historical Colossus.pdf
The Roman Empire A Historical Colossus.pdf
 
CLASS 11 CBSE B.St Project AIDS TO TRADE - INSURANCE
CLASS 11 CBSE B.St Project AIDS TO TRADE - INSURANCECLASS 11 CBSE B.St Project AIDS TO TRADE - INSURANCE
CLASS 11 CBSE B.St Project AIDS TO TRADE - INSURANCE
 
Additional Benefits for Employee Website.pdf
Additional Benefits for Employee Website.pdfAdditional Benefits for Employee Website.pdf
Additional Benefits for Employee Website.pdf
 
Guidance_and_Counselling.pdf B.Ed. 4th Semester
Guidance_and_Counselling.pdf B.Ed. 4th SemesterGuidance_and_Counselling.pdf B.Ed. 4th Semester
Guidance_and_Counselling.pdf B.Ed. 4th Semester
 
Chapter 3 - Islamic Banking Products and Services.pptx
Chapter 3 - Islamic Banking Products and Services.pptxChapter 3 - Islamic Banking Products and Services.pptx
Chapter 3 - Islamic Banking Products and Services.pptx
 
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
 
CACJapan - GROUP Presentation 1- Wk 4.pdf
CACJapan - GROUP Presentation 1- Wk 4.pdfCACJapan - GROUP Presentation 1- Wk 4.pdf
CACJapan - GROUP Presentation 1- Wk 4.pdf
 
Introduction to AI for Nonprofits with Tapp Network
Introduction to AI for Nonprofits with Tapp NetworkIntroduction to AI for Nonprofits with Tapp Network
Introduction to AI for Nonprofits with Tapp Network
 
Phrasal Verbs.XXXXXXXXXXXXXXXXXXXXXXXXXX
Phrasal Verbs.XXXXXXXXXXXXXXXXXXXXXXXXXXPhrasal Verbs.XXXXXXXXXXXXXXXXXXXXXXXXXX
Phrasal Verbs.XXXXXXXXXXXXXXXXXXXXXXXXXX
 
Polish students' mobility in the Czech Republic
Polish students' mobility in the Czech RepublicPolish students' mobility in the Czech Republic
Polish students' mobility in the Czech Republic
 
Unit 2- Research Aptitude (UGC NET Paper I).pdf
Unit 2- Research Aptitude (UGC NET Paper I).pdfUnit 2- Research Aptitude (UGC NET Paper I).pdf
Unit 2- Research Aptitude (UGC NET Paper I).pdf
 
Lapbook sobre os Regimes Totalitários.pdf
Lapbook sobre os Regimes Totalitários.pdfLapbook sobre os Regimes Totalitários.pdf
Lapbook sobre os Regimes Totalitários.pdf
 
Sha'Carri Richardson Presentation 202345
Sha'Carri Richardson Presentation 202345Sha'Carri Richardson Presentation 202345
Sha'Carri Richardson Presentation 202345
 
Language Across the Curriculm LAC B.Ed.
Language Across the  Curriculm LAC B.Ed.Language Across the  Curriculm LAC B.Ed.
Language Across the Curriculm LAC B.Ed.
 
How to Make a Field invisible in Odoo 17
How to Make a Field invisible in Odoo 17How to Make a Field invisible in Odoo 17
How to Make a Field invisible in Odoo 17
 
BÀI TẬP BỔ TRỢ TIẾNG ANH GLOBAL SUCCESS LỚP 3 - CẢ NĂM (CÓ FILE NGHE VÀ ĐÁP Á...
BÀI TẬP BỔ TRỢ TIẾNG ANH GLOBAL SUCCESS LỚP 3 - CẢ NĂM (CÓ FILE NGHE VÀ ĐÁP Á...BÀI TẬP BỔ TRỢ TIẾNG ANH GLOBAL SUCCESS LỚP 3 - CẢ NĂM (CÓ FILE NGHE VÀ ĐÁP Á...
BÀI TẬP BỔ TRỢ TIẾNG ANH GLOBAL SUCCESS LỚP 3 - CẢ NĂM (CÓ FILE NGHE VÀ ĐÁP Á...
 
Welcome to TechSoup New Member Orientation and Q&A (May 2024).pdf
Welcome to TechSoup   New Member Orientation and Q&A (May 2024).pdfWelcome to TechSoup   New Member Orientation and Q&A (May 2024).pdf
Welcome to TechSoup New Member Orientation and Q&A (May 2024).pdf
 
The Challenger.pdf DNHS Official Publication
The Challenger.pdf DNHS Official PublicationThe Challenger.pdf DNHS Official Publication
The Challenger.pdf DNHS Official Publication
 
Supporting (UKRI) OA monographs at Salford.pptx
Supporting (UKRI) OA monographs at Salford.pptxSupporting (UKRI) OA monographs at Salford.pptx
Supporting (UKRI) OA monographs at Salford.pptx
 
A Strategic Approach: GenAI in Education
A Strategic Approach: GenAI in EducationA Strategic Approach: GenAI in Education
A Strategic Approach: GenAI in Education
 

Introduction for skills seminar on Search and Data Mining, Master of European History, University of Luxembourg, 11 December 2014

  • 1. Search & Data Mining SKILLS SEMINAR Master of European History, University of Luxembourg, 11 December 2014 Gerben Zaagsma Lichtenberg-Kolleg,
  • 3. Overview 1. 2. T 3. Practical exercises 1. Introduction search & data mining
  • 4. Code yourself… …or use existing tools
  • 6. Why historians should be interested: Old New CHANGE Analogue resources Digital resources SCALE Small data Big data Close reading Distant reading TECHNOLOGY
  • 14. The Big Data revolution? Big data and claims about a paradigm change in the humanities. Data-driven history. Patterns and structures: a new essentialism? Based upon changes of scale & method: the humanities are supposedly becoming more ‘scientific’, so results can be checked and replicated. But can they? Interpretation. Politics: funding & valorisation.
  • 15. “One of the problems confronting data enthusiasts in the humanities is that we feel a need to convince our more old-fashioned colleagues about what can be done. But our role as advocates of data shouldn't mean that we lose our critical sense as scholars. [...] there is a risk that we look more carefully at the technical components of the datasets than the historical context of the information that they represent.” Andrew Prescott, ‘The Deceptions of Data’, Digital Riffs (13 January 2013).
  • 16. Frédéric Clavert, ‘Lecture des sources historiennes à l’ère numérique’ (14 November 2012) Integrate approaches & methods/ hybridity
  • 18. Google / Bing / Yahoo: there is much more ...
  • 19. Searching the Internet. General: Google. There is much more than Google. Filter bubble? Have a look at: http://dontbubble.us
  • 20. Searching the Internet. General: Google. There is much more than Google. Filter bubble? Have a look at: http://dontbubble.us http://www.langreiter.com/exec/yahoo-vs-google.html
  • 21. Searching the Internet. General: Google. There is much more than Google. Filter bubble? Have a look at: http://dontbubble.us http://yometa.com
  • 25. Web search round-up: differences between search engines; filter bubble; deep web versus visible web
  • 28. example of Compactmemory: a great resource on German-Jewish history
  • 29. “The collection comprises the 110 most important Jewish newspapers and periodicals of the German-speaking world from the years 1806-1938. The periodicals represent the entire religious, political, social, literary and scholarly range of the Jewish community.” But be aware of selection: the focus is on elites and organisations that highlight German Jewry’s process of emancipation: • classical vision in historiography on German Jewry? • reinforcement of existing master narratives?
  • 34. Processing and searching data on your own computer…
  • 40. data? data = computer-processable information
  • 43. Many digital libraries/archives: un-/semi-structured data
  • 44. Digital editions: bridging the gap with XML
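
The idea that XML markup bridges unstructured and structured data can be illustrated with a small sketch. The fragment below is invented for illustration: the element names (`letter`, `date`, `persName`, `placeName`) loosely imitate TEI-style tagging but are assumptions, not taken from any edition discussed in the seminar. Once a passage of running text is tagged, the dates, people and places in it can be queried like fields in a database.

```python
import xml.etree.ElementTree as ET

# Invented sample: running text with a few tagged entities (hypothetical element names).
sample = """<letter>
  <date when="1913-05-02">2 May 1913</date>
  <p>Meeting with <persName>A. Example</persName> in <placeName>Berlin</placeName>.</p>
</letter>"""

root = ET.fromstring(sample)

# The machine-readable date sits in an attribute, separate from the displayed text.
date = root.find("date").get("when")

# iter() walks the whole tree, so entities nested inside <p> are found as well.
people = [element.text for element in root.iter("persName")]
places = [element.text for element in root.iter("placeName")]

print(date, people, places)
```

The same text without tags would need free-text search or pattern matching to recover this information, which is the gap the slide refers to.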
  • 47. http://eculture.cs.vu.nl/europeana/session/search • Google / Bing / Yahoo • there is much more ... • results differ per search engine • and there is a filter bubble • --> in short: knowing what you are looking for and having a search strategy are crucial. Semantic web and linking data
  • 48. • Google / Bing / Yahoo • there is much more ... • results differ per search engine • and there is a filter bubble • --> in short: knowing what you are looking for and having a search strategy are crucial. cs.vu.nl/europeana/session/search
  • 49. • Google / Bing / Yahoo • there is much more ... • results differ per search engine • and there is a filter bubble • --> in short: knowing what you are looking for and having a search strategy are crucial.
  • 50. Some definitions of data mining:
  • 51. At its simplest, data mining is the process of extracting new knowledge (usually in terms of previously unknown patterns) from sets of data already in existence. Jonathan Hagood
  • 52. Data mining (the analysis step of the "Knowledge Discovery in Databases" process, or KDD), an interdisciplinary subfield of computer science, is the computational process of discovering patterns in large data sets involving methods at the intersection of artificial intelligence, machine learning, statistics, and database systems. The overall goal of the data mining process is to extract information from a data set and transform it into an understandable structure for further use. Wikipedia
  • 53. Examples of projects and techniques
  • 55. an n-gram is a contiguous sequence of n items from a given sequence of text or speech
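
The definition above can be made concrete with a minimal sketch. It assumes the "items" are words obtained by splitting on whitespace, which is a simplification; the function name `ngrams` and the sample sentence are chosen for illustration only and do not describe how any particular n-gram tool works internally.

```python
from collections import Counter


def ngrams(tokens, n):
    """Return every contiguous run of n items from a list of tokens."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


# Illustrative input; whitespace splitting stands in for real tokenisation.
text = "to be or not to be"
tokens = text.split()

bigrams = ngrams(tokens, 2)   # n = 2: pairs of adjacent words
counts = Counter(bigrams)     # how often each bigram occurs

for gram, frequency in counts.most_common():
    print(" ".join(gram), frequency)
```

Counting such sequences per year across a large corpus is, in broad terms, what lies behind n-gram frequency charts; a real pipeline would also need to handle punctuation, capitalisation and OCR errors.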
  • 58. Topic Modeling Martha Ballard’s Diary
  • 59. data? data & data mining ≠ neutral
  • 60. “What is too often forgotten, though, is that our digital helpers are full of ‘theory’ and ‘judgement’ already. As with any methodology, they rely on sets of assumptions, models, and strategies. Theory is already at work on the most basic level when it comes to defining units of analysis, algorithms, and visualisation procedures.” Bernhard Rieder and Theo Röhle, ‘Digital Methods: Five Challenges’ in: David M Berry ed., Understanding Digital Humanities (Houndmills: Palgrave Macmillan, 2012) 67-85, 70.
  • 63. Overview of exercises http://goo.gl/72fCn7
  • 64. Tools & workflows: Voyant Tools. Voyant Tools Documentation. Programming Historian. DIRT: Digital Research Tools. Turkel, William J., Kevin Kee, and Spencer Roberts, ‘A Method for Navigating the Infinite Archive’ in: Toni Weller ed., History in the Digital Age (London; New York: Routledge, 2013). William J. Turkel: How To.
  • 65. Further reading: Special issue on Digital History, BMGN - Low Countries Historical Review, 128/4 (2013). Haber, Peter, Digital Past: Geschichtswissenschaft im digitalen Zeitalter (München: Oldenbourg Verlag, 2011). Boonstra, Onno, Leen Breure, and Peter Doorn, Past, Present and Future of Historical Information Science (Amsterdam: NIWI-KNAW, 2004). Ciravegna, Fabio, Mark Greengrass, Tim Hitchcock, Sam Chapman, Jamie McLaughlin, and Ravish Bhagdev, ‘Finding Needles in Haystacks: Data-Mining in Distributed Historical Datasets’ in: Mark Greengrass and Lorna M Hughes eds., The Virtual Representation of the Past (Ashgate, 2008). Cohen, D, F Gibbs, T Hitchcock, G Rockwell, J Sander, R Shoemaker, S Sinclair, S Takats, W J Turkel, and C Briquet, ‘Data Mining with Criminal Intent’, final white paper (2011). Hagood, Jonathan, ‘A Brief Introduction to Data Mining Projects in the Humanities’, Bulletin of the American Society for Information Science and Technology 38/4 (2012). Hitchcock, Tim, ‘Big Data for Dead People: Digital Readings and the Conundrums of Positivism’ (9 December 2013). Leonard, Peter, ‘Mining Large Datasets for the Humanities’, IFLA WLIC 2014.
  • 66. Dr. Gerben Zaagsma http://gerbenzaagsma.org de.linkedin.com/in/gerbenzaagsma/ https://twitter.com/gerbenzaagsma https://uni-goettingen.academia.edu/GerbenZaagsma https://www.researchgate.net/profile/Gerben_Zaagsma https://www.slideshare.net/gerbenzaagsma
  • 67. Image credits: The Field Museum Library, Hall 37 Geology overview. URL: https://www.flickr.com/photos/field_museum_library/3333920156/in/set-72157614881700424. The U.S. National Archives, Photograph of Card Catalog in Central Search Room, 1942. URL: http://www.flickr.com/photos/usnationalarchives/3873932255/. Witch computer 1951: Wolverhampton and Staffordshire College of Technology in 1961, The National Computing Museum and Computer Conservation Society/UKAEA/Wolverhampton Express and Star, via: http://www.wired.com/2009/09/britan-oldest-computer/. Code: https://www.flickr.com/photos/lord_james/4696338852/. Tools: Flickr Commons. The droids we're googling for: https://www.flickr.com/photos/st3f4n/3951143570/. Jaws (Steven Spielberg) original movie poster: https://en.wikipedia.org/wiki/File:JAWS_Movie_poster.jpg. Structured/unstructured data: http://www.emc.com/collateral/demos/microsites/emc-digital-universe-2011/index.htm. Macbook Data Mining: http://www.flickr.com/photos/17208993@N00/442531562/. Topic Modeling Martha Ballard’s Diary: http://www.cameronblevins.org/posts/topic-modeling-martha-ballards-diary/. Boolean operators: http://uksourcers.co.uk/2012/capital-letters-the-key-to-boolean-success/. Miami University students in laboratory classroom 1908: https://www.flickr.com/photos/muohio_digital_collections/3199691495/