Finding/Capturing
How do you know what you are looking for?
4 February 2014
How Do You Engage?
Who Found a New Way to Engage and Practice Social Scholarship after last
week’s lecture?
Automated Twitter Follow/Search
‣ Active History (http://www.activehistory.co.uk/historianson-twitter/)
Are there digital artefacts that are not data?
Reading
‣ Europeana
“Digitisation and online accessibility are essential ways to
highlight cultural and scientific heritage, to inspire the
creation of new content and to encourage new online
services to emerge. They help to democratise access and to
develop the information society and the knowledge-based
economy.”
Europeana Objective:
"to give access to all of Europe’s digitised cultural heritage by 2025."
Europeana by the Numbers
[Chart: vertical axis from 0 to 30,000,000; horizontal axis from 2008 to 2015]
Barriers
‣ Intellectual Property
‣ Acceleration of Process of Digitisation
‣ Sustainable Funding
“Creativity is the driving force
of economic growth.”
Richard Florida, The Rise of the Creative Class, Basic Books, 2002

"If a resource does not have any
associated metadata information, then
it is essentially lost."
"If a resource has erroneous,
inconsistent, or not enough metadata
information, then it is essentially nonexistent."
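
One way to make the two quotations above concrete is a small completeness check over catalogue records. This is a minimal sketch only: the field names and the `REQUIRED_FIELDS` list are assumptions made for illustration, not a Europeana requirement or any published schema.

```python
# Hypothetical required fields; a real project would take these from its own schema.
REQUIRED_FIELDS = ("title", "creator", "date", "rights")


def missing_fields(record, required=REQUIRED_FIELDS):
    """Return the required fields that are absent or blank in a record."""
    missing = []
    for field in required:
        value = record.get(field)
        if value is None or not str(value).strip():
            missing.append(field)
    return missing


def classify(record):
    """Label a record 'lost', 'incomplete' or 'ok' by how much metadata it has."""
    has_any = any(v is not None and str(v).strip() for v in record.values())
    if not has_any:
        return "lost"  # no usable metadata at all
    if missing_fields(record):
        return "incomplete"  # some metadata, but required fields are blank
    return "ok"


# Invented example record: the creator is blank and rights is missing.
log_book = {"title": "Ship's log", "creator": "", "date": "1850"}
print(classify(log_book), missing_fields(log_book))
```

A check like this only catches missing values; erroneous or inconsistent metadata needs comparison against a controlled vocabulary or a second source.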
Initial Questions
‣ Structure?
‣ Format?
‣ Structured versus Unstructured?
‣ What do you use?
‣ Use?
‣ Where is this all going?
A Data Vis driven way to locate papers
Data Consisting of What?
‣ Basic types of content that we are used to dealing with:
  ‣ Text
  ‣ Numbers
  ‣ Images
  ‣ Video
‣ Other, more “complex” stuff:
  ‣ Relations, connections, links - a genealogy
  ‣ Time and space coords - the path of migratory birds
  ‣ Animations - a piece of courseware
  ‣ 3D models - the plan of your house
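
The first two "complex" types can be sketched in a few lines to show how they differ from plain text or numbers. Everything here is invented for illustration: the names in `PARENTS`, the sightings in `SIGHTINGS`, and the helper functions are assumptions, not data from any of the sources discussed.

```python
from datetime import datetime

# Relations: a tiny, invented genealogy stored as child -> parents links.
PARENTS = {
    "Aoife": ["Brigid", "Cormac"],
    "Brigid": ["Deirdre"],
    "Cormac": [],
    "Deirdre": [],
}


def ancestors(person, parents=PARENTS):
    """Collect every ancestor reachable by following parent links."""
    found = []
    queue = list(parents.get(person, []))
    while queue:
        current = queue.pop(0)
        if current not in found:
            found.append(current)
            queue.extend(parents.get(current, []))
    return found


# Time and space coordinates: invented sightings of one bird (time, lat, lon).
SIGHTINGS = [
    (datetime(2014, 3, 10), 36.1, -5.3),
    (datetime(2014, 2, 1), 14.7, -17.4),
    (datetime(2014, 4, 2), 51.9, -8.5),
]


def path(sightings):
    """Order sightings by time so the points form a path."""
    return [(lat, lon) for _, lat, lon in sorted(sightings)]


print(ancestors("Aoife"))
print(path(SIGHTINGS))
```

The point of the sketch is that the meaning lives in the links and in the ordering, not in any single value.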
Sources
‣ Data
  ‣ Europeana
  ‣ Project Gutenberg
  ‣ Open Data Catalog
  ‣ NINES
‣ Image Data
  ‣ Getty
  ‣ British Library
‣ Systems / Tools
  ‣ Spiders - Heritrix
  ‣ Getty -
Data: Europeana
‣ Europeana
Data: Project Gutenberg
Data: Open Data Catalog
Open Data Index
Open Data Ireland
Data: NINES
Images: The Getty Archive
Images: The British Library
Portals are one thing, but increasingly it is through
communication streams that you discover what’s available…
for example, a blog post on Active History this morning.
Tools: Spiders
‣ https://webarchive.jira.com/
‣ Heritrix is the Internet Archive's open-source, extensible, web-scale, archival-quality web crawler project.
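
To show the basic idea behind a spider, here is a minimal sketch of a breadth-first crawl using only the Python standard library. It is not Heritrix and does not use its API; the `crawl` function, the `LinkParser` class, and the in-memory `PAGES` dictionary standing in for the web are all assumptions made so the example runs offline.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkParser(HTMLParser):
    """Collect the href value of every <a> tag in a page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def crawl(start_url, fetch, max_pages=10):
    """Breadth-first crawl; `fetch` maps a URL to HTML text, or None if unavailable."""
    seen = [start_url]
    queue = deque([start_url])
    visited = []
    while queue and len(visited) < max_pages:
        url = queue.popleft()
        html = fetch(url)
        if html is None:
            continue  # skip pages that cannot be retrieved
        visited.append(url)
        parser = LinkParser()
        parser.feed(html)
        for href in parser.links:
            target = urljoin(url, href)  # resolve relative links
            if target not in seen:
                seen.append(target)
                queue.append(target)
    return visited


# In-memory stand-in for the web (invented pages) so the sketch runs offline.
PAGES = {
    "http://example.org/": '<a href="/a">A</a> <a href="/b">B</a>',
    "http://example.org/a": '<a href="/b">B</a>',
    "http://example.org/b": '<a href="/">home</a> <a href="/missing">x</a>',
}

print(crawl("http://example.org/", PAGES.get))
```

A production crawler adds much that is left out here, such as politeness delays, robots.txt handling, and archival storage of what it fetches.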
Tools: Getty Vocabularies
Next Week: Analysing
Please take a look at:
Reading Ship’s Logs
Exploring 50000 Images from the WWW
Comparing Web Archives by Using Large Numbers of Images
Thank You
shawn.day@ucc.ie @iridium
