Digital Research
Dr Aquiles Alencar-Brayner
Digital Curator
@AquilesBrayner
http://britishlibrary.typepad.co.uk/digital-scholarship/
www.bl.uk 2
Digital Scholarship at the British Library
“The production, use and
integration of digital content,
services and tools to facilitate
scholarship and research. It
allows research areas to be
investigated in new ways,
using new tools, leading to
new discoveries and analysis
to generate new
understanding”
-Adam Farquhar
Head of Digital Scholarship
Created in 2010, the department
works to enable:
• production of digital content
• sharing and integration of
digital content
• wider collaboration and
contribution around digital content
• complex analysis & facilitation of
new discoveries
More than resource discovery…
• Libraries and archives have spent the
last two decades creating digital assets
and harvesting born-digital objects.
• We can now do much more than use
technology to discover these digital
objects: we can embrace the
opportunities afforded by an intellectual
turn toward digitally driven research.
• So digital research is about:
– New tools
– New discoveries
– New understanding
“The emergence of the new
digital humanities [and
social sciences] isn’t an
isolated academic
phenomenon. The
institutional and disciplinary
changes are part of a larger
cultural shift, inside and
outside the academy, a
rapid cycle of emergence
and convergence in
technology and culture”
Steven E. Jones, The Emergence of
the Digital Humanities (2013)
Digital Libraries: 10 “in” rules
1. Integrity: access to the digital
object as it was created
2. Integration: different contents
and file formats available from a
single platform
3. Interoperability: different
programmes and operating
systems compatible with each
other
4. Instant access: unrestricted
access to material, especially
from mobile devices
5. Interaction: catalogues that
provide Web 2.0 features (blogs,
wikis, tags, content sharing, etc.)
6. Information: comprehensive
metadata for fast and reliable
retrieval of content
7. Ingest of content: constant
upload of new digital content
8. Interpretation: digital content placed in
relation to other items in the collection
9. Innovation: material to be presented in
innovative ways
10. Indefinite access: digital objects to be
preserved for posterity
Scalability: how to filter, find and analyse
the information I need?
• How much data is generated in
ONE day?
1. Twitter: 7 TB
2. Facebook: 10 TB
• By 2020 we will have
approximately 35 ZB (about 35 trillion
GB) of data available
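As a quick check on the units, here is a small sketch that converts between data volumes. It assumes decimal (SI) prefixes, where 1 GB = 10^9 bytes and 1 ZB = 10^21 bytes; the function name `convert` is illustrative only.

```python
# Decimal (SI) byte prefixes. This is an assumption: binary prefixes
# (GiB, ZiB) would give slightly different figures.
BYTES_PER_UNIT = {"GB": 10**9, "TB": 10**12, "ZB": 10**21}


def convert(value, from_unit, to_unit):
    """Convert a data volume between decimal byte units."""
    return value * BYTES_PER_UNIT[from_unit] / BYTES_PER_UNIT[to_unit]


zb_in_gb = convert(35, "ZB", "GB")
print(f"35 ZB = {zb_in_gb:.1e} GB")  # 35 ZB = 3.5e+13 GB
```

Under these assumptions, one zettabyte is a trillion gigabytes, so 35 ZB corresponds to roughly 35 trillion GB.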
Analysis of digital content
• Ngram Viewer applied to
Web Archive collections
• Visualisation: Tag Cloud
• BL Georeferencer
Personal Digital Archive (PDA)
• Extracting and archiving digital content
from personal devices
• Assist with capture, management,
description, and preservation of
personal digital collections to facilitate
access and content analysis
• Data analysis beyond documents
BL Labs
• BL Labs
• British Library Mechanical
Curator
• Digital Music Labs
• Off the Map
“Literary scholars and historians have in the past been limited in their
analyses of print culture by the constraints of physical archives and human
capacity. A lone scholar cannot read, much less make sense
of, millions of newspaper pages. With the aid of computational
linguistics tools and digitized corpora, however, we are working toward a
large-scale, systemic understanding of how texts were valued and
transmitted during this period”
David A. Smith, Ryan Cordell, and Elizabeth Maddock Dillon, ‘Infectious Texts:
Modeling Text Reuse in Nineteenth-Century Newspapers’ (2013)
http://www.ccs.neu.edu/home/dasmith/infect-bighum-2013.pdf
Projects: some examples
• Corpus del Español
• French Oral Narrative Corpus
• Spatial History Project: Machado
de Assis
• Transcriptorium
Mapping Metaphor Project: University of Glasgow
Web-Based Tools: some examples
• Wordle: a tool for generating “word clouds” from text
that you provide. The clouds give greater prominence
to words that appear more frequently in the source
text.
• Google Trends: look at search trends in
Google. Browse by date, or look at top searches in
different categories to see how they trended over time
and location.
• Google Public Data Explorer: search
through databases from around the world, including
the World Bank, OECD, Eurostat and the U.S. Census
Bureau.
• Google Ngram Viewer: search keywords in
millions of books over the span of half a millennium, a
useful tool for finding trends over time. Ngram Viewer
also has advanced options, such as searching for
particular keywords as specific parts of speech or
combining keywords.
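The counting that underlies word clouds and n-gram viewers can be sketched in a few lines. This is a minimal illustration, assuming a naive tokenisation on word characters; the function names and the sample text are hypothetical and are not taken from Wordle or Google Ngram Viewer.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into runs of word characters."""
    return re.findall(r"\w+", text.lower())


def ngram_counts(text, n=1):
    """Count every run of n consecutive tokens in the text."""
    tokens = tokenize(text)
    # Zip the token list against shifted copies of itself to form n-grams.
    grams = zip(*(tokens[i:] for i in range(n)))
    return Counter(" ".join(gram) for gram in grams)


sample = "New tools, new discoveries, new understanding: new tools for research."
print(ngram_counts(sample, 1).most_common(2))  # [('new', 4), ('tools', 2)]
print(ngram_counts(sample, 2).most_common(1))  # [('new tools', 2)]
```

A word cloud would then scale each word's size by its count, while an n-gram viewer would typically plot such counts per year as a share of all n-grams in that year's texts.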
[Figure: Ngram Viewer charts for the terms “discipline”, “camp” and “camps”, and “sentence”]
Music Ngram Viewer
Peachnote
http://www.peachnote.com
Created by Vladimir Viro
Ngram - Music
DiRT: Digital Research Tools
• The DiRT Directory is a registry of
digital research tools for scholarly
use. DiRT makes it easy for digital
humanists and others conducting
digital research to find and
compare resources ranging from
content management systems to
music OCR, and from statistical
analysis packages to mind-mapping
software.
• http://dirtdirectory.org/
New Tools, New Discoveries
• Crowd as a source
– UK Sound Map
• Open Access Software for
Research:
• http://sourceforge.net/
Task time
During your break, find a flip-chart and consider one
of the following questions:
– What analytical tool(s) would you like to use/develop for your
research?
– What are the ethical considerations when using digital data?
– Should all Humanities research be published openly?
– How might computational methods change the nature of
collaboration in the Humanities?
Thank you!
@AquilesBrayner (aquiles.alencarbrayner@bl.uk)