Big Data: Some Initial Reflections

Slides from presentation to AHRC internal staff seminar, April 2014

  1. Professor Andrew Prescott, Theme Leader Fellow, AHRC Digital Transformations Strategic Theme. Big Data: Some Initial Reflections
  2. Meteorology as a forerunner
     • The Met Office currently generates about 20TB of data each day
     • ‘The problems which confront the meteorologist today will be faced by the humanities scholar within ten years’
  3. Large Hadron Collider
     • 600 million ‘collision events’ per second
     • One million jobs run by servers each day, with over 10 GB of data per second transferred at peak times
     • Approx. 20 petabytes of data produced annually
     • Over 70 universities involved in processing the data
  4. http://www.flickr.com/photos/ibm_research_zurich/6777192080/in/set-72157629212636619
  5. Whole brain imaging of neurone activity in a zebrafish, made using light sheet microscopy by Misha Ahrens and neuroscientists at the Howard Hughes Medical Institute. Each image comprises over 1 terabyte of data. Link: http://www.youtube.com/watch?feature=player_embedded&v=KE9mVEimQVU
  6. Some working definitions of big data
     • Big data exceeds the capacity of existing desktop machines and networks: you need help to deal with it
     • Data that is so large that existing methods of analysis simply don’t work: you have to change your methodology (probably to something quantitative)
     • Gartner definition: “Big data” is high-volume, -velocity and -variety information assets that demand cost-effective, innovative forms of information processing for enhanced insight and decision making.
  7. Examples of everyday big data of research value
     • Retail data generated by supermarkets
     • Online retail data: Amazon
     • Transport information: Oyster card
     • Hospital data
     • Data from utility companies
     • Social media
  8. Visualisation of languages used in tweets in London in Summer 2012: Centre for Advanced Spatial Analysis, UCL: http://mappinglondon.co.uk/2012/londons-twitter-tongues/
  9. Wolfram Alpha analytics of my Facebook friends
  10. Analytics of my friend network
  11. Does Big Data Yet Exist for the Humanities?
  12. Letter of Gladstone to Disraeli, 1878: British Library, Add. MS. 44457, f. 166. The political and literary papers of Gladstone preserved in the British Library comprise 762 volumes containing approx. 160,000 documents.
  13. George W. Bush Presidential Library: 200 million e-mails; 4 million photographs
  14. A Thousand Words: Advanced Visualisation in the Humanities, Texas Advanced Computing Center. Link: http://www.youtube.com/watch?v=kvOuJ2RwBTA
  15. ‘Big data’ has already been an issue for linguists for many years
  16. Another familiar example of big data in the humanities: censuses
  17. Moving images and sound present some of the most challenging big data issues for arts and humanities
  18. Archives and library catalogues as big data: Visible Archive browser: visiblearchive.blogspot.com
  19. Visualisation by Jon Orwant of Google of Library of Congress subject categorisations of books published between 1600 and 2010: winedarksea.org
  20. Commons Explorer: experimental interface to allow exploration of large quantities of images in Flickr Commons: http://mtchl.net/cex/
  21. The Anglo-American Legal Tradition: web site holding seven million images of medieval legal records in the National Archives: www.aalt.law.uh.edu
  22. Fabio Lattanzi Antinori, The Obelisk (2012): Open Data Institute: http://www.theodi.org/culture/obelisk-2012
  23. Asia Trend Map: predicting popularity of games, manga and anime: www.asiatrendmap.jp
  24. Some Big Data Issues
     • Research has historically been hypothesis-driven; is more data-driven research required?
     • How valid are predictive and probabilistic techniques in arts and humanities research?
     • Data quality issues: do we lose a sense of the context and stratigraphy of the data?
     • Danger of thinking that data = truth
  25. Digital Transformations theme and Big Data
     • Theme seeks to promote new research methods: using digital tools and materials to develop a completely new type of scholarship
     • Additional funding of £4m has been allocated to work on big data
     • Following this workshop, a call for big data projects will be issued
     • Smaller projects (up to £100k)
     • Larger projects (up to £600k)