Prescottbigdata

Presentation on big data and the AHRC Digital Transformations theme for the AHRC Big Data workshop, London, 25 June 2013

  1. 1. Professor Andrew Prescott, Theme Leader Fellow
        AHRC Digital Transformations Strategic Theme
        Big Data: Some Initial Reflections
  2. 2. • The Met Office currently generates about 20 TB of data each day
        • ‘The problems which confront the meteorologist today will be faced by the humanities scholar within ten years’
  3. 3. • Large Hadron Collider: 600 million ‘collision events’ per second
        • One million jobs run by servers each day, with over 10 GB of data per second transferred at peak times
        • Approx. 20 petabytes of data produced annually
        • Over 70 universities involved in processing the data
  4. 4. http://www.flickr.com/photos/ibm_research_zurich/6777192080/in/set-72157629212636619
  5. 5. Whole brain imaging of neurone activity in a zebra fish, made using light sheet microscopy by Misha Ahrens and neuroscientists at the Howard Hughes Medical Institute. Each image comprises over 1 terabyte of data. Link: http://www.youtube.com/watch?feature=player_embedded&v=KE9mVEimQVU
  6. 6. • Some working definitions of big data
        • Big data exceeds the capacity of existing desktop machines and networks: you need help to deal with it
        • Data that is so large that existing methods of analysis simply don’t work: you have to change your methodology (probably to something quantitative)
        • Gartner definition: “Big data” is high-volume, -velocity and -variety information assets that demand cost-effective, innovative forms of information processing for enhanced insight and decision making.
  7. 7. Examples of everyday big data of research value
        • Retail data generated by supermarkets
        • Online retail data: Amazon
        • Transport information: Oyster card
        • Hospital data
        • Data from utility companies
        • Social media
  8. 8. Visualisation of languages used in tweets in London in Summer 2012: Centre for Advanced Spatial Analysis, UCL: http://mappinglondon.co.uk/2012/londons-twitter-tongues/
  9. 9. Wolfram Alpha analytics of my Facebook friends
  10. 10. Analytic of my friend network
  11. 11. Does Big Data Yet Exist for the Humanities?
  12. 12. Letter of Gladstone to Disraeli, 1878: British Library, Add. MS. 44457, f. 166. The political and literary papers of Gladstone preserved in the British Library comprise 762 volumes containing approx. 160,000 documents.
  13. 13. George W. Bush Presidential Library:
          • 200 million e-mails
          • 4 million photographs
  14. 14. A Thousand Words: Advanced Visualisation in the Humanities, Texas Advanced Computing Center. Link: http://www.youtube.com/watch?v=kvOuJ2RwBTA
  15. 15. ‘Big data’ has already been an issue for linguists for many years
  16. 16. Another familiar example of big data in the humanities: censuses
  17. 17. Moving images and sound present some of the most challenging big data issues for arts and humanities
  18. 18. Archives and library catalogues as big data: Visible Archive browser: visiblearchive.blogspot.com
  19. 19. Visualisation by Jon Orwant of Google of Library of Congress subject categorisations of books published between 1600 and 2010: winedarksea.org
  20. 20. Commons Explorer: experimental interface to allow exploration of large quantities of images in Flickr Commons: http://mtchl.net/cex/
  21. 21. The Anglo-American Legal Tradition: web site holding seven million images of medieval legal records in the National Archives: www.aalt.law.uh.edu
  22. 22. Fabio Lattanzi Antinori, The Obelisk (2012): Open Data Institute: http://www.theodi.org/culture/obelisk-2012
  23. 23. Asia Trend Map: predicting popularity of games, manga and anime: www.asiatrendmap.jp
  24. 24. Some Big Data Issues
          • Research has historically been hypothesis-driven; is a more data-driven research required?
          • How valid are predictive and probabilistic techniques in arts and humanities research?
          • Data quality issues: do we lose a sense of the context and stratigraphy of the data?
          • Danger of thinking that data=truth
  25. 25. Digital Transformations theme and Big Data
          • Theme seeks to promote new research methods: using digital tools and materials to develop a completely new type of scholarship
          • Additional funding of £4m has been allocated to work on big data
          • Following this workshop, call for big data projects will be issued
          • Smaller projects (up to £100k)
          • Larger projects (up to £600k)
