Will We Command Our Data? From the Petascale to the Personal

Notes on slide 1

http://en.wikipedia.org/wiki/File:Postduif.jpg (public domain)

http://www.flickr.com/photos/rakerman/2907065239/

Will We Command Our Data? From the Petascale to the Personal - Presentation Transcript

  1. Richard Akerman
    NRC-CISTI
    Presented at Access 2009, Oct. 1, 2009
    Will We Command Our Data? From the Petascale to the Personal
  2. Overview
    Definitions / Assumptions
    How Big is Data?
    Four Sources of Data
    Drivers
    Activities
  3. Definitions / Assumptions
    Petabyte = 1000 Terabytes
    data = datasets
    “data is”
  4. How Big is Data?
    http://www.instructables.com/file/FA9N61CF54HJ6GG/
  5. How Big is Data?
    http://www.flickr.com/photos/doctorow/2731870631/
  6. How Big is Data?
    http://en.wikipedia.org/wiki/File:Postduif.jpg
  7. Four Sources of Data
    Research data
    Government data
    Library data
    Personal data
  8. General Drivers
    Since 2000, a convergence of factors:
    Value of sharing
    Ease of sharing
    Level of sharing (machine level)
  9. Specific Drivers: Research Data
    OECD Principles and Guidelines for Access to Research Data from Public Funding (April 2007)
    The Toronto Statement on prepublication data sharing (September 2009)
  10. OECD Principles
    “Open access to research data from public funding should be easy, timely, user-friendly and preferably Internet-based.”
    http://www.flickr.com/photos/ben-zvan-photography/468487548/
  11. Specific Drivers: Open Government Data
    US Memorandum on Transparency and Open Government (January 2009)
    US Memorandum on the Freedom of Information Act (January 2009)
  12. Specific Drivers: Open Government Data
    UK Power of Information Task Force Report (March 2009)
    Modernise data publishing and reuse http://poit.cabinetoffice.gov.uk/poit/category/data-final/
    “public information held by for example the police, health bodies and local authorities is often not available. This is bad for democratic expression, the economy and citizen customers.”
    Data.gov (May 2009)
    UK PM Brown meets with Sir Tim Berners-Lee (Sept. 2009)
  13. Specific Drivers: Library Data
    ILS Customer Bill-of-Rights, John Blyberg (November 2005)
    “Berkeley Accord” (March 2008)
  14. Specific Drivers: Personal Data
    Wired cover feature “Living by numbers” (July 2009)
    “Know Thyself: Tracking Every Facet of Life, from Sleep to Mood to Pain, 24/7/365”
    “Numbers are making their way into the smallest crevices of our lives. We have pedometers in the soles of our shoes and phones that can post our location as we move around town. We can tweet what we eat into a database and subscribe to Web services that track our finances. There are sites and programs for monitoring mood, pain, blood sugar, blood pressure, heart rate, … and prayers.”
  15. Why Libraries
    Advocates
    Exemplars
    Experts
  16. Research Data: DataCite
    http://www.datacite.org/
    “DOIs for data”
    “The long term vision of the partnership is to support researchers by providing methods for them to locate, identify, and cite research datasets with confidence.”
  17. Research Data: Gateway to Data Sets
    NRC-CISTI, Gateway to (Canadian) Scientific Data Sets
    http://cisti-icist.nrc-cnrc.gc.ca/eng/services/cisti/scientific-data/data-sets/
    e.g. Canadian Astronomy Data Centre (CADC), Large Synoptic Survey Telescope (LSST)
  18. Government Data: Canada - Federal
    http://geogratis.cgdi.gc.ca/
    StatsCan Data Liberation Initiative (DLI)
    Ontario Data Documentation, Extraction Service and Infrastructure Initiative (ODESI)
    “The project will target Statistics Canada datasets... The files will be marked-up using DDI, an international, XML-based metadata tagging system which allows data resource discovery, distributed access, extraction and analysis.”
  19. Government Data: Municipal - Vancouver
    http://data.vancouver.ca/
  20. Government Data: Municipal - SF
    San Francisco http://datasf.org/
  21. Library Data
    A million free covers from LibraryThing
    Open Library http://openlibrary.org/dev/docs/data
    Talis Connected Commons
    MESUR – Services
    http://id.loc.gov/ (LCSH)
  22. APIs vs raw data
    APIs
    Always serve up latest data
    Control over access
    Tracking/stats
    Advanced/complex functionality on top of the data
    Raw data
    Unconstrained / can do things never imagined by API
    Hard to track / version
    Can lose metadata
    Allows choice of computing
  23. Personal Data: Daytum
    http://www.daytum.com/
  24. Personal Data: Total Recall
    http://totalrecallbook.com/ (Sept. 2009)
  25. Richard Akerman
    © 2009 Government of Canada
    Licensed in the Creative Commons
    Thank You
    http://creativecommons.org/licenses/by-nc-sa/2.5/ca/
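
Slide 3 defines a petabyte as 1000 terabytes, which is the decimal (SI) convention. As a rough sketch of what that scale means in practice, the following converts petabytes to bytes and estimates how long moving one petabyte over a network would take. The function names and the 1 Gbit/s link speed are illustrative assumptions, not figures from the talk.

```python
# Decimal (SI) units, matching the slide's "Petabyte = 1000 Terabytes".
TB_PER_PB = 1000
BYTES_PER_TB = 10**12


def petabytes_to_bytes(pb: float) -> float:
    """Convert petabytes to bytes using decimal units."""
    return pb * TB_PER_PB * BYTES_PER_TB


def transfer_days(pb: float, link_bits_per_second: float) -> float:
    """Days needed to move `pb` petabytes over a link of the given speed,
    ignoring protocol overhead (an idealised assumption)."""
    seconds = petabytes_to_bytes(pb) * 8 / link_bits_per_second
    return seconds / 86_400


if __name__ == "__main__":
    # Hypothetical 1 Gbit/s link, chosen only for illustration.
    print(f"1 PB = {petabytes_to_bytes(1):.0e} bytes")
    print(f"1 PB over 1 Gbit/s: about {transfer_days(1, 1e9):.0f} days")
```

Under these assumptions a single petabyte takes roughly 93 days to transfer, which may be one reading of why the "How Big is Data?" slides show physical media.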
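
The trade-offs listed on slide 22 (APIs vs raw data) can be sketched with a toy example. Everything here is hypothetical: the `CatalogueAPI` class, the `raw_dump` function, the API key, and the records are invented for illustration and do not correspond to any service named in the talk.

```python
import copy
from collections import Counter

# Hypothetical catalogue records, invented purely for illustration.
_RECORDS = [{"id": 1, "title": "Dataset A"}]


class CatalogueAPI:
    """Toy API: serves the latest data, gates access, and counts requests."""

    def __init__(self, records, allowed_keys):
        self._records = records          # live reference, so reads are always current
        self._allowed_keys = set(allowed_keys)
        self.request_counts = Counter()  # tracking/stats per API key

    def list_titles(self, api_key):
        if api_key not in self._allowed_keys:
            raise PermissionError("unknown API key")
        self.request_counts[api_key] += 1
        return [r["title"] for r in self._records]


def raw_dump(records):
    """Toy raw-data release: a point-in-time copy with no tracking or access control."""
    return copy.deepcopy(records)


api = CatalogueAPI(_RECORDS, allowed_keys={"demo-key"})
snapshot = raw_dump(_RECORDS)

# The publisher adds a record after the dump was taken.
_RECORDS.append({"id": 2, "title": "Dataset B"})

api_titles = api.list_titles("demo-key")          # sees both records
snapshot_titles = [r["title"] for r in snapshot]  # still only the first record

# The snapshot holder can compute anything, including things the API never offered.
longest_title = max(snapshot, key=lambda r: len(r["title"]))["title"]
```

In this sketch the API caller always sees the current records and every call is counted, while the raw snapshot goes stale but can be processed in ways the API never anticipated and on whatever computing the holder chooses.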
