e-Research and the Data Librarian

Published in: Technology
  • Slide images: financial or company-level data; sea ice concentration model; underlying sea ice data from the Hadley Centre; aerial photograph of Durham; SPSS micro data file; molecular structure of Vitamin B2; photon cross-section data for vanadium; country-level consumer price index data; spatial data (postcode sector attribute data with corresponding boundary data)
  • e-Research and the Data Librarian

    1. 1. e-Research and the Data Librarian
        • Stuart Macdonald, Edinburgh University Data Library / EDINA National Data Centre
        • Luis Martinez, London School of Economics Data Library
    2. 2.
        • What are data?
        • Where do you get it from?
        • Data support services
        • Developments in data storage, dissemination and analysis
        • e-Research definition and examples
        • DISC-UK DataShare
    3. 3. What are Data?
        • Some definitions:
            • a collection of observations or other information related to a particular question, problem, experiment or place
            • information, most commonly in the form of a series of binary digits, stored on a physical storage medium for manipulation by a computer program
            • information in numerical form that can be digitally transmitted or processed
            • a representation of facts, concepts, or instructions in a formalized manner suitable for communication, interpretation, or processing by humans or by automated means
    4. 4. Data Types
        • Social Sciences: micro data; aggregated data; geospatial data; financial data; qualitative data; in addition to commercial or private data (bank transactions, Tesco customer purchase records, government administrative records, CCTV footage)
        • ‘Hard Science’: astronomical and meteorological observations; climate modelling; crystallography; gene sequence data; clinical and epidemiological records; mass spectrometry; satellite or archaeological images and aerial photography; polar orbit tracking data; chemical, structural and mechanical engineering data; remote sensing, ...
        • Associated concerns:
            • ethics (confidentiality/disclosure)
            • scale (time/storage)
            • proprietary formats
            • copyright and legal issues
            • long-term preservation
    5. 6. ‘increase the democratisation of knowledge’
        • "More data will be created in the next five years than has been collected in the whole of human history. Properly managed, this data will form a major resource for Australian researchers."
            • Department of Education, Science and Training (2007), "Backing Australia's Ability - An Ongoing Commitment", URL: http://backingaus.innovation.gov.au/info_booklet/on_commit.htm
        • "Researchers, government institutions, non-profit organizations, schools, commercial organizations, and individual citizens all need the widest possible access to data from all sources to explore, experiment, test, create new knowledge and new products, and, ultimately, to increase understanding of our world."
            • Harlan Onsrud and James Campbell, Department of Spatial Information Science and Engineering, University of Maine [2006], “Big Opportunities in Access to ‘Small Science’ Data”
    6. 7. Research Council-funded Data Centres
        • EDINA, MIMAS (JISC/ESRC)
        • UK Data Archive, ESDS (JISC/ESRC)
        • Arts and Humanities Data Service (AHRC/JISC)
        • NGDC - National Geoscience Data Centre (NERC)
        • BADC - British Atmospheric Data Centre (NERC)
        • AEDC - Antarctic Environmental Data Centre (NERC)
        • NEODC - NERC Earth Observation Data Centre (NERC)
        • BODC - British Oceanographic Data Centre (NERC)
        • NEBC - NERC Environmental Bioinformatics Centre (NERC)
        • UK Cluster Data Centre (Particle Physics and Astronomy Research Council)
        • UK Stem Cell Bank (MRC)
        • UK DNA Banking Network (MRC)
        • Brain Tissue Bank (MRC)
        • UKIDC - UK Infrared Space Observatory Data Centre (STFC)
        • UKSSDC - UK Solar System Data Centre (STFC)
        • Chemical Database Service (STFC)
    7. 8. Other Sources
        • National Statistical Agencies:
            • Office for National Statistics (ONS) - http://www.statistics.gov.uk/
            • General Register Office for Scotland (GROS) - http://www.gro-scotland.gov.uk/
            • Northern Ireland Statistics and Research Agency (NISRA) - http://www.nisra.gov.uk/
            • Statistics for Wales - http://new.wales.gov.uk/topics/statistics/
            • Eurostat - http://epp.eurostat.ec.europa.eu/portal/
        • Free Resources:
            • Non-Governmental Organisations
            • Government websites (national/local)
            • Independent Research Organisations
            • Charitable Organisations
            • Media Organisations
        • Data Discovery Tools:
            • Intute - http://www.intute.ac.uk/
            • Go-Geo! - http://www.gogeo.ac.uk/
    8. 9. Data Support Services
        • Institutions provide support for data services in different ways:
            • Data Libraries
            • University Libraries
            • Computing Centres
            • Research Offices
            • Academic Departments
        • Data Libraries go beyond local support of national data centres & statistical agencies:
            • Act as a ‘repository’ of data
            • Reference service
            • Train users to access and handle data resources
    9. 10. UK Data Libraries
        • Edinburgh University Data Library – first such service in the UK, 1983
        • Oxford University Data Library – 1988
        • London School of Economics Data Library – 1997
        • RLab Data Service – 1999, providing support to LSE’s research laboratory
        • Other institutions with ‘Social Statistics’ libraries:
            • University of Southampton
            • Strathclyde University
        • DISC-UK (Data Information Specialist Committee – UK)
            • Foster understanding between data users and providers
            • Raise awareness of the value of data support in Universities
            • Share information and resources among local data support staff
            • URL: http://www.disc-uk.org/
    10. 11. Web 2.0 – lateral thinking in a linear world?
        • Blogs and wikis – WordPress, Blogger
        • Social Bookmarking – del.icio.us
        • Media-sharing services – YouTube, Flickr, Scribd
        • Social networking systems – MySpace, Elgg
        • Collaborative editing tools – Google Docs and Spreadsheets, Gliffy
        • Syndication technologies – RSS
        • Mashups:
            • Numeric Data:
                • Swivel - http://swivel.com/
                • Many Eyes - http://services.alphaworks.ibm.com/manyeyes/home
                • Data360 - http://www.data360.co.uk/
            • Spatial Data:
                • BackOfMyHand – http://www.backofmyhand.com
                • Map Builder – http://www.mapbuilder.net
                • Maptrot – http://www.maptrot.com
                • Click2Map – http://www.click2map.com
                • Blockrocker – http://www.blockrocker.com
    11. 12. Institutional Repositories
        • UK Repository Projects:
            • StORe – Source-to-Output Repositories
            • GRADE – Geospatial Repository for Academic Deposit and Extraction
            • R4L – Repository for the Laboratory
            • SPECTRa – Submission, Preservation & Exposure of Chemistry Teaching & Research data
            • CLADDIER – Citation, Location And Deposition in Discipline & Institutional Repositories
        • Issues for further development:
            • Interoperability: Dublin Core, OAI versus domain-specific XML schemas
            • Embedding: repository seen as part of the organisational workflow
            • Redefining repository: as a suite of methodological and technological processes that facilitate the research lifecycle
            • Web 2.0 tools for collaboration: across and within department / institution / discipline
            • Clarity on data citation & persistent identifiers
            • Data rights: open access v restricted access v user-defined access
        • Domain-Specific Repositories:
            • ArXiv.org – physics, maths, computer science
            • Blue Obelisk Data Repository – chemoinformatics
            • PubMedCentral – biomedical and life sciences
    12. 13.
        • e-Science, e-Social Science, e-Research and cyberinfrastructure
        • “E-Research extends e-Science’s remit to all sciences referring to the use of distributed resources across multiple domains to do science or further research with the following key features: collaborative, multidisciplinary, use of GRID technologies and vast amounts of data” (CURL Workshop, 2005)
    13. 14. Examples
        • GRIDPP
            • Large Hadron Collider
            • GRID Prototype to analyze data
        • AstroGRID
            • UK contribution to Virtual Observatory
    14. 15. Examples
        • CQeSS
            • Develop and support quantitative e-Social Science
        • MiMeG
            • Tools and techniques to analyse audio-visual qualitative data
    15. 16. Seamless Access to Multiple Datasets (SAMD)
        • MIMAS as major contributor
        • ESRC and DTI funded
        • Solving a problem of the UK academic Social Science community
    16. 20. DISC-UK DATASHARE PROJECT
        • JISC Repository and Preservation Programme
        • March 2007 to March 2009
        • DISC-UK members
            • EDINA (lead)
            • University of Edinburgh
            • London School of Economics
            • University of Oxford
            • University of Southampton
        • Purpose
            • “provide exemplars for a range of approaches and policies in which to embed the deposit and stewardship of datasets in institutional repositories”
    17. 21. DATASHARE Motivation
        • Growing presence of IRs
        • StORe Social Science Report
            • 70% of survey respondents producing quantitative questionnaire data
            • Vast majority of researchers not depositing data
    18. 22. Deliverables
        • Enhancements to partners’ IRs
        • Exemplars of the process of setting up an institutional data repository service
        • Documentation and open source code for adapting repository software for handling datasets
        • Technical watch on e-Research, VREs and Web 2.0 developments
        • Papers, presentations and online dissemination of collected knowledge
    19. 23. Issues
        • Management: storage, curation, policies
        • Legal: access rights, confidentiality and creating public use files
        • Technical: standards to describe, transport and communicate
        • Cultural and political: do people want to share data? Central vs. distributed. Self-archiving vs. assisted deposit
    20. 24. Thank you
