JISC Cross-Programme Discovery Workshop
                                                                  April 2012


data data everywhere: discovery in education and research




discovery and open data                                  #discopen
Amber Thomas @ambrouk
Programme Manager, JISC Digital Infrastructure Team
workshop slides
see slide 2 for reuse terms (excluding ppt template)
Idea from Cameron Neylon

You are free to:
             copy, share, adapt or re-mix;
             photograph, film or broadcast;
             blog, live-blog or post video of
this presentation (excluding ppt template) provided that:
             You attribute the work to its author and respect the rights
             and licences associated with its components.
Off the record questions, comments should be flagged.
    Slide Concept by Cameron Neylon, who has waived all copyright and related or neighbouring rights. This slide only CCZero.
    Social Media Icons adapted with permission from originals by Christopher Ross. Original images are available under GPL at:
    http://www.thisismyurl.com/free-downloads/15-free-speech-bubble-icons-for-popular-websites


#discopen  Session objective: Explore possibilities offered by open data, address issues
           and highlight successful approaches
14:00      The data space: a bit of context (15 min)
           Introductions (max 45 min)
15:00      Break
15:15      Addressing the issues with open data. Facilitated discussion around:
           • what technological approach is being taken to opening the data (e.g. API,
             RSS, linked data)?
           • how are you tackling licensing and IPR issues?
           • what are some of the key lessons you're learning along the way?
           • anything you didn't anticipate?
           • are you aware of any other projects or initiatives which your work would
             overlap with?
16:15      Break
16:30      Distilling the discussion
17:30      Close




the data space
a bit of context
data ain’t nothing new
  open data, big data, linked data, data analytics,
  activity data, paradata, metadata

  Why is everyone suddenly talking about data?

  data data everywhere: fancy new words everywhere
data ain’t nothing new



Is this some kind of tipping point?

  deriving value, data-driven infrastructure, efficiency,
  transparency, data-driven decision making

  everyone is into data: even CEOs are talking about it
data ain’t nothing new




none of this is really new, right? we know this stuff

  legal issues, ethics, formats, standards, structures, policies

  people seem to be making basic mistakes!
if there was ever a time to apply
       good data practices, it is now
  the what: open data, big data, linked data, data analytics,
            activity data, paradata, metadata
  the why:  deriving value, data-driven infrastructure, efficiency,
            transparency, data-driven decision making
  the how:  legal issues, ethics, formats, standards, structures, policies
if there was ever a time to apply
     good data practices, it is now
           data ain’t nothing new
    BUT there seems to be a groundswell

      enough data that is good enough
      enough political will of many hues
   enough technologies for big data: noSQL
  enough ways of showing data patterns: viz

our job is often to apply known good practices,
      while the appetite and will are there
JISC-funded work in this space

• We are in education and research: teaching,
  research, libraries, IT, corporate systems, UK
  shared services ...
• We have been led into the data space from a
  wide range of specialisms
• Our data space is multi-disciplinary
• It’s not just about new practices but the
  application of good practices to new challenges
different perspectives

• Things it’s useful to remember when talking to
  other projects, especially across programmes:
  – Your #1 objective may be their afterthought (and
    vice versa)
  – All projects operate within constraints: before you
    assume they are ignorant of solutions, ask if they
    have considered the approach you used; you might
    learn something
  – There is no such thing as a stupid question
technologies and licensing

  technologies: platforms, APIs, SPARQL, JSON, RSS/Atom,
                OAI-PMH, Dublin Core, vocabs, triples
  licensing:    OGL, CC
open = ?


  free at the point of use, machine-readable, editable, collaborative,
  metadata-only, open content, CC BY, CC Zero
use cases and business cases




                     I find it is helpful to
                  distinguish between use
                  cases and business cases
situating your project in the data space:
  DATA: learning resources metadata, GLAM metadata,
        activity & usage data (“paradata”), research data,
        administrative data, bibliographic data, geo-data, etc!
  OPEN DATA

  variables: maturity, standards, drivers, benefits
  shared concerns: technologies, licensing issues,
                   skills and capacity, appetite and demand

  each project is unique
EXAMPLE: JLeRN




Programme: Open Educational
Resources Phase 3
November 2011-July 2012
£50,000


                              http://jlernexperiment.wordpress.com/
EXAMPLE: LOCAH




Programme: JISC Expo
Dates
£x



                          http://blogs.ukoln.ac.uk/locah/
EXAMPLE: Huddersfield Libraries




Programme: Activity Data
Dates
£x



                           http://library.hud.ac.uk/blogs/projects/lidp/
EXAMPLE: Cambridge Libraries




Programme: Discovery: Open
Bibliographic Data Strand
Dates
£x



                             http://data.lib.cam.ac.uk/datasets.php
EXAMPLE: YOU




Programme:
Date:
£




introductions




Discussion
Your Projects

• what technological approach is being taken to
  opening the data (e.g. API, RSS, linked data)?
• how are you tackling licensing and IPR issues?
• what are some of the key lessons you're
  learning along the way?
• anything you didn't anticipate?
• are you aware of any other projects or
  initiatives which your work would overlap
  with?
