Bethan Ruddock @bethanar
MmIT CloudBusting, April 2013
You have valuable data that you want to share…




              … but it’s locked inside your systems


                       Image used under CC licence from http://www.flickr.com/photos/kdashy/2678539087/
Push discovery, help researchers find more stuff,
promote use of your collections, enable cool
things like data mining and visualisations,
combine with other data sets, make the most of
your time and effort, store data in sustainable
formats, allow others to enhance your
data, embed in other sites and catalogues,
enable global access, back-up valuable
data, benchmark & collaborate
Image used under a CC licence from
http://www.flickr.com/photos/brenda-starr/4498078166/
Store & protect




                         Combine multiple data
                                 sets
                  Images used under CC licence from http://www.flickr.com/photos/austin2179/2327574713/
                                              and http://www.flickr.com/photos/streamishmc/3069595776
Store & protect
                  Some cloud services are
                  designed just for storage
                                  Or for storage and client-
                                          side collaboration




                   Images used under CC licence from http://www.flickr.com/photos/austin2179/2327574713/
                                                    and http://www.flickr.com/photos/lollyman/4424552903
For others, storage won’t
be the main function

      They’ll be designed
     to ‘do stuff’ with the
         data: combine it
     with other datasets;
      make it available in
           new formats &
                interfaces
                                              Store & combine
                                              multiple datasets
                              Images used under CC licence from http://www.flickr.com/photos/austin2179/2327574713/
                                                          and http://www.flickr.com/photos/streamishmc/3069595776
These services will usually be defined by what they do
with the data.
 You might not even think of them as cloud services…
The format your data needs to be in will depend on
which cloud service you’re sending it to, and what
you want done with that data.


Storage & collaboration services:
           Native data format, or whatever you & your
                         collaborators need to work in
         Try to choose compact & sustainable formats
Combination services:
                    Whatever format the service requires
 This will usually be the same as the other data sets, or
                            transformable/interoperable
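
As a rough illustration of "transformable/interoperable", the sketch below maps two differently structured record sets onto one shared set of field names before combining them. The field mappings are simplified assumptions (MARC 245$a and 100$a for a library record, EAD unittitle and origination for an archive record), and the names LIBRARY_MAP, ARCHIVE_MAP, to_common and combine are hypothetical, not part of any real service.

```python
# Simplified, assumed mappings from source field names to shared field names.
LIBRARY_MAP = {"245a": "title", "100a": "creator"}               # MARC-style keys
ARCHIVE_MAP = {"unittitle": "title", "origination": "creator"}   # EAD-style keys


def to_common(record, mapping):
    """Rename the fields we know about; drop anything we cannot map."""
    return {mapping[key]: value for key, value in record.items() if key in mapping}


def combine(*sources):
    """Merge several (records, mapping) pairs into one list of common records."""
    combined = []
    for records, mapping in sources:
        combined.extend(to_common(record, mapping) for record in records)
    return combined


# Placeholder records, invented purely for illustration.
library_records = [{"245a": "Example book", "100a": "Example author", "020a": "n/a"}]
archive_records = [{"unittitle": "Example papers", "origination": "Example family"}]

merged = combine((library_records, LIBRARY_MAP), (archive_records, ARCHIVE_MAP))
```

Anything without a mapping (the "020a" field here) is silently dropped, which is one reason to check what a combination service does with the fields it does not recognise.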
Check what formats you can export data in.
      CSV? SQL text format? HTML? Plain text?

Consider:
                           What format is most appropriate?
Archive catalogues – EAD?
Library catalogues – MARC? MODS XML?
Linked data – RDF? Dublin Core XML?
         Is it interoperable? Consistent? Transformable?
                 Will it enable the service to meet its aims?
                                          How is it licensed?
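
If CSV is the only export your system offers, a small script may be enough to turn it into something more interoperable. The following is a minimal sketch, assuming a CSV export whose column names already match Dublin Core element names (title, creator, date). The sample data, the "record" wrapper element and the csv_to_dc function are hypothetical; a real service will specify its own record structure.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Namespace of the Dublin Core Metadata Element Set, version 1.1.
DC_NS = "http://purl.org/dc/elements/1.1/"
ET.register_namespace("dc", DC_NS)

# Placeholder CSV export; the columns and values are assumptions.
SAMPLE_CSV = """title,creator,date
Example title one,Example author one,1998
Example title two,Example author two,2005
"""


def csv_to_dc(csv_text):
    """Return one simple Dublin Core XML string per CSV row."""
    records = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        record = ET.Element("record")  # hypothetical wrapper element
        for column, value in row.items():
            if value:  # skip empty cells rather than emit empty elements
                ET.SubElement(record, f"{{{DC_NS}}}{column}").text = value
        records.append(ET.tostring(record, encoding="unicode"))
    return records


dc_records = csv_to_dc(SAMPLE_CSV)
```

Each row becomes one record with elements such as dc:title and dc:creator. Columns that do not correspond to a Dublin Core element would need an explicit mapping first.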
Image used under CC licence from
http://www.flickr.com/photos/tonynewell/1463945828/
Are there barriers to sharing your data?
                         Licensing & data ownership
                                         Loss of control
                              Legal barriers to sharing
                    No time/resource to output data
What are the risks of sharing?
                               Lose access to service
                                  Data compromised
What are the risks of not sharing?
                             Data is isolated in a ‘silo’
           Don’t meet sharing/outreach objectives
                Only have single, local copy of data
When choosing a cloud service:

Why do you want to open your
data to the cloud?
      Does this service meet your
                          needs?
  Does it meet your users' needs?

Is your data in the right format?
     Or can you transform it to be?

Can you get your data out of the
cloud service?
       In an appropriate format?

Is the service interoperable?
Is the service sustainable?

Does the service put a certain licence on your data?
Who do they share it with?
                                                              Image used under a CC licence from
                                                      http://www.flickr.com/photos/kky/704056791/
Image used under CC licence from
http://www.flickr.com/photos/justanotherhuman/5795955558/
Pairs of resource descriptions, describing the same
resource using different schemas.
Does one of these schemas describe the resource better
than another?
What aspects of the resource have they described well?
Have they missed aspects, or described them badly?
How interoperable is each description? For humans? For
computers?
What would you need to do to this data to share it with
others?
What purpose do you think each schema would best be
used for?
bethan.ruddock@manchester.ac.uk @bethanar

                                                 Image used under CC licence from
                           http://www.flickr.com/photos/theilluminated/5386099858/

Opening up: bibliographic data sharing & interoperability
