NISO Webinar:  Part 2: Managing Data for Scholarly Communications

The explosion of data creation across all scholarly disciplines necessitates corresponding efforts to create new solutions for its management and use. Ever-growing repositories and the datasets within them require organization, identification, description, publication, discovery, citation, preservation, and curation to allow these materials to realize their potential in support of data-driven, often interdisciplinary research. What infrastructures and technical environments are required for this work? Can new approaches, specifications, standards and best practices be created? Are there partnerships and collaborations that exist or can be pursued? This webinar, Part 2 of a two-part NISO series on data, will explore these and other questions.

Transcript

  • 1. http://www.niso.org/news/events/2011/nisowebinars/semanticweb/ Managing Data for Scholarly Communications. PART 2: Technical Management. October 19, 2011. Speakers: Joan Starr, Mark McFarland, and MacKenzie Smith
  • 2. Dataset Identification & Citation: DataCite and EZID. Joan Starr, California Digital Library. October 2011
  • 3. Dataset Identification & Citation. Introduction. The Researchers' Challenge: Identifiers are a tool for researchers. DataCite: "Helping you find, access and reuse data." EZID: Easy creation and management of DataCite DOIs and other identifiers. Next steps: For DataCite, EZID and you!
  • 4. California Digital Library (CDL)
  • 5. The Researchers' Challenge
  • 6. Early in the research life cycle. Data-intensive research + Writing up the results. "Where's the data?" "What if I move it?" PERSISTENT IDENTIFIERS make the difference. Photo by Dave Rogers, http://www.flickr.com/photos/dave-rogers/2815036285/
  • 7. Working on a federated team. Data-intensive research + Regional research center + Aging infrastructure. "Where's the data?" "We have to move it!" PERSISTENT IDENTIFIERS make the difference. © All rights reserved by University of California, http://www.flickr.com/photos/universityofcalifornia/5405812887
  • 8. Making a career move. Data-intensive research + Researcher(s) on the move. "I know where my data is and I'm taking it with me!" PERSISTENT IDENTIFIERS make the difference. © All rights reserved by University of California, http://www.flickr.com/photos/universityofcalifornia/5406308654
  • 9. Meeting funder requirements. Data-intensive research + Grantor requirements for data management plan. "What do we put here?" "How do we track the data?" PERSISTENT IDENTIFIERS make the difference. Photo by David Mellis, http://www.flickr.com/photos/mellis/7675610/
  • 10. DataCite: German National Library of Economics (ZBW); Canada Institute for Scientific and Technical Information (CISTI); German National Library of Science and Technology (TIB); German National Library of Medicine (ZB MED); Technical Information Center of Denmark; GESIS - Leibniz Institute for the Social Sciences, Germany; Institute for Scientific & Technical Information (INIST-CNRS), France; Australian National Data Service (ANDS); ETH Zurich, Switzerland; TU Delft Library, The Netherlands; The Swedish National Data Service (SNDS); The British Library, UK; California Digital Library (CDL), USA; Office of Scientific & Technical Information (OSTI), USA; Purdue University Library
  • 11. DataCite Metadata V. 2.2 • Small required set = citation elements • Optional descriptive set: – extendable lists – can refer to other standards, schemes – domain-neutral – rich ability to describe relationships to other digital objects • Metadata Search (MDS) is full-text indexed
  • 12. DataCite Metadata V. 2.2. Required properties: 1. Identifier (with type attribute) 2. Creator (with name identifier attributes) 3. Title (with optional type attribute) 4. Publisher 5. PublicationYear. Optional properties: 6. Subject (with schema attribute) 7. Contributor (with type & name identifier attributes) 8. Date (with type attribute) 9. Language 10. ResourceType (with description attribute) 11. AlternateIdentifier (with type attribute) 12. RelatedIdentifier (with type & relation type attributes) 13. Size 14. Format 15. Version 16. Rights 17. Description (with type attribute)
  • 13. • Get identifiers • Add location • Add metadata • Update location • Update metadata
  • 14. http://n2t.net/ezid
  • 15. http://n2t.net/ezid
  • 16. http://n2t.net/ezid
  • 17. http://n2t.net/ezid
  • 18. http://n2t.net/ezid
  • 19. http://n2t.net/ezid
  • 20. http://n2t.net/ezid
  • 21. What this means…
  • 22. What this means…
  • 23. Next Steps. DataCite: • Dublin Core application profile • Content Service • Metadata v. 2.3. EZID: • UI redesign • Automated link checking • Exposure for citations. Photo by Nicola Whitaker, http://www.flickr.com/photos/nicolawhitaker/111009156/
  • 24. Next Steps for you • Get more information, and • Try EZID for yourself! Photo by Nicola Whitaker, http://www.flickr.com/photos/nicolawhitaker/111009156/
  • 25. For more information. EZID: EZID application: http://n2t.net/ezid/; EZID website: http://www.cdlib.org/services/uc3/ezid/; UC3 website: http://www.cdlib.org/services/uc3/. DataCite: DataCite Home: http://datacite.org/; DataCite Metadata Schema: http://schema.datacite.org/meta/kernel-2.2/index.html; DataCite Metadata Search: http://search.datacite.org. Contact Joan Starr at uc3@ucop.edu
  • 26. Questions? Photo by Horia Varlan, http://www.flickr.com/photos/horiavarlan/4273168957/in/photostream/
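
The ideas on slides 6 to 13 (a persistent identifier that keeps resolving after the data moves, backed by a small set of required citation metadata) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `IdentifierRegistry` is a hypothetical in-memory stand-in and is not the EZID or DataCite API, the DOI and URLs are made-up placeholders, and the only thing taken from the slides is the list of five required property names and the five operations on slide 13.

```python
from dataclasses import dataclass, field

# The five required DataCite Metadata v2.2 properties, as listed on slide 12.
REQUIRED_PROPERTIES = ("Identifier", "Creator", "Title", "Publisher", "PublicationYear")


def missing_required(metadata: dict) -> list:
    """Return the required property names that are absent or empty."""
    return [name for name in REQUIRED_PROPERTIES if not metadata.get(name)]


@dataclass
class IdentifierRegistry:
    """Hypothetical in-memory identifier service (an assumption, not the EZID API)."""
    locations: dict = field(default_factory=dict)
    metadata: dict = field(default_factory=dict)

    def create(self, identifier: str, location: str, metadata: dict) -> None:
        """Get an identifier, add its location, add its metadata."""
        missing = missing_required(metadata)
        if missing:
            raise ValueError(f"missing required properties: {missing}")
        self.locations[identifier] = location
        self.metadata[identifier] = dict(metadata)

    def update_location(self, identifier: str, new_location: str) -> None:
        """Point an existing identifier at a new location."""
        if identifier not in self.locations:
            raise KeyError(identifier)
        self.locations[identifier] = new_location

    def update_metadata(self, identifier: str, **changes) -> None:
        """Change metadata while keeping the required set complete."""
        if identifier not in self.metadata:
            raise KeyError(identifier)
        updated = {**self.metadata[identifier], **changes}
        missing = missing_required(updated)
        if missing:
            raise ValueError(f"missing required properties: {missing}")
        self.metadata[identifier] = updated

    def resolve(self, identifier: str) -> str:
        """Return the current location for an identifier."""
        return self.locations[identifier]


# Usage with placeholder values: the identifier stays fixed while the data moves.
registry = IdentifierRegistry()
registry.create(
    "doi:10.9999/EXAMPLE",                      # made-up identifier
    "https://old-server.example.org/data.csv",  # made-up location
    {
        "Identifier": "doi:10.9999/EXAMPLE",
        "Creator": "Example Researcher",
        "Title": "Example dataset",
        "Publisher": "Example University",
        "PublicationYear": "2011",
    },
)
registry.update_location("doi:10.9999/EXAMPLE", "https://new-server.example.org/data.csv")
print(registry.resolve("doi:10.9999/EXAMPLE"))
```

A citation that uses the identifier rather than the URL keeps working after `update_location`, which is the point the "Where's the data?" slides make.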
  • 27. Digital Library Services in the Cloud. Mark McFarland, Director, Texas Digital Library
  • 28. Outline • Who: Texas Digital Library • Where: on the cloud • Why: motivations • When: late 2010 • What: lessons learned. June 2011
  • 29. Who: Texas Digital Library • Consortium of higher education institutions in Texas • Current services include: – Institution: IR (DSpace), ETD system – Faculty: OJS, OCS, blogs, wikis – Approximately 70 customer-facing service instances • Legacy hardware included – Compute servers – Storage servers – Network support devices
  • 30. Where: on the cloud • Migrated customer-facing services to AWS – 50 AWS VM instances • Maintained some services on local hardware • Simplified and consolidated system architecture
  • 31. Why: motivations / When: late 2010 • Disaster recovery plan – Prepare for data center move • Elastic capacity – New members, collections • Personnel savings – Fewer competencies, responsibilities • Began Oct 2010
  • 32. What: lessons learned • The Good – Elastic capacity; customers did not notice change – No hardware purchase cycle • The Mixed – Lower personnel costs; failover • The Unexpected – Development tools; concerns about AWS being in U.S.; excellent management console
  • 33. Future • Preservation – DuraCloud • Continue to evaluate – AWS is flexible and feature rich, but may still not be cost effective
  • 34. For more information about the TDL, please visit the Texas Digital Library website at http://www.tdl.org or contact us at info@tdl.org.
  • 35. Data Governance and Legal Interoperability. MacKenzie Smith, Science Fellow. © Creative Commons, 2011. This work is licensed under a Creative Commons Attribution 3.0 United States License.
  • 36. Why Data Sharing is Good • research reproducibility • fiscal responsibility • broadest possible impact • large-scale data interoperability – Includes technical, social, legal and policy aspects – usual focus on technical/social – focus here on legal/policy aspects
  • 37. Why Data Sharing is Hard • No incentives to improve data quality, provide missing documentation • Confidentiality and privacy concerns (e.g. HIPAA, endangered species) • Patents and commercial potential • Closed Access to journal articles (i.e. results) • IP issues very complicated
  • 38. Definitions. Data governance is the system of decision rights and accountabilities that describe who can take what actions with what data, and when, under what circumstances, using what methods • strategies for data quality control and management, and processes that ensure important data assets are formally managed throughout an organization; – organizations can be legal entities like universities, or virtual organizations (e.g. distributed research collaborations) – Includes business processes and risk management; • laws and policies associated with data; • ensures that data can be trusted and that people are accountable for actions affecting the data
  • 39. Definitions • Attribution is legally-imposed, remedy is lawsuit • Credit is what researchers want • Citation is the norm in scholarly communication, to provide supporting evidence, now proxy for credit. Attribution does not ensure credit or citation.
  • 40. Legal Mechanisms for Sharing Data: 1. licenses and 2. contracts (require attribution); 3. waivers (no attribution requirement)
  • 41. Copyright for Data • Does not apply to facts, e.g., most scientific data • Can apply to a collection of facts, but only to original aspects, not facts themselves • Can extract facts from a copyrighted database without infringing
  • 42. Licenses • Licenses are not contracts – depend on underlying rights, e.g. copyright or sui generis rights – Copyright is a bundle of rights, automatic when fixed, limited in scope and duration • US and EU differ (EU has sui generis data rights) so different licenses cover copyright, sui generis rights, or both
  • 43. Licenses • Creative Commons (CC-BY) example – applies to data and databases to the extent they're copyrightable – Only data uses that implicate copyright trigger attribution requirement – uses of data that do not implicate copyright, e.g. data in the public domain, do not trigger attribution
  • 44. Licenses • Hard to assess copyright for particular data and databases • Hard to know when license applies, which creates risks: – data provider may be misled – data user will under- or over-comply
  • 45. Licenses • Attribution requirements are inflexible, causing absurd situations – e.g. providing attribution to 1,000 providers in 1,000 different ways – known as 'attribution stacking' • Could provide attribution and still not satisfy norms or expectations
  • 46. Contracts
  • 47. Contracts • Do not require underlying right – rely on offer/acceptance, click through, terms of use – require formalities, e.g. attribution • Downsides – confusing obligations, no standardization, each user agreement can have different requirements • Researchers may avoid data if they can't understand the terms of use
  • 48. Contracts. Unlike licenses, contracts only bind parties • If someone obtains licensed data and shares it, anyone who obtains data from that user is still bound by the license • If data had been shared by contract, anyone obtaining data from the second party is not bound by the contract since they aren't a party to the contract • In this respect, contracts are more limited than licenses
  • 49. Contracts • Have broader reach than licenses – not tied to a legal right – can take away rights of public
  • 50. Example
  • 51. Waivers • Provide legal certainty – No need to decipher copyright protection or sift through confusing legalese – Better than silence, to avoid forcing people to guess what their risks are • Mean loss of control – Can't require attribution or other terms • Avoid problems and rely on scholarly norms – no attribution stacking or inappropriate obligations
  • 52. 3 levels: Waiver, Fall-back license, Non-assertion pledge
  • 53. Summary • Law is messy, each approach has consequences • Licenses – (1) legal uncertainty about scope, (2) requirements can be inconsistent with norms • Contracts – (1) burdensome requirements with custom terms, (2) exceed scope of rights with requirements that take away normal rights • Waivers – (1) avoid problems, but (2) lose control and rely on norms
  • 54. Summary • Each approach requires loss of control • No mechanism imposes legally-binding obligations in a way that perfectly maps to scholarly credit, e.g. citation • Ideal solution creates the least friction to scientific progress while giving credit where due, i.e., waivers and norms (the community governs itself)
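
The contrast drawn on slides 42, 48 and 51 (who ends up bound by the terms as data passes from hand to hand) can be restated as a toy model. This is a rough sketch of the slides' claims only, not a statement of law: the function name `bound_recipients`, the recipient names, and the `has_underlying_right` flag are all illustrative assumptions.

```python
def bound_recipients(recipients, mechanism, has_underlying_right=True):
    """Return which recipients in a sharing chain are bound by the provider's terms.

    `recipients` is ordered: the first obtained the data directly from the
    provider, the second obtained it from the first, and so on.
    """
    if mechanism == "license":
        # Slide 42: a license depends on an underlying right such as copyright.
        # Slide 48: where it applies, downstream recipients are still bound.
        return list(recipients) if has_underlying_right else []
    if mechanism == "contract":
        # Slide 48: a contract binds only its parties, here the direct recipient.
        return list(recipients[:1])
    if mechanism == "waiver":
        # Slide 51: a waiver cannot require attribution or other terms.
        return []
    raise ValueError(f"unknown mechanism: {mechanism!r}")


chain = ["lab_a", "lab_b", "lab_c"]  # hypothetical recipients
for mechanism in ("license", "contract", "waiver"):
    print(mechanism, bound_recipients(chain, mechanism))
```

Passing `has_underlying_right=False` with `"license"` models the uncertainty noted on slide 44: if the data turns out not to be copyrightable, the license terms bind no one in this model.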