DataShare for UC Campuses
Published in Technology, Education
  • 1. DataShare for the UCs
    6 February 2014
  • 2. Where we’re going
    Background
    Demo of UCSF DataShare
    Technical details
    Other details
    Future plans
    Q&A
    From Flickr by Leo Hidalgo
  • 3. Goal: Catalyze widespread research data sharing
    How: Develop a system that lowers data sharing barriers and builds an engaged user community
  • 4. Survey of users by Angela Rizk-Jackson (n = 114)
    Has your research group provided public access to data? Yes / No
    Why? Journal required / Funder required / Other
    How? Repository / Website / Other
  • 5. Repository choices…
  • 6. Repository choices…
    Repositories for data
    • Discipline-specific
    • General content
    • Institutional
    • Non-institutional
    • Publishers/for-profits
    • Short-term projects
  • 7. Repository choices…
    Which is more important? Depends
    Which should a researcher use? Both
    Institutional
    • All data associated with a paper
    • Tells a story
    • Clearinghouse for researcher’s works
    Discipline-specific
    • Some of the data for a given paper
    • Discoverable
    • Integrated systems
    • Collection policies
  • 8. Institutional
    • All data associated with a paper
    • Tells a story
    • Clearinghouse for researcher’s works
  • 9. IRs are SO 2002.
    From Flickr by Colin ZHU
    From Flickr by johnsons531
    From Flickr by Ludie Cochrane
    From Flickr by Kapil Karekar
  • 10. Last year…
    … “Federal agencies investing in research and development (more than $100 million in annual expenditures) must have clear and coordinated policies for increasing public access to research products.”
  • 11. IR
    From Flickr by wiccked
  • 12. But…
    Not always self-service
    Sometimes complicated
    Data?
    “Old” user interfaces
    From Flickr by jackcheng
  • 13. Simplify data deposit for UC researchers
    Simple metadata
    Self-service upload and download
    Branded for campus
    Most Important: Institutional Control Over Data
  • 14. Background
    Demo of UCSF DataShare
    Technical details
    Other details
    Future plans
    Q&A
    From Flickr by Leo Hidalgo
  • 15. Background
    Demo of UCSF DataShare
    Technical details
    Other details
    Future plans
    Q&A
    From Flickr by Leo Hidalgo
  • 16. Technical goals
    • Easy submission
    • Persistent citation
    • Preservation assurance
    • Effective discovery
    • Control over terms of use
    • All the benefits of a centrally hosted service, while maintaining campus branding and identity
    From Flickr by Eric Peacock
  • 17. System components
    • Easy submission: UCSF drag-n-drop client
    • Persistent citation
    • Preservation assurance
    • Effective discovery
    • Control over terms of use: Data use agreements (DUAs)
    • All the benefits of a centrally hosted service, while maintaining campus branding and identity: DNS, Apache, CSS, and campus Shibboleth IdPs
    …
  • 18. Deposit interactions (diagram)
    Components: Researcher (data producer); Campus IdP (Shib); DataShare portal (drag-n-drop client, CSS, data use agreement); Merritt; SDSC cloud; Discovery (XTF); EZID; DataCite; Primo; Data Citation Index
    Interactions:
    • Authenticate with campus credentials
    • Assemble dataset
    • Add metadata
    • Submit to Merritt
    • Preservation storage
    • Atom: Populate XTF index
    • Request DOI, Register metadata, Assign DOI
    • Harvest for A&I discovery
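The deposit interactions above can be read as an ordered pipeline. The following is only a minimal sketch under stated assumptions: the `Dataset` class, the `deposit` function, the step order, and the `assign_doi` callback are all hypothetical names introduced for illustration, and no real Merritt, EZID, or XTF service is contacted.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional


@dataclass
class Dataset:
    """Hypothetical in-memory stand-in for a dataset being deposited."""
    files: List[str]
    metadata: Dict[str, str]
    doi: Optional[str] = None
    log: List[str] = field(default_factory=list)


def deposit(dataset: Dataset,
            authenticated: bool,
            assign_doi: Callable[[Dict[str, str]], str]) -> Dataset:
    """Walk a dataset through the deposit steps named on the slide.

    `assign_doi` is a caller-supplied function standing in for the
    DOI request; the order of the steps is an assumption.
    """
    if not authenticated:
        raise PermissionError("campus login required before deposit")
    if not dataset.files:
        raise ValueError("a dataset needs at least one file")
    dataset.log.append("assembled dataset and added metadata")
    dataset.log.append("submitted to Merritt (preservation storage)")
    dataset.doi = assign_doi(dataset.metadata)
    dataset.log.append("DOI assigned and metadata registered")
    dataset.log.append("XTF discovery index populated")
    return dataset
```

Passing the DOI step in as a callback keeps the sketch testable without network access; a real client would replace it with a call to the identifier service.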
  • 19. Download interactions (diagram)
    Components: Researcher (data consumer); Campus IdP; DataShare portal (drag-n-drop client, CSS, data use agreement); Merritt; SDSC cloud; Discovery (XTF); EZID; DataCite; Primo; Data Citation Index
    Interactions:
    • Faceted search / browse
    • Accept DUA terms
    • Download data
    • Retrieve data
    Synchronous for small datasets; asynchronous for large (> 500 MB)
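The size rule on the download slide can be sketched as a one-line decision. This is an illustration only: the function name `delivery_mode`, the use of decimal megabytes (1 MB = 10^6 bytes), and the treatment of exactly 500 MB as "small" are assumptions, not details confirmed by the slides.

```python
# Assumed cutoff: 500 MB in decimal megabytes (an assumption, see lead-in).
SYNC_LIMIT_BYTES = 500 * 10**6


def delivery_mode(size_bytes: int) -> str:
    """Return how a dataset of the given size would be delivered."""
    if size_bytes < 0:
        raise ValueError("size cannot be negative")
    # Strictly greater than the limit counts as "large".
    if size_bytes > SYNC_LIMIT_BYTES:
        return "asynchronous"
    return "synchronous"
```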
  • 20. Background
    Demo of UCSF DataShare
    Technical details
    Other details
    Future plans
    Q&A
    From Flickr by Leo Hidalgo
  • 21. Roles
    Campus Library
    • Delivers service to community
    • Shapes user interface, URL, branding
    • Customizes key components
    • Develops help, training
    UC3 / CDL
    • Guides the campus
    • Preserves content in Merritt
    • Connects to EZID
    • Deploys XTF for discovery
    • Works with vendors
    SDSC
    • Maintains production storage infrastructure
    • Holds three independent copies of content
  • 22. Branding & Customization
    • Logo
    • URL
    • Contact information
    • Other…?
    From Flickr by Diorama Sky
  • 23. Cost
    • EZID accounts
      – Existing campus memberships provide unlimited DOIs
    • Merritt recharge proposal (awaiting UCOP approval)
      – Pay-as-you-go $0.40/GB/year
      – Paid-up (for 10 years) $2.93/GB
      – Threshold pricing 100, 200, 500 GBs; 1, 2, 5, 10, 20, 50, 100 TBs
    From Flickr by Maura Teague
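The two proposed Merritt rates can be compared with a little arithmetic. The sketch below uses only the two per-GB prices from the slide; the function names are hypothetical, and threshold pricing is deliberately ignored here.

```python
PAY_AS_YOU_GO_PER_GB_YEAR = 0.40  # dollars, from the slide
PAID_UP_PER_GB = 2.93             # dollars for 10 years, from the slide


def pay_as_you_go_cost(gigabytes: float, years: float) -> float:
    """Total pay-as-you-go cost in dollars."""
    return gigabytes * years * PAY_AS_YOU_GO_PER_GB_YEAR


def paid_up_cost(gigabytes: float) -> float:
    """Total paid-up (10-year) cost in dollars."""
    return gigabytes * PAID_UP_PER_GB


def break_even_years() -> float:
    """Years of pay-as-you-go storage that cost the same as paid-up."""
    return PAID_UP_PER_GB / PAY_AS_YOU_GO_PER_GB_YEAR
```

At these rates the paid-up option costs the same as roughly 7.3 years of pay-as-you-go storage, so it is the cheaper choice for data kept for the full 10 years.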
  • 24. Cost
    Anticipated cost of providing all campus ladder-track faculty with 5 GBs for 10 years
    Campus          Faculty  Threshold  Paid-up cost
    Berkeley          1,260      10 TB       $29,300
    Davis             1,240      10 TB       $29,300
    Irvine            1,051      10 TB       $29,300
    Los Angeles       1,701      10 TB       $29,300
    Merced              159       1 TB        $2,930
    Riverside           561       5 TB       $14,650
    San Diego         1,109      10 TB       $29,300
    San Francisco       366       2 TB        $5,860
    Santa Barbara       746       5 TB       $14,650
    Santa Cruz          485       5 TB       $14,650
    Source: http://legacy-
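The per-campus figures are consistent with a simple rule: multiply the faculty count by 5 GB, round up to the next pricing threshold, and charge the paid-up rate on the whole threshold. The sketch below reproduces that arithmetic; the rounding rule and the convention 1 TB = 1,000 GB are inferred from the numbers rather than stated in the slides, and the function names are hypothetical.

```python
# Pricing thresholds in GB, assuming 1 TB = 1,000 GB (inferred, not stated).
THRESHOLDS_GB = [100, 200, 500, 1_000, 2_000, 5_000,
                 10_000, 20_000, 50_000, 100_000]
PAID_UP_CENTS_PER_GB = 293  # $2.93/GB, kept in cents for exact arithmetic
GB_PER_FACULTY = 5


def threshold_for(gb_needed: int) -> int:
    """Smallest pricing threshold (in GB) that covers the requested amount."""
    for threshold in THRESHOLDS_GB:
        if threshold >= gb_needed:
            return threshold
    raise ValueError("request exceeds the largest pricing threshold")


def campus_cost(faculty: int) -> tuple:
    """Return (threshold in GB, paid-up cost in whole dollars) for a campus."""
    threshold = threshold_for(faculty * GB_PER_FACULTY)
    return threshold, threshold * PAID_UP_CENTS_PER_GB // 100
```

For example, Merced's 159 faculty need 795 GB, which rounds up to the 1 TB threshold and a paid-up cost of $2,930.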
  • 25. Governance & Agreements
    Goal: Simplify & Scale Data Use & Deposit Agreements
  • 26. Governance & Agreements (diagram)
    Parties: Data User; CDL; UC Campus; Data Depositor
    Agreements: ODL or similar; Terms of service
  • 27. Background
    Demo of UCSF DataShare
    Technical details
    Other details
    Next steps & future plans
    Q&A
    From Flickr by Leo Hidalgo
  • 28. Who Decides?
    • CDL to work with each campus to implement & shape service
    • Campus-to-campus interaction
    • Group meetings as needed
    • SAG1 check-ins
    • Communication (…)
  • 29. This is a group project
    From Flickr by Mischievous One
  • 30. Two heads are better than one!
    From Flickr by Alice Bartlett
  • 31.
    • eScholarship connection
    • ORCID
    • Altmetrics
    • Solr/Blacklight for discovery
    • Expand metadata options
    • Embargoes
    • Restricted access for peer review
    • Annotations
    • Export to citation managers
    • Staging area
    • Private storage
    • Mapping metadata/GIS support
    From Flickr by Emil Nordén
  • 32. Communication
    Google Groups Web Forum
  • 33. Communication
    UC3 confluence site
  • 34. Communication
    • Listserv?
    • Twitter @DataShareOrg
    • …?
    From Flickr by gsagos/nho
  • 35. Communication
  • 36. DASH: Helping Community Repositories (To Be Revised)
    What Makes DASH Unique:
    • Modern, intuitive user interface for superior user experience
    • Freely available code for download and use by anyone
    • User-friendly API(s) to ensure interoperability with existing repositories (e.g., SWORD for deposit; Atom, OAI-PMH, ResourceSync for populating the discovery index).
    • Customizable interfaces that can be altered easily to reflect service provider branding
    • Authentication via institutional Identity Management Systems
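As an illustration of one of the harvesting protocols named above, an OAI-PMH harvester asks a repository for records with a `ListRecords` request. The sketch below only builds the request URL; the base URL is a placeholder, the function name is hypothetical, and nothing here is taken from the DASH code itself.

```python
from typing import Optional
from urllib.parse import urlencode


def list_records_url(base_url: str,
                     metadata_prefix: str = "oai_dc",
                     from_date: Optional[str] = None,
                     set_spec: Optional[str] = None) -> str:
    """Build an OAI-PMH ListRecords request URL.

    `verb`, `metadataPrefix`, `from`, and `set` are standard OAI-PMH
    request arguments; `base_url` is supplied by the caller.
    """
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if from_date:
        params["from"] = from_date  # harvest only records changed since this date
    if set_spec:
        params["set"] = set_spec    # restrict the harvest to one set
    return f"{base_url}?{urlencode(params)}"


# Placeholder endpoint, not a real DataShare or DASH address.
print(list_records_url("https://repository.example.edu/oai",
                       from_date="2014-02-06"))
```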
  • 37. Next Steps – Next 2 Weeks
    • details to be established
      – who’s interested
      – tech contact for interested campuses
      – communication lines
    From Flickr by Themactep
  • 38. Next Steps – Next 2 Months
    • get DataShare up and running
      – Shibboleth configuration & other authentication
      – Domains/URLs established
      – Customizations – logos etc.
    From Flickr by Themactep
  • 39. Next Steps – Longer term
    • in-person meeting?
    • CDL camp?
    • communication/outreach?
    From Flickr by Themactep
  • 40. Acknowledgements
    • Stephen Abrams
    • Trisha Cruse
    • Carly Strasser
    • Perry Willett
    • Geoffrey Boushey
    • Julia Kochi
    • Megan Laurence
    • Anirvan Chatterjee
    • Angela Rizk-Jackson
    • Maninder Kahlon