DataShare for the UCs

6 February 2014
  
  1. DataShare for the UCs
     6 February 2014
  2. Where we’re going
     Background; Demo of UCSF DataShare; Technical details; Other details; Future plans; Q&A
     From Flickr by Leo Hidalgo
  3. Goal: Catalyze widespread research data sharing
     How: Develop a system that lowers data sharing barriers and builds an engaged user community
  4. Survey of users by Angela Rizk-Jackson (n = 114)
     Has your research group provided public access to data? Yes / No
     Why? Journal required / Funder required / Other
     How? Repository / Website / Other
  5. Repository choices…
  6. Repository choices…
     Repositories for data:
     •  Discipline-specific
     •  General content
     •  Institutional
     •  Non-institutional
     •  Publishers/for-profits
     •  Short-term projects
  7. Repository choices…
     Which is more important? Depends
     Which should a researcher use? Both
     Institutional
     •  All data associated with a paper
     •  Tells a story
     •  Clearinghouse for researcher’s works
     Discipline-specific
     •  Some of data for a given paper
     •  Discoverable
     •  Integrated systems
     •  Collection policies
  8. Institutional
     •  All data associated with a paper
     •  Tells a story
     •  Clearinghouse for researcher’s works
  9. IR’s are SO 2002.
     From Flickr by Colin ZHU, johnsons531, Ludie Cochrane, and Kapil Karekar
  10. Last year…
     … “Federal agencies investing in research and development (more than $100 million in annual expenditures) must have clear and coordinated policies for increasing public access to research products.”
  11. IR
     From Flickr by wiccked
  12. But…
     •  Not always self-service
     •  Sometimes complicated
     •  Data?
     •  “Old” user interfaces
     From Flickr by jackcheng
  13. Simplify data deposit for UC researchers
     •  Simple metadata
     •  Self-service upload and download
     •  Branded for campus
     Most Important: Institutional Control Over Data
  14. Background; Demo of UCSF DataShare; Technical details; Other details; Future plans; Q&A
     From Flickr by Leo Hidalgo
  15. Background; Demo of UCSF DataShare; Technical details; Other details; Future plans; Q&A
     From Flickr by Leo Hidalgo
  16. Technical goals
     •  Easy submission
     •  Persistent citation
     •  Preservation assurance
     •  Effective discovery
     •  Control over terms of use
     •  All the benefits of a centrally hosted service, while maintaining campus branding and identity
     From www.dimensionsinfo.com; From Flickr by Eric Peacock
  17. System components
     •  Easy submission: UCSF drag-n-drop client
     •  Persistent citation
     •  Preservation assurance
     •  Effective discovery
     •  Control over terms of use: Data use agreements (DUAs)
     •  All the benefits of a centrally hosted service, while maintaining campus branding and identity: DNS, Apache, CSS, and campus Shibboleth IdPs (datashare.berkeley.edu, datashare.ucdavis.edu, datashare.uci.edu, datashare.ucla.edu, …)
  18. Deposit interactions (diagram)
     Actors and components: Researcher (data producer); DataShare portal (datashare.campus.edu); Campus IdP (Shib); Drag-n-drop client; Merritt; SDSC cloud; Discovery (XTF); EZID; DataCite; Primo; Data Citation Index; Data use agreement; CSS; Atom
     Interactions: Authenticate with campus credentials; Assemble dataset; Add metadata; Submit to Merritt; Preservation storage; Populate XTF index; Request DOI; Register metadata; Assign DOI; Harvest for A&I discovery
  19. Download interactions (diagram)
     Actors and components: Researcher (data consumer); DataShare portal (datashare.campus.edu); Campus IdP; Drag-n-drop client; Merritt; SDSC cloud; Discovery (XTF); EZID; DataCite; Primo; Data Citation Index; Data use agreement; CSS
     Interactions: Faceted search / browse; Accept DUA terms; Retrieve data; Download data
     Note: Synchronous for small datasets; asynchronous for large (> 500 MB)
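The size rule noted on slide 19 can be sketched in a few lines. This is only an illustration, not DataShare's actual code: the function name `delivery_mode` is hypothetical, and reading "500 MB" as 500 × 10^6 bytes is an assumption the slide does not settle.

```python
# Hypothetical sketch of the slide's size rule; not DataShare's implementation.
ASYNC_THRESHOLD_BYTES = 500 * 10**6  # "> 500 MB" from the slide; decimal MB is an assumption


def delivery_mode(dataset_size_bytes: int) -> str:
    """Return "synchronous" for small datasets and "asynchronous" for large ones."""
    if dataset_size_bytes < 0:
        raise ValueError("dataset size cannot be negative")
    # Only sizes strictly above the threshold count as large, matching "> 500 MB".
    if dataset_size_bytes > ASYNC_THRESHOLD_BYTES:
        return "asynchronous"
    return "synchronous"


if __name__ == "__main__":
    for size in (20 * 10**6, 500 * 10**6, 2 * 10**9):
        print(size, delivery_mode(size))
```

A dataset of exactly 500 MB stays on the synchronous path here because the slide says "greater than"; a real service might choose the boundary differently.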
  20. Background; Demo of UCSF DataShare; Technical details; Other details; Future plans; Q&A
     From Flickr by Leo Hidalgo
  21. Roles
     Campus Library
     •  Delivers service to community
     •  Shapes user interface, URL, branding
     •  Customizes key components
     •  Develops help, training
     UC3 / CDL
     •  Guides the campus
     •  Preserves content in Merritt
     •  Connects to EZID
     •  Deploys XTF for discovery
     •  Works with vendors
     SDSC
     •  Maintains production storage infrastructure
     •  Holds three independent copies of content
  22. Branding & Customization
     •  Logo
     •  URL
     •  Contact information
     •  Other…?
     From Flickr by Diorama Sky
  23. Cost
     •  EZID accounts
        –  Existing campus memberships provide unlimited DOIs
     •  Merritt recharge proposal (awaiting UCOP approval)
        –  Pay-as-you-go $0.40/GB/year
        –  Paid-up (for 10 years) $2.93/GB
        –  Threshold pricing: 100, 200, 500 GBs; 1, 2, 5, 10, 20, 50, 100 TBs
     From Flickr by Maura Teague
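To see how the two proposed Merritt rates on slide 23 compare, the arithmetic can be written out as below. This is a rough sketch under stated assumptions: the stored volume stays constant, no discounting is applied, and the function names are hypothetical. Only the two rates and the 10-year term come from the slide.

```python
from decimal import Decimal

# Rates from the slide's recharge proposal (which was still awaiting approval).
PAY_AS_YOU_GO_PER_GB_YEAR = Decimal("0.40")
PAID_UP_PER_GB = Decimal("2.93")  # one-time charge covering 10 years


def pay_as_you_go_cost(gigabytes: int, years: int) -> Decimal:
    """Total pay-as-you-go cost, assuming the stored volume never changes."""
    return PAY_AS_YOU_GO_PER_GB_YEAR * gigabytes * years


def paid_up_cost(gigabytes: int) -> Decimal:
    """One-time paid-up cost for the 10-year term."""
    return PAID_UP_PER_GB * gigabytes


def break_even_years() -> Decimal:
    """Years of pay-as-you-go after which the paid-up rate becomes cheaper."""
    return PAID_UP_PER_GB / PAY_AS_YOU_GO_PER_GB_YEAR


if __name__ == "__main__":
    print("10 years pay-as-you-go, 1 GB:", pay_as_you_go_cost(1, 10))
    print("Paid-up, 1 GB:", paid_up_cost(1))
    print("Break-even (years):", break_even_years())
```

Under these assumptions, ten years of pay-as-you-go costs $4.00 per GB against $2.93 paid up, so the paid-up option is cheaper for data kept longer than roughly seven and a third years.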
  24. Cost
     Anticipated cost of providing all campus ladder-track faculty with 5 GBs for 10 years

     Campus          Faculty   Threshold   Paid-up cost
     Berkeley          1,260       10 TB        $29,300
     Davis             1,240       10 TB        $29,300
     Irvine            1,051       10 TB        $29,300
     Los Angeles       1,701       10 TB        $29,300
     Merced              159        1 TB         $2,930
     Riverside           561        5 TB        $14,650
     San Diego         1,109       10 TB        $29,300
     San Francisco       366        2 TB         $5,860
     Santa Barbara       746        5 TB        $14,650
     Santa Cruz          485        5 TB        $14,650

     Source: http://legacy-its.ucop.edu/uwnews/stat/headcount_fte/oct2013/welcome.html
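The table on slide 24 appears to follow from three inputs: faculty headcount, 5 GB per person, and the threshold tiers and paid-up rate from slide 23. The sketch below reproduces it under two assumptions inferred from the figures rather than stated on the slides: each campus buys the smallest tier that covers its need, and 1 TB is counted as 1,000 GB (which is what makes 10 TB × $2.93/GB equal $29,300). Function names are hypothetical.

```python
from decimal import Decimal

GB_PER_FACULTY = 5                 # from the slide
PAID_UP_PER_GB = Decimal("2.93")   # paid-up rate for 10 years, from slide 23
# Threshold tiers from slide 23, in GB; 1 TB = 1,000 GB is an assumption.
THRESHOLDS_GB = [100, 200, 500, 1000, 2000, 5000, 10000, 20000, 50000, 100000]

FACULTY = {
    "Berkeley": 1260, "Davis": 1240, "Irvine": 1051, "Los Angeles": 1701,
    "Merced": 159, "Riverside": 561, "San Diego": 1109,
    "San Francisco": 366, "Santa Barbara": 746, "Santa Cruz": 485,
}


def threshold_gb(required_gb: int) -> int:
    """Smallest pricing tier that covers the required storage (assumed rule)."""
    for tier in THRESHOLDS_GB:
        if tier >= required_gb:
            return tier
    raise ValueError("requirement exceeds the largest listed tier")


def paid_up_cost(faculty_count: int) -> Decimal:
    """Paid-up cost of giving every faculty member GB_PER_FACULTY of storage."""
    return threshold_gb(faculty_count * GB_PER_FACULTY) * PAID_UP_PER_GB


if __name__ == "__main__":
    for campus, count in FACULTY.items():
        tier = threshold_gb(count * GB_PER_FACULTY)
        print(f"{campus:<14} {count:>5} {tier:>7} GB  ${paid_up_cost(count):,.0f}")
```

For example, Riverside's 561 faculty need 2,805 GB, which rounds up to the 5 TB tier and gives the $14,650 shown in the table.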
  25. Governance & Agreements
     Goal: Simplify & Scale Data Use & Deposit Agreements
  26. Governance & Agreements (diagram)
     Parties: Data User; CDL; UC Campus; Data Depositor
     Agreements between them: ODL or similar; Terms of service
  27. Background; Demo of UCSF DataShare; Technical details; Other details; Next steps & future plans; Q&A
     From Flickr by Leo Hidalgo
  28. Who Decides?
     •  CDL to work with each campus to implement & shape service
     •  Campus-to-campus interaction
     •  Group meetings as needed
     •  SAG1 check-ins
     •  Communication (…)
  29. This is a group project
     From Flickr by Mischievous One
  30. Two heads are better than one!
     From Flickr by Alice Bartlett
  31. Future plans
     •  eScholarship connection
     •  ORCID
     •  Altmetrics
     •  Solr/Blacklight for discovery
     •  Expand metadata options
     •  Embargoes
     •  Restricted access for peer review
     •  Annotations
     •  Export to citation managers
     •  Staging area
     •  Private storage
     •  Mapping metadata/GIS support
     From Flickr by Emil Nordén
  32. Communication
     Google Groups Web Forum
  33. Communication
     UC3 confluence site
     confluence.ucop.edu/display/Curation/DataShare+for+UCs
  34. Communication
     •  Listserv?
     •  Twitter @DataShareOrg
     •  …?
     From Flickr by gsagos/nho
  35. Communication
     github.com/CDLUC3/datashare
  36. DASH: Helping Community Repositories (slide stamped “To Be Revised”)
     What Makes DASH Unique:
     •  Modern, intuitive user interface for superior user experience
     •  Freely available code for download and use by anyone
     •  User-friendly API(s) to ensure interoperability with existing repositories (e.g., SWORD for deposit; Atom, OAI-PMH, ResourceSync for populating the discovery index).
     •  Customizable interfaces that can be altered easily to reflect service provider branding
     •  Authentication via institutional Identity Management Systems
  37. Next Steps – Next 2 Weeks
     •  details to be established
        –  who’s interested
        –  tech contact for interested campuses
        –  communication lines
     From Flickr by Themactep
  38. Next Steps – Next 2 Months
     •  get DataShare up and running
        –  Shibboleth configuration & other authentication
        –  Domains/URLs established
        –  Customizations – logos etc.
     From Flickr by Themactep
  39. Next Steps – Longer term
     •  in-person meeting?
     •  CDL camp?
     •  communication/outreach?
     From Flickr by Themactep
  40. Acknowledgements
     •  Stephen Abrams
     •  Trisha Cruse
     •  Carly Strasser
     •  Perry Willett
     •  Geoffrey Boushey
     •  Julia Kochi
     •  Megan Laurence
     •  Anirvan Chatterjee
     •  Angela Rizk-Jackson
     •  Maninder Kahlon