brian.hole@ubiquitypress.com
www.ubiquitypress.com / @ubiquitypress

Brian Hole
The Now and Future of Data Publishing, Oxford, 22 May 2013

Data availability policies and licensing
  
The Social Contract of Science

• Validation
• Dissemination
• Further development

Scientific Malpractice

• Publishers
• Researchers
• Libraries, repositories…
  
We need fair use copyright exceptions

• To allow use of material for research: private study, criticism and review. Academics, public, private sector.
• To allow mining of both text and data, by academics and private sector.
• To allow material to be freely used in teaching and exams.
• Copyright exceptions are currently not harmonized across the world, so researchers have to deal with a different set of exceptions in each country.
• The Hargreaves report recommended this for the UK but it is not yet in law.
  
Text and data mining

[the benefits of text mining include]: “increased researcher efficiency; unlocking hidden information and developing new knowledge; exploring new horizons; improved research and evidence base; and improving the search process and quality. Broader economic and societal benefits include cost savings and productivity gains, innovative new service development, new business models and new medical treatments.”
JISC

“The downstream value of high quality, high throughput chemical information extracted from the literature can be measured against conventional abstraction services… with a combined annual turnover of perhaps $500-1,000 million dollars. We believe our tools are capable of building the next and better generation of services.”
Peter Murray-Rust
  
“Licences for Europe”
Working Group 4: Text and Data Mining

• Focus is to create new licenses to enable TDM.
• I.e. researcher would need one license from each publisher. Much TDM work involves hundreds of publishers, can take weeks just for one.
• Focus pre-determined from start: to come up with proposals on licenses only. Discussion of exceptions allowed but not to be part of recommendations.
• Unbalanced setup: large corporate publishers, technology sector poorly represented.
• Where we are now: civil society walk-out. Not prepared to endorse licenses as acceptable. Workshop tba Q4 2013.
• Tell your publisher or association that this is important to you.
  
Links

@ubiquitypress
brian.hole@ubiquitypress.com
http://www.ubiquitypress.com

Letter on Licenses for Europe concerns:
http://www.coadec.com/more-licences-for-europe
  
