
Copyright Reform and Open Data

Copyright is one of the greatest barriers to Open Data. This presentation for InsideGovernment UK shows the struggle between those who want to reform copyright and those opposed to reform.

  1. 1. Copyright Reform and Open Data
     Peter Murray-Rust, contentmine.org
     Improving Public Service Delivery through Open Data, InsideGovernment, London, 2015-03-19
     • Copyright restricts innovation
     • Governments want to reform it to generate wealth
     • Content/text mining can generate wealth
     • However: massive concerted opposition
  2. 2. The Right to Read is the Right to Mine http://contentmine.org
  3. 3. Scientific and Medical publication (STM)[+]
     • World Citizens pay $400,000,000,000…
     • … for research in 1,500,000 articles …
     • … cost $300,000 each to create …
     • … $7000 each to “publish” [*] …
     • … $10,000,000,000 from academic libraries …
     • … to “publishers” who forbid access to 99.9% of citizens of the world …
     • 85% of medical research is wasted (not published, badly conceived, duplicated, …)
     [+] Figures probably ±50%
     [*] arXiv preprint server costs $7 USD per paper
  4. 4. http://chemicaltagger.ch.cam.ac.uk/ • Typical chemical synthesis. Ca. 10 million of these paragraphs per year; chemical information market >= 1 billion USD
  5. 5. Open Content Mining of FACTs Machines can interpret chemical reactions We have done 500,000 patents. There are > 3,000,000 reactions/year. Added value > 1B Eur.
  6. 6. Typical Clinical Trial Effect of drug over time. It takes ½ day to extract data by hand. contentmine.org can do it in 1 second, BUT will technically break copyright.
  7. 7. Prof. Ian Hargreaves (2011): “David Cameron's exam question”:
     “Could it be true that laws designed more than three centuries ago with the express purpose of creating economic incentives for innovation by protecting creators' rights are today obstructing innovation and economic growth?”
     “yes. We have found that the UK's intellectual property framework, especially with regard to copyright, is falling behind what is needed.”
     “Digital Opportunity” by Prof Ian Hargreaves - http://www.ipo.gov.uk/ipreview.htm. Licensed under CC BY 3.0 via Wikipedia - https://en.wikipedia.org/wiki/File:Digital_Opportunity.jpg#/media/File:Digital_Opportunity.jpg
  8. 8. http://www.jisc.ac.uk/reports/value-and-benefits-of-text-mining
     In order to be 'mined', text must be accessed, copied, analysed, annotated and related to existing information and understanding. Even if the user has access rights to the material, making annotated copies can be illegal under current copyright law without the permission of the copyright holder.
     There is a new conundrum: the market intervention of copyright – originally intended to protect creative producers – may be inhibiting new knowledge discovery and innovation.
     It is often unclear whether text mining is a permissible use.
  9. 9. Big Data and Analytics (ContentMining)
     • “creative use of these large data sets in the US health care sector could generate more than $300bn in value per annum” [MGI, McKinsey]
     • Gartner Inc. has identified 'Big Data' and 'Next-Generation Analytics' as two of the 'Top 10 Strategic Technologies' for 2012.
     • Given the volume of text generated by business, academic and social activities – in for example competitor reports, research publications or customer opinions on social networking sites – text mining is, however, highly important. [JISC]
     • There are some tasks that simply could not be achieved without using text mining. For example, a major pharmaceutical company used text mining tools to evaluate 50,000 patents in 18 months. This would have taken 50 person years to achieve manually, meaning that it would not even have been contemplated. [JISC]
  10. 10. Hargreaves’ Recommendations 2011[*]
      • Government should deliver copyright exceptions at national level to realise all the opportunities within the EU framework, including format shifting, parody, non-commercial research, and library archiving.
      • The UK should also promote at EU level an exception to support text and data analytics. The UK should give a lead at EU level to develop a further copyright exception designed to build into the EU framework adaptability to new technologies.
      [*] Now UK law (2014-06 and 2014-10)
  11. 11. 2014-10 this has now been overturned by a higher court
  12. 12. PUBLISHER TDM LICENCE INITIATIVES GENERALLY DO NOT HELP
      • Publishers have started offering their own TDM licences and policies.
      • Their licences often impose unfair (and in the case of the UK, unenforceable) constraints on researchers’ freedom to exploit TDM, e.g., requiring users to employ publisher’s API, putting unnecessary restrictions on how much can be copied, or how fast it can be copied.
      • Why “unenforceable”? Because, as noted earlier, UK law specifically states that any contract or licence term that prevents anyone from doing TDM in the manner prescribed in the new exception shall be deemed null and void.
      • Really need a test case on these attempted restrictions.
      • Springer and Royal Society offer generous TDM provisions.
      • So why are so many publishers offering restrictive licences in the UK? Maybe they hope licensees are ignorant of the strength of the new law, or the publishers in fact don’t know about it. So they are either deliberately misleading, or ignorant.
      Prof Charles Oppenheim and contentmine.org
  13. 13. The Public domain
      As revealed by the ‘public domain calculator’ established by Europeana, there is a staggering complexity in the determination of the different copyright term lengths in member states, some of them requiring knowledge about the circumstances of the author's death or about the situation of the author's heirs at the time of her death.
      Public domain material is frequently “re-copyrighted” (Copyfraud).
      Which English storybook will never be public domain[1]?
      [1] Peter Pan
  14. 14. The Right to Read is The Right To Mine PMR in 2012: http://blog.okfn.org/2012/06/01/the-right-to-read-is-the-right-to-mine/
  15. 15. The Hague Declaration[*]
      • Intellectual property was not designed to regulate the free flow of facts, data and ideas, nor should it.
      • Freedom to analyse and pursue intellectual curiosity without fear of monitoring or repercussions must not be eroded in the digital environment.
      • Ethics around the use of data and content mining will need to continue to evolve in response to changing technology.
      • Open access as “a comprehensive source of human knowledge and cultural heritage” must be pursued in order to increase the uptake of and equality of access to content mining technologies.
      • Innovation and commercial research based on the use of facts, data, and ideas should not be restricted by intellectual property law.
      [*] Drafted 2014-12, convened by LIBER (Association of European Research Libraries)
  16. 16. The Hague Declaration on Knowledge Discovery in the Digital Age 2015
      The potential benefits of Text and Data Mining are vast and include:
      • Addressing grand challenges such as climate change and global epidemics
      • Improving population health, wealth and development
      • Creating new jobs and employment
      • Exponentially increasing the speed and progress of science through new insights and greater efficiency of research
      • Increasing transparency of Governments and their actions
      • Fostering innovation and collaboration and boosting the impact of open science
      • Creating tools for education and research
      • Providing new and richer cultural insights
      • Speeding economic and social development in all parts of the globe
      LIBER – Association of European Research Libraries
  17. 17. Julia Reda, Pirate Party, MEP
  18. 18. Some of Reda’s 25 recommendations
      • introduction of a single European Copyright Title based on Article 118 TFEU that would apply directly and uniformly across the Union,
      • the EU legislator should further lower the barriers for re-use of public sector information by exempting works produced by the public sector - within the political, legal and administrative process - from copyright protection;
      • the ability to freely link from one resource to another is one of the fundamental building blocks of the Internet;
      • Emphasises that the exception for caricature, parody and pastiche should apply regardless of the purpose of the parodic use;
      • Stresses the need to enable automated analytical techniques for text and data (e.g. 'text and data mining') for all purposes, provided that the permission to read the work has been acquired;
  19. 19. Some Contacts/references
      • JISC report on benefits of TDM - D. McDonald and U. Kelly, The value and benefits of text mining (2012), http://www.jisc.ac.uk/reports/value-and-benefits-of-text-mining
      • Official guidance on the new UK copyright exception for TDM - https://www.gov.uk/government/uploads/system/uploads/attachment_data/file/315014/copyright-guidance-research.pdf
      • Excellent general overview of the change to UK law and its implications - http://copyrightuser.org/topics/text-and-data-mining/ - provides link to the precise wording in the law
      • http://contentmine.org
  20. 20. THE NEW UK LAW
      • Came into force in June 2014
      • Specific exception to copyright for TDM
      • From the official guidance to the new exception issued by HMG: “Text and data mining usually requires copying of the work to be analysed. An exception to copyright exists which allows researchers to make copies of any copyright material for the purpose of computational analysis if they already have the right to read the work (that is, they have ‘lawful access’ to the work). This exception only permits the making of copies for the purpose of text and data mining for non-commercial research. Publishers and content providers will be able to apply reasonable measures to maintain their network security or stability but these measures should not prevent or unreasonably restrict researcher’s ability to text and data mine. Contract terms that stop researchers making copies to carry out text and data mining will be unenforceable.”
      • Does not apply to database right. Interesting problem if a particular database enjoys both copyright and database right!
      • UK researchers do not have to ask for permission, pay fees, etc., to do such TDM
      • What is, or is not, “non-commercial”? Not always clear! The question must be asked at the time the TDM was undertaken, so unexpected commercial benefits at the end of the project are OK, so long as at the time the intent was non-commercial
      • “Lawful access” usually means licensed content, whether OA or a subscription to the materials, but also includes lawful access to printed works held in, say, a library
      Prof Charles Oppenheim and contentmine.org
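
The back-of-envelope figures on slide 3 can be cross-checked with a few lines of arithmetic. The sketch below is illustrative only: the constants are the numbers quoted on the slide, while the variable names, the `within_tolerance` helper, and the choice to compare each product against the quoted totals are assumptions made here, not part of the original talk. The 50% tolerance mirrors the slide's own "figures probably ±50%" caveat.

```python
# Rough consistency check of the slide 3 figures (values as quoted on the slide).
ARTICLES_PER_YEAR = 1_500_000
RESEARCH_COST_PER_ARTICLE = 300_000        # USD, cost to create one article
PUBLISH_COST_PER_ARTICLE = 7_000           # USD, cost to "publish" one article
ARXIV_COST_PER_PAPER = 7                   # USD, arXiv figure from the footnote

QUOTED_RESEARCH_TOTAL = 400_000_000_000    # USD, "World Citizens pay"
QUOTED_LIBRARY_SPEND = 10_000_000_000      # USD, "from academic libraries"


def within_tolerance(estimate, quoted, tolerance=0.5):
    """Hypothetical helper: is `estimate` within +/- tolerance of `quoted`?"""
    return abs(estimate - quoted) <= tolerance * quoted


# Per-article costs scaled up to the yearly article count.
total_research = ARTICLES_PER_YEAR * RESEARCH_COST_PER_ARTICLE
total_publishing = ARTICLES_PER_YEAR * PUBLISH_COST_PER_ARTICLE

# How many times more expensive is conventional publishing than arXiv?
arxiv_ratio = PUBLISH_COST_PER_ARTICLE / ARXIV_COST_PER_PAPER

print(f"Research total:   ${total_research:,}")      # $450,000,000,000
print(f"Publishing total: ${total_publishing:,}")    # $10,500,000,000
print(f"Publishing vs arXiv cost ratio: {arxiv_ratio:.0f}x")  # 1000x
print(within_tolerance(total_research, QUOTED_RESEARCH_TOTAL))    # True
print(within_tolerance(total_publishing, QUOTED_LIBRARY_SPEND))   # True
```

Under these assumptions the per-article figures multiply out to roughly the quoted totals, well inside the stated ±50% margin.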
