Brian Hole - Text and Data Mining - European Parliament presentation

Presentation given to the Copyright & Research and Innovation Policy meeting at the European Parliament, 12 November 2013

  1. Text and data mining
     Brian Hole (@ubiquitypress)
     Copyright & Research and Innovation Policy meeting, European Parliament, Brussels, 12 November 2013
  2. The Social Contract of Science
       • Validation
       • Dissemination
       • Further development
     Scientific Malpractice
  3. Text and data mining
     The ideal situation:
       • Open access and open data (with CC BY and CC0 licenses) would mean that all research text and data were available for mining, reuse and analysis
       • But legacy publishers are resisting open practices
     For other cases:
       • A fair dealing exception is required that allows for academic (and arguably other, e.g. commercial) mining of both text and data
     Exceptions also required for:
       • Research in general
       • Teaching
  4. Licensing
     Additional licensing is not a suitable solution:
       • TDM involves multiple, highly heterogeneous sources, not only journals and books but anything on the Internet. Licensing cannot practically scale to cover this.
       • TDM is simply the reading of content, a right researchers already have. Copyright was never intended to cover such use: this is temporary copying for reading, not creative use. Copyright should therefore be amended, rather than additional licenses being imposed that perpetuate the problem.
       • TDM licenses would prevent progress, prevent efficient use of taxpayer money, prevent growth of SMEs, and block work that prevents deaths.
  5. False objections
     ‘Server overload’:
       • TDM is not a highly frequent activity, and involves touching each resource only once.
       • This is much lower than the load from normal user behaviour and from crawling by other services.
       • Any higher level of load could be easily and cost-effectively managed; the benefits of additional use and citation far outweigh this.
     The need to control access to ‘our content’:
       • Scientific facts and information are not your content.
       • This results in building a reputation of being against open science and scientific progress.