What is Text and Data Mining (TDM)?


What is Text and Data Mining (TDM)? How does it fit with Open Access and Open Science? Why should librarians, information professionals and research support administrators care about it? This webinar describes TDM, explains why machine accessibility matters for Open Access content, and describes how open content can be used for TDM purposes. It also provides examples of TDM readings and courses designed for those who work in libraries or research offices providing research support.

Published in: Science

What is Text and Data Mining (TDM)?

  1. Infographic
     • Diagram labels: Access the connector; Discovery services; Proprietary APIs; Connector layer; Frontiers; Crossref; CORE Publisher Connector; PubMed OA subset; arXiv.
     • Chart axes: Dataset; Number of resources. Unlabelled values: 492,462; 59,512; 172,812; 1,831,877; 1,107,091.
     • Open Access articles seamlessly accessible by everyone.
     • 7% of the total content available from the above publishers is Open Access.
     • Every record contains metadata and full text.
     • All resources are accessible via ResourceSync, and more publishers are on the way...
     • Every resource is automatically synchronised across all clients.
     • The largest datasets for text mining (Gold Open Access): arXiv: 1,261,533; PubMed Central (OA subset): 1,582,188; CORE Publisher Connector: 1,660,625.
     • For the largest collection of Green & Gold Open Access content, look at
     • Record fields: pdf, Title, Authors, Publisher, DOI, ...
     • Master copy, exposed through ResourceSync sitemaps; synchronised copy, with automated synchronisation and immediate propagation of deletion.
  2. Presentation of the expertise directory
     • Knoth, P., Anastasiou, L., Pearce, S. and Pontika, N. (2018) Towards a Global Comprehensive Dataset of Open Access Papers for Text Analytics, Open Repositories 2018, Bozeman, Montana.
     • FORCE2017 Conference: workshop on "Improve interoperability across publisher platforms to support text and data mining"; 33 publishers attended.
  3. Dataset statistics

     | Source type | Details | Number of open access articles |
     |---|---|---|
     | Repositories and full OA publishers (OpenAIRE and CORE) | 3,667 data sources globally, harvested using OAI-PMH | 9,033,808 |
     | CORE Publisher Connector | Elsevier | 1,191,785 |
     | | Springer | 540,889 |
     | | Frontiers | 65,927 |
     | | PLoS | 179,571 |
     | Total publisher connector | | 1,978,172 |
     | Total dataset | | 11,011,980 |

     Source: Knoth, P., Anastasiou, L., Pearce, S. and Pontika, N. (2018) Towards a Global Comprehensive Dataset of Open Access Papers for Text Analytics, Open Repositories 2018, Bozeman, Montana.
  4. Promotion of the expertise directory
     • Knoth, P., Pontika, N., Anastasiou, L. Releasing 1.8 million open access publications from publisher systems for text and data mining, LSE Blog, pactofsocialsciences/2018/03/22/releasing-1-8-million-open-access-publications-from-publisher-systems-for-text-and-data-mining/
  5. TDM & Research Support Staff
     • Establish and maintain a close collaboration with researchers
     • Extensive experience in advocacy, e.g. open access
     • Knowledgeable about the repository's collection
     • Participate in the academic institution's research committees
     • Familiarity with copyright issues and Creative Commons licenses
  6. Where to find TDM related material - I
     • 3 TDM taxonomies developed by the project: Text and Data Mining; TDM Methods; TDM workflows
     • OMTD tutorials and courses
  7. Where to find TDM related material - II
     • Educational training videos introducing TDM concepts
     • Other TDM training materials
  8. TDM taxonomy
  9. Introduction to TDM course - I
     Created by OU and LIBER in collaboration with Cambridge University.
     • First technical TDM course addressed to research support staff.
     • Presents OMTD and explains how to use it.
     • Hands-on examples of basic TDM processes.
  10. Introduction to TDM course - II
      • Suggested readings
      • Introductory videos
  11. Introduction to TDM course - III
      • Quizzes
      • Claim course
  12. Introduction to TDM course - IV
  13. Thank you!
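
The totals on the "Dataset statistics" slide can be cross-checked with a few lines of Python. This is only an illustrative sketch: the counts are copied from the slide, while the names `REPOSITORIES`, `PUBLISHER_CONNECTOR` and `dataset_totals` are hypothetical and not part of any CORE or OMTD tooling.

```python
# Article counts copied from the "Dataset statistics" slide (Knoth et al., 2018).
# All names below are illustrative assumptions, not an existing API.
REPOSITORIES = 9_033_808  # repositories and full OA publishers (OpenAIRE and CORE)
PUBLISHER_CONNECTOR = {
    "Elsevier": 1_191_785,
    "Springer": 540_889,
    "Frontiers": 65_927,
    "PLoS": 179_571,
}


def dataset_totals(repositories, connector):
    """Return (publisher connector subtotal, overall dataset total)."""
    subtotal = sum(connector.values())
    return subtotal, repositories + subtotal


connector_total, dataset_total = dataset_totals(REPOSITORIES, PUBLISHER_CONNECTOR)
print(f"Total publisher connector: {connector_total:,}")
print(f"Total dataset: {dataset_total:,}")
# Share of the dataset that comes from the publisher connector.
print(f"Publisher connector share: {connector_total / dataset_total:.1%}")
```

Both computed totals agree with the figures reported on the slide (1,978,172 and 11,011,980).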