What is Text and Data Mining (TDM)?

OpenAIRE
Oct. 10, 2018

  1. Infographic
     Access the connector: http://publisher-connector.core.ac.uk/resourcesync
     • Diagram labels: Discovery services; Proprietary APIs; Connector layer; Frontiers; Crossref; CORE Publisher Connector; PubMed OA subset; arXiv.
     • Number of resources per dataset, as listed on the infographic: 492,462; 59,512; 172,812; 1,831,877.
     • Open Access articles are seamlessly accessible by everyone.
     • 7% of the total content available from the above publishers is Open Access.
     • Every record contains metadata and full text (PDF, title, authors, publisher, DOI, ...).
     • All resources are accessible via ResourceSync, and more publishers are on the way.
     • Every resource is automatically synchronised across all clients: the master copy is exposed through ResourceSync sitemaps, and synchronised copies receive automated synchronisation and immediate propagation of deletions.
     • The largest datasets for text mining (Gold Open Access): arXiv: 1,261,533; PubMed Central (OA subset): 1,582,188; CORE Publisher Connector: 1,660,625.
     • For the largest collection of Green & Gold Open Access content, see https://core.ac.uk/services#dataset
     • Figure shown without a label: 1,107,091.
  2. Presentation of the expertise directory. Knoth, P., Anastasiou, L., Pearce, S. and Pontika, N. (2018) Towards a Global Comprehensive Dataset of Open Access Papers for Text Analytics, Open Repositories 2018, Bozeman, Montana. FORCE2017 Conference workshop on “Improve interoperability across publisher platforms to support text and data mining” (33 publishers attended).
  3. Dataset statistics (source type; details; number of open access articles)
     • Repositories and full OA publishers (OpenAIRE and CORE); 3,667 data sources globally, harvested using OAI-PMH; 9,033,808
     • CORE Publisher Connector; Elsevier; 1,191,785
     • CORE Publisher Connector; Springer; 540,889
     • CORE Publisher Connector; Frontiers; 65,927
     • CORE Publisher Connector; PLoS; 179,571
     • Total publisher connector: 1,978,172
     • Total dataset: 11,011,980
     Source: Knoth, P., Anastasiou, L., Pearce, S. and Pontika, N. (2018) Towards a Global Comprehensive Dataset of Open Access Papers for Text Analytics, Open Repositories 2018, Bozeman, Montana.
  4. Promotion of the expertise directory. Knoth, P., Pontika, N. and Anastasiou, L. Releasing 1.8 million open access publications from publisher systems for text and data mining, LSE Blog. http://blogs.lse.ac.uk/impactofsocialsciences/2018/03/22/releasing-1-8-million-open-access-publications-from-publisher-systems-for-text-and-data-mining/
  5. TDM & Research Support Staff • Establish and maintain close collaboration with researchers • Extensive experience in advocacy, e.g. for open access • Knowledgeable about the repository’s collection • Participate in the academic institution’s research committees • Familiar with copyright issues and Creative Commons licenses
  6. Where to find TDM-related material - I
     Three TDM taxonomies developed by the project: • Text and Data Mining • TDM Methods • TDM Workflows
     OMTD (OpenMinTeD) tutorials and courses. URL: https://www.fosteropenscience.eu/openminted
  7. Where to find TDM-related material - II: educational training videos introducing TDM concepts; other TDM training materials
  8. TDM taxonomy. URL: https://www.fosteropenscience.eu/openminted
  9. Introduction to TDM course - I. Created by OU and LIBER in collaboration with Cambridge University. • First technical TDM course aimed at research support staff • Presents OMTD and explains how to use it • Hands-on examples of basic TDM processes
  10. Introduction to TDM course - II: suggested readings; introductory videos
  11. Introduction to TDM course - III: quizzes; claim course
  12. Introduction to TDM course - IV
  13. Thank you!
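
The totals on the "Dataset statistics" slide (slide 3) can be cross-checked with a few lines of Python. This is only a minimal sketch: the figures are copied from the slide, while the variable names and the per-source share calculation are illustrative additions and not part of the original presentation.

```python
# Figures copied from the "Dataset statistics" slide (slide 3).
# Variable names are illustrative, not taken from the presentation.
repositories = 9_033_808  # OpenAIRE and CORE, harvested via OAI-PMH

publisher_connector = {
    "Elsevier": 1_191_785,
    "Springer": 540_889,
    "Frontiers": 65_927,
    "PLoS": 179_571,
}

# Sum the publisher connector rows, then add the repository row.
connector_total = sum(publisher_connector.values())
dataset_total = repositories + connector_total

# Share of the whole dataset contributed by each source (illustrative extra).
shares = {name: count / dataset_total for name, count in publisher_connector.items()}
shares["Repositories"] = repositories / dataset_total

print(f"Publisher connector total: {connector_total:,}")
print(f"Dataset total: {dataset_total:,}")
for name, share in sorted(shares.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name:<12} {share:6.1%}")
```

Both computed totals agree with the slide: 1,978,172 for the publisher connector and 11,011,980 for the whole dataset.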