Text mining
What is text and data mining?
Text Mining is an interdisciplinary field combining techniques
from linguistics, computer science and statistics to build tools
that can efficiently retrieve and extract information from
digital text.
http://blogs.plos.org/everyone/2013/04/17/announcing-the-plos-text-mining-collection/
It uses powerful computers to find links between drugs
and side effects, or genes and diseases, that are hidden
within the vast scientific literature. These are discoveries
that a person scouring through papers one by one may
never notice.
http://www.theguardian.com/science/2012/may/23/text-mining-research-tool-forbidden
What is the issue?
• Researchers find it impractical to negotiate multiple bilateral
agreements with hundreds of subscription-based publishers in
order to authorize TDM of subscribed content.
• Subscription-based publishers find it impractical to negotiate
multiple bilateral agreements with thousands of researchers and
institutions in order to authorize TDM of subscribed content.
• All parties would benefit from support of standard APIs and data
representations in order to enable TDM across both open
access and subscription-based publishers.
How to solve it?
• Crossref REST API: designed to allow researchers to
easily harvest full text documents from all
participating publishers regardless of their business
model (e.g. open access, subscription).
Common API Summary
• Content Negotiation (Required)
• New Metadata (Required)
• Full text URIs
• License URIs
• Rate Limiting Headers (Optional)
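A minimal sketch of two items from this summary, content negotiation and rate limiting headers. It only builds the request pieces and parses a header dictionary; nothing is sent over the network. The CSL JSON media type and the `CR-TDM-Rate-Limit*` header names are assumptions not stated in the slides, and the function names are hypothetical.

```python
from urllib.parse import quote

# Assumed media type: CSL JSON, one format offered through DOI content negotiation.
CSL_JSON = "application/vnd.citationstyles.csl+json"


def negotiation_request(doi, accept=CSL_JSON):
    """Return (url, headers) for a content-negotiated DOI lookup. Nothing is sent."""
    url = "https://doi.org/" + quote(doi)
    return url, {"Accept": accept}


def parse_rate_limit(headers):
    """Read the optional rate-limit headers; the header names are assumptions."""
    def as_int(name):
        value = headers.get(name)
        return int(value) if value is not None and value.isdigit() else None

    return {
        "limit": as_int("CR-TDM-Rate-Limit"),
        "remaining": as_int("CR-TDM-Rate-Limit-Remaining"),
        "reset": as_int("CR-TDM-Rate-Limit-Reset"),
    }
```

Because the rate limiting headers are optional, the parser returns `None` for any header a publisher does not send rather than failing.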
How does it work?
1. A researcher identifies the articles they are interested in
The search engines they use return results from many different publishers, in different locations and
using different business models. They can also use CrossRef to search.
The challenge is to harvest all these articles for mining without engaging in individual transactions
with each publisher.
How to do that?
Each of those articles has a DOI, or digital
object identifier. Each DOI is unique and
identifies the paper. Researchers are familiar
with DOIs and are used to working with them.
Search engines will allow them to download
DOIs as a list, so the researcher does not
need to go to each paper to extract the DOI
from it.
2. The researcher takes the DOIs that correspond to the articles they are interested in
10.5555/12345678
10.5556/12345679
10.1016/12345680
10.8080/12345681
10.1155/12345682
10.1100/12345683
10.5555/12345684
10.1007/12345685
10.1111/12345686
10.2406/12345687
10.3994/12345688
10.5006/12345689
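As a sketch of handling such a list, the snippet below cleans a downloaded DOI list and groups the entries by prefix (the part before the slash, which identifies the registrant). The function name and the simple validity check are assumptions for illustration only.

```python
from collections import defaultdict


def group_by_prefix(lines):
    """Drop blanks and duplicates, then group DOIs by their prefix."""
    groups = defaultdict(list)
    seen = set()
    for raw in lines:
        doi = raw.strip().lower()
        # Very loose check: a DOI starts with "10." and has a prefix/suffix split.
        if not doi.startswith("10.") or "/" not in doi or doi in seen:
            continue
        seen.add(doi)
        prefix = doi.partition("/")[0]
        groups[prefix].append(doi)
    return dict(groups)


doi_list = [
    "10.5555/12345678",
    "10.5556/12345679",
    "",
    "10.1016/12345680",
    "10.5555/12345684",
    "10.5555/12345678",  # duplicate, ignored
]
groups = group_by_prefix(doi_list)
```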
3. The researcher gives this list to the CrossRef Text and Data Mining API
The API then tells them:
• Where the full text is located
• What they are allowed to do with it
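A minimal sketch of this step, assuming the metadata for each DOI is fetched from the Crossref REST API at `https://api.crossref.org/works/{doi}` and that the returned record carries `link` and `license` arrays. The sample record and its URLs are invented for illustration, and the helper names are hypothetical. No request is made here; the code only builds the URL and reads a record.

```python
from urllib.parse import quote


def works_url(doi):
    """Build the metadata URL for one DOI (assumed endpoint)."""
    return "https://api.crossref.org/works/" + quote(doi)


def tdm_info(message):
    """Pull full-text locations and licence URIs out of one metadata record."""
    full_text = [
        link["URL"]
        for link in message.get("link", [])
        if link.get("intended-application") == "text-mining"
    ]
    licenses = [lic["URL"] for lic in message.get("license", [])]
    return {"full_text": full_text, "licenses": licenses}


# Hypothetical record, shaped like the "message" part of an API response.
sample = {
    "DOI": "10.5555/12345678",
    "link": [
        {"URL": "https://publisher.example/fulltext/12345678.xml",
         "content-type": "application/xml",
         "intended-application": "text-mining"},
        {"URL": "https://publisher.example/fulltext/12345678.pdf",
         "content-type": "application/pdf",
         "intended-application": "similarity-checking"},
    ],
    "license": [{"URL": "https://publisher.example/tdm-licence"}],
}
info = tdm_info(sample)
```

A record with no `link` or `license` entries yields empty lists, which signals that the publisher has not deposited that information for the DOI.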
What are they allowed to do with the content?
This is communicated by licence information that publishers give to CrossRef.
Some publishers ask researchers to agree to an additional licence to be able to use their content for
mining.
Researchers can log in to CrossRef TDM with their ORCID iD, where they can view and accept
publisher licences all in one place. There is no need to repeat the process for each publisher.
The publishers do not charge researchers for this, and CrossRef does not charge
researchers for the service.
4. The researcher uses that information to go directly to each publisher via CrossRef. It is a central
channel for them to visit thousands of publishers via one request or transaction.
There they will be identified in one of several ways:
• No identification (Open Access content)
• IP recognition/log in credentials
• IP recognition/log in credentials + CrossRef token (API key) from the TDM service
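The three identification modes can be sketched as the headers a mining client would send with a full-text request. The header name `CR-Clickthrough-Client-Token` is an assumption about how the CrossRef token is passed, and the function name is hypothetical. IP recognition needs no header because the publisher checks the network address of the request.

```python
def fulltext_headers(token=None, accept="application/xml"):
    """Build request headers for fetching full text from a publisher.

    token=None covers open access content and IP-recognised access;
    a token is added only when the publisher asks for one.
    """
    headers = {"Accept": accept}
    if token:
        # Assumed header name for the CrossRef TDM token (API key).
        headers["CR-Clickthrough-Client-Token"] = token
    return headers


open_access = fulltext_headers()
with_token = fulltext_headers(token="hypothetical-token")
```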
Benefits
• Streamlines researcher access to distributed full text for
TDM
• Enables machine-to-machine, automated access for
recognized TDM (i.e. researchers won’t be locked out of publisher sites)
• Enables article-level licensing info and easy mechanism
for supplemental T&Cs for text and data mining
(publishers discussing model license via STM)
Thank you!

More Related Content

What's hot

Registering content to enable connections - Rachael Lammey
Registering content to enable connections - Rachael LammeyRegistering content to enable connections - Rachael Lammey
Registering content to enable connections - Rachael Lammey
Crossref
 
Similarity check webinar
Similarity check webinar Similarity check webinar
Similarity check webinar
Crossref
 
ORCID: An Overview - Alice Meadows
ORCID: An Overview - Alice MeadowsORCID: An Overview - Alice Meadows
ORCID: An Overview - Alice Meadows
Crossref
 
CrossCheck iThenticate Admin Webinar
CrossCheck iThenticate Admin WebinarCrossCheck iThenticate Admin Webinar
CrossCheck iThenticate Admin Webinar
Crossref
 
Introducing Crossref Similarity Check
Introducing Crossref Similarity CheckIntroducing Crossref Similarity Check
Introducing Crossref Similarity Check
Crossref
 
Introduction to Crossref
Introduction to CrossrefIntroduction to Crossref
Introduction to Crossref
Crossref
 
Crossref LIVE UK Online
Crossref LIVE UK OnlineCrossref LIVE UK Online
Crossref LIVE UK Online
Crossref
 
New product developments - Jennifer Lin - London LIVE 2017
New product developments - Jennifer Lin - London LIVE 2017New product developments - Jennifer Lin - London LIVE 2017
New product developments - Jennifer Lin - London LIVE 2017
Crossref
 
Checking for Originality: Crossref Similarity Check
Checking for Originality: Crossref Similarity CheckChecking for Originality: Crossref Similarity Check
Checking for Originality: Crossref Similarity Check
Crossref
 
Who is using your metadata - Ginny Hendricks
Who is using your metadata - Ginny HendricksWho is using your metadata - Ginny Hendricks
Who is using your metadata - Ginny Hendricks
Crossref
 
Introduction to DataCite - Martin Fenner
Introduction to DataCite - Martin FennerIntroduction to DataCite - Martin Fenner
Introduction to DataCite - Martin Fenner
Crossref
 
introduction to crossmark lastest
introduction to crossmark lastestintroduction to crossmark lastest
introduction to crossmark lastest
Crossref
 
CrossRef Branding Update
CrossRef Branding UpdateCrossRef Branding Update
CrossRef Branding Update
Crossref
 
Geoffrey Bilder: Strategic Initiatives Update #crossref15
Geoffrey Bilder: Strategic Initiatives Update #crossref15Geoffrey Bilder: Strategic Initiatives Update #crossref15
Geoffrey Bilder: Strategic Initiatives Update #crossref15
Crossref
 
2013 CrossRef Annual Meeting Flash Update CrossCheck and CrossMark Rachael La...
2013 CrossRef Annual Meeting Flash Update CrossCheck and CrossMark Rachael La...2013 CrossRef Annual Meeting Flash Update CrossCheck and CrossMark Rachael La...
2013 CrossRef Annual Meeting Flash Update CrossCheck and CrossMark Rachael La...
Crossref
 
Crossref webinar: Stephanie Dawson - SciencOpen Metadata 091118
Crossref webinar: Stephanie Dawson - SciencOpen Metadata 091118Crossref webinar: Stephanie Dawson - SciencOpen Metadata 091118
Crossref webinar: Stephanie Dawson - SciencOpen Metadata 091118
Crossref
 
Content Registration at Crossref - LIVE Kuala Lumpur
Content Registration at Crossref - LIVE Kuala LumpurContent Registration at Crossref - LIVE Kuala Lumpur
Content Registration at Crossref - LIVE Kuala Lumpur
Crossref
 
Crossmark Update Webinar
Crossmark Update WebinarCrossmark Update Webinar
Crossmark Update Webinar
Crossref
 
Crossref similarity check update webinar Sept 2016
Crossref similarity check update webinar Sept 2016Crossref similarity check update webinar Sept 2016
Crossref similarity check update webinar Sept 2016
Crossref
 
Introduction to Crossref - Crossref LIVE Kuala Lumpur
Introduction to Crossref - Crossref LIVE Kuala LumpurIntroduction to Crossref - Crossref LIVE Kuala Lumpur
Introduction to Crossref - Crossref LIVE Kuala Lumpur
Crossref
 

What's hot (20)

Registering content to enable connections - Rachael Lammey
Registering content to enable connections - Rachael LammeyRegistering content to enable connections - Rachael Lammey
Registering content to enable connections - Rachael Lammey
 
Similarity check webinar
Similarity check webinar Similarity check webinar
Similarity check webinar
 
ORCID: An Overview - Alice Meadows
ORCID: An Overview - Alice MeadowsORCID: An Overview - Alice Meadows
ORCID: An Overview - Alice Meadows
 
CrossCheck iThenticate Admin Webinar
CrossCheck iThenticate Admin WebinarCrossCheck iThenticate Admin Webinar
CrossCheck iThenticate Admin Webinar
 
Introducing Crossref Similarity Check
Introducing Crossref Similarity CheckIntroducing Crossref Similarity Check
Introducing Crossref Similarity Check
 
Introduction to Crossref
Introduction to CrossrefIntroduction to Crossref
Introduction to Crossref
 
Crossref LIVE UK Online
Crossref LIVE UK OnlineCrossref LIVE UK Online
Crossref LIVE UK Online
 
New product developments - Jennifer Lin - London LIVE 2017
New product developments - Jennifer Lin - London LIVE 2017New product developments - Jennifer Lin - London LIVE 2017
New product developments - Jennifer Lin - London LIVE 2017
 
Checking for Originality: Crossref Similarity Check
Checking for Originality: Crossref Similarity CheckChecking for Originality: Crossref Similarity Check
Checking for Originality: Crossref Similarity Check
 
Who is using your metadata - Ginny Hendricks
Who is using your metadata - Ginny HendricksWho is using your metadata - Ginny Hendricks
Who is using your metadata - Ginny Hendricks
 
Introduction to DataCite - Martin Fenner
Introduction to DataCite - Martin FennerIntroduction to DataCite - Martin Fenner
Introduction to DataCite - Martin Fenner
 
introduction to crossmark lastest
introduction to crossmark lastestintroduction to crossmark lastest
introduction to crossmark lastest
 
CrossRef Branding Update
CrossRef Branding UpdateCrossRef Branding Update
CrossRef Branding Update
 
Geoffrey Bilder: Strategic Initiatives Update #crossref15
Geoffrey Bilder: Strategic Initiatives Update #crossref15Geoffrey Bilder: Strategic Initiatives Update #crossref15
Geoffrey Bilder: Strategic Initiatives Update #crossref15
 
2013 CrossRef Annual Meeting Flash Update CrossCheck and CrossMark Rachael La...
2013 CrossRef Annual Meeting Flash Update CrossCheck and CrossMark Rachael La...2013 CrossRef Annual Meeting Flash Update CrossCheck and CrossMark Rachael La...
2013 CrossRef Annual Meeting Flash Update CrossCheck and CrossMark Rachael La...
 
Crossref webinar: Stephanie Dawson - SciencOpen Metadata 091118
Crossref webinar: Stephanie Dawson - SciencOpen Metadata 091118Crossref webinar: Stephanie Dawson - SciencOpen Metadata 091118
Crossref webinar: Stephanie Dawson - SciencOpen Metadata 091118
 
Content Registration at Crossref - LIVE Kuala Lumpur
Content Registration at Crossref - LIVE Kuala LumpurContent Registration at Crossref - LIVE Kuala Lumpur
Content Registration at Crossref - LIVE Kuala Lumpur
 
Crossmark Update Webinar
Crossmark Update WebinarCrossmark Update Webinar
Crossmark Update Webinar
 
Crossref similarity check update webinar Sept 2016
Crossref similarity check update webinar Sept 2016Crossref similarity check update webinar Sept 2016
Crossref similarity check update webinar Sept 2016
 
Introduction to Crossref - Crossref LIVE Kuala Lumpur
Introduction to Crossref - Crossref LIVE Kuala LumpurIntroduction to Crossref - Crossref LIVE Kuala Lumpur
Introduction to Crossref - Crossref LIVE Kuala Lumpur
 

Viewers also liked

Using Funding Data
Using Funding DataUsing Funding Data
Using Funding Data
Crossref
 
Good Practice Publishing
Good Practice PublishingGood Practice Publishing
Good Practice Publishing
Crossref
 
Getting started with Content Registration 012617
Getting started with Content Registration 012617Getting started with Content Registration 012617
Getting started with Content Registration 012617
Crossref
 
Cited-by Linking
Cited-by Linking Cited-by Linking
Cited-by Linking
Crossref
 
Multiple Resolution and handling content available in multiple places
Multiple Resolution and handling content available in multiple placesMultiple Resolution and handling content available in multiple places
Multiple Resolution and handling content available in multiple places
Crossref
 
Getting started with Reference Linking
Getting started with Reference LinkingGetting started with Reference Linking
Getting started with Reference Linking
Crossref
 
Preprints & Scholarly Infrastructure
Preprints & Scholarly InfrastructurePreprints & Scholarly Infrastructure
Preprints & Scholarly Infrastructure
Crossref
 
CrossMark How To
CrossMark How ToCrossMark How To
CrossMark How To
Crossref
 
Welcome and What's Happening at Crossref
Welcome and What's Happening at CrossrefWelcome and What's Happening at Crossref
Welcome and What's Happening at Crossref
Crossref
 
Crossref Support
Crossref SupportCrossref Support
Crossref Support
Crossref
 
How Libraries Use Publisher Metadata - Crossref Community Webinar
How Libraries Use Publisher Metadata - Crossref Community WebinarHow Libraries Use Publisher Metadata - Crossref Community Webinar
How Libraries Use Publisher Metadata - Crossref Community Webinar
Crossref
 
Crossref's work with Wikimedia and Event Data
Crossref's work with Wikimedia and Event DataCrossref's work with Wikimedia and Event Data
Crossref's work with Wikimedia and Event Data
Crossref
 
Der Nobelpreis geht an: Vitamin C
Der Nobelpreis geht an: Vitamin CDer Nobelpreis geht an: Vitamin C
Der Nobelpreis geht an: Vitamin CDr Rath
 
Text Mining for Second Screen
Text Mining for Second ScreenText Mining for Second Screen
Text Mining for Second ScreenIvan Demin
 
Semantische Systeme 3 0
Semantische Systeme 3 0Semantische Systeme 3 0
Semantische Systeme 3 0
Andreas Blumauer
 
Processing Big Data in Real-Time - Yanai Franchi, Tikal
Processing Big Data in Real-Time - Yanai Franchi, TikalProcessing Big Data in Real-Time - Yanai Franchi, Tikal
Processing Big Data in Real-Time - Yanai Franchi, Tikal
Codemotion Tel Aviv
 
EBM DataLab Presentation from OpenCon Oxford
EBM DataLab Presentation from OpenCon OxfordEBM DataLab Presentation from OpenCon Oxford
EBM DataLab Presentation from OpenCon Oxford
Crossref
 
Presentation from ALA Midwinter 2014 on Elsevier's new Text and Data Mining P...
Presentation from ALA Midwinter 2014 on Elsevier's new Text and Data Mining P...Presentation from ALA Midwinter 2014 on Elsevier's new Text and Data Mining P...
Presentation from ALA Midwinter 2014 on Elsevier's new Text and Data Mining P...
Chris Shillum
 
Crossref Community Webinar - Asia Pacific 12-14-2016
Crossref Community Webinar - Asia Pacific 12-14-2016Crossref Community Webinar - Asia Pacific 12-14-2016
Crossref Community Webinar - Asia Pacific 12-14-2016
Crossref
 
Visual data mining with HeatMiner
Visual data mining with HeatMinerVisual data mining with HeatMiner
Visual data mining with HeatMiner
CloudNSci
 

Viewers also liked (20)

Using Funding Data
Using Funding DataUsing Funding Data
Using Funding Data
 
Good Practice Publishing
Good Practice PublishingGood Practice Publishing
Good Practice Publishing
 
Getting started with Content Registration 012617
Getting started with Content Registration 012617Getting started with Content Registration 012617
Getting started with Content Registration 012617
 
Cited-by Linking
Cited-by Linking Cited-by Linking
Cited-by Linking
 
Multiple Resolution and handling content available in multiple places
Multiple Resolution and handling content available in multiple placesMultiple Resolution and handling content available in multiple places
Multiple Resolution and handling content available in multiple places
 
Getting started with Reference Linking
Getting started with Reference LinkingGetting started with Reference Linking
Getting started with Reference Linking
 
Preprints & Scholarly Infrastructure
Preprints & Scholarly InfrastructurePreprints & Scholarly Infrastructure
Preprints & Scholarly Infrastructure
 
CrossMark How To
CrossMark How ToCrossMark How To
CrossMark How To
 
Welcome and What's Happening at Crossref
Welcome and What's Happening at CrossrefWelcome and What's Happening at Crossref
Welcome and What's Happening at Crossref
 
Crossref Support
Crossref SupportCrossref Support
Crossref Support
 
How Libraries Use Publisher Metadata - Crossref Community Webinar
How Libraries Use Publisher Metadata - Crossref Community WebinarHow Libraries Use Publisher Metadata - Crossref Community Webinar
How Libraries Use Publisher Metadata - Crossref Community Webinar
 
Crossref's work with Wikimedia and Event Data
Crossref's work with Wikimedia and Event DataCrossref's work with Wikimedia and Event Data
Crossref's work with Wikimedia and Event Data
 
Der Nobelpreis geht an: Vitamin C
Der Nobelpreis geht an: Vitamin CDer Nobelpreis geht an: Vitamin C
Der Nobelpreis geht an: Vitamin C
 
Text Mining for Second Screen
Text Mining for Second ScreenText Mining for Second Screen
Text Mining for Second Screen
 
Semantische Systeme 3 0
Semantische Systeme 3 0Semantische Systeme 3 0
Semantische Systeme 3 0
 
Processing Big Data in Real-Time - Yanai Franchi, Tikal
Processing Big Data in Real-Time - Yanai Franchi, TikalProcessing Big Data in Real-Time - Yanai Franchi, Tikal
Processing Big Data in Real-Time - Yanai Franchi, Tikal
 
EBM DataLab Presentation from OpenCon Oxford
EBM DataLab Presentation from OpenCon OxfordEBM DataLab Presentation from OpenCon Oxford
EBM DataLab Presentation from OpenCon Oxford
 
Presentation from ALA Midwinter 2014 on Elsevier's new Text and Data Mining P...
Presentation from ALA Midwinter 2014 on Elsevier's new Text and Data Mining P...Presentation from ALA Midwinter 2014 on Elsevier's new Text and Data Mining P...
Presentation from ALA Midwinter 2014 on Elsevier's new Text and Data Mining P...
 
Crossref Community Webinar - Asia Pacific 12-14-2016
Crossref Community Webinar - Asia Pacific 12-14-2016Crossref Community Webinar - Asia Pacific 12-14-2016
Crossref Community Webinar - Asia Pacific 12-14-2016
 
Visual data mining with HeatMiner
Visual data mining with HeatMinerVisual data mining with HeatMiner
Visual data mining with HeatMiner
 

Similar to Text and Data Mining

Introduction to CrossRef Text and Data Mining Webinar
Introduction to CrossRef Text and Data Mining WebinarIntroduction to CrossRef Text and Data Mining Webinar
Introduction to CrossRef Text and Data Mining Webinar
Crossref
 
UKSG Conference 2015 - CrossRef Text and Data Mining Services: one year in Ra...
UKSG Conference 2015 - CrossRef Text and Data Mining Services: one year in Ra...UKSG Conference 2015 - CrossRef Text and Data Mining Services: one year in Ra...
UKSG Conference 2015 - CrossRef Text and Data Mining Services: one year in Ra...
UKSG: connecting the knowledge community
 
CrossRef Text and Data Mining
CrossRef Text and Data MiningCrossRef Text and Data Mining
CrossRef Text and Data Mining
Crossref
 
CrossRef Text & Data Mining - UKSG 2015
CrossRef Text & Data Mining - UKSG 2015CrossRef Text & Data Mining - UKSG 2015
CrossRef Text & Data Mining - UKSG 2015
Crossref
 
Who is using your content?
Who is using your content? Who is using your content?
Who is using your content?
Crossref
 
Revelations about relations in connecting research: content types, data and i...
Revelations about relations in connecting research: content types, data and i...Revelations about relations in connecting research: content types, data and i...
Revelations about relations in connecting research: content types, data and i...
Jisc
 
Roy "Accelerating ML/AI Based R&D through Text & Data Mining"
Roy "Accelerating ML/AI Based R&D through Text & Data Mining"Roy "Accelerating ML/AI Based R&D through Text & Data Mining"
Roy "Accelerating ML/AI Based R&D through Text & Data Mining"
National Information Standards Organization (NISO)
 
FAIRy stories: tales from building the FAIR Research Commons
FAIRy stories: tales from building the FAIR Research CommonsFAIRy stories: tales from building the FAIR Research Commons
FAIRy stories: tales from building the FAIR Research Commons
Carole Goble
 
Why we need oa infrastructure - STM Association Beyond Open Access Seminar
Why we need oa infrastructure - STM Association Beyond Open Access SeminarWhy we need oa infrastructure - STM Association Beyond Open Access Seminar
Why we need oa infrastructure - STM Association Beyond Open Access Seminar
National Information Standards Organization (NISO)
 
VODAN Africa IN.pptx
VODAN Africa IN.pptxVODAN Africa IN.pptx
VODAN Africa IN.pptx
Getu Tadele
 
2013 CrossRef Workshops Text Data Mining Geoffrey Bilder
2013 CrossRef Workshops Text Data Mining Geoffrey Bilder2013 CrossRef Workshops Text Data Mining Geoffrey Bilder
2013 CrossRef Workshops Text Data Mining Geoffrey Bilder
Crossref
 
Multi-agent interactions on the Web through Linked Data Notifications
Multi-agent interactions on the Web through Linked Data NotificationsMulti-agent interactions on the Web through Linked Data Notifications
Multi-agent interactions on the Web through Linked Data Notifications
Jean-Paul Calbimonte
 
Crossref Content Registration - LIVE Mumbai
Crossref Content Registration - LIVE MumbaiCrossref Content Registration - LIVE Mumbai
Crossref Content Registration - LIVE Mumbai
Crossref
 
A comparative study between commercial and open source discovery tools
A comparative study between commercial and open source discovery toolsA comparative study between commercial and open source discovery tools
A comparative study between commercial and open source discovery tools
SusantaSethi3
 
Show and tell program 04 2014-09-04
Show and tell program 04 2014-09-04Show and tell program 04 2014-09-04
Show and tell program 04 2014-09-04
nihshowandtell
 
Evaluation of Web Scale Discovery Services
Evaluation of Web Scale Discovery ServicesEvaluation of Web Scale Discovery Services
Evaluation of Web Scale Discovery Services
Nikesh Narayanan
 
Webinar: Lucidworks + Thomson Reuters for Improved Investment Performance
Webinar: Lucidworks + Thomson Reuters for Improved Investment PerformanceWebinar: Lucidworks + Thomson Reuters for Improved Investment Performance
Webinar: Lucidworks + Thomson Reuters for Improved Investment Performance
Lucidworks
 
Internet and open source concepts
Internet and open source conceptsInternet and open source concepts
Internet and open source concepts
Sachidananda M H
 
UKSG 2018 Lightning Talk - Annotations as research objects: findable, indexab...
UKSG 2018 Lightning Talk - Annotations as research objects: findable, indexab...UKSG 2018 Lightning Talk - Annotations as research objects: findable, indexab...
UKSG 2018 Lightning Talk - Annotations as research objects: findable, indexab...
UKSG: connecting the knowledge community
 
Show and tell program 04 2014-09-04
Show and tell program 04 2014-09-04Show and tell program 04 2014-09-04
Show and tell program 04 2014-09-04David Phillips
 

Similar to Text and Data Mining (20)

Introduction to CrossRef Text and Data Mining Webinar
Introduction to CrossRef Text and Data Mining WebinarIntroduction to CrossRef Text and Data Mining Webinar
Introduction to CrossRef Text and Data Mining Webinar
 
UKSG Conference 2015 - CrossRef Text and Data Mining Services: one year in Ra...
UKSG Conference 2015 - CrossRef Text and Data Mining Services: one year in Ra...UKSG Conference 2015 - CrossRef Text and Data Mining Services: one year in Ra...
UKSG Conference 2015 - CrossRef Text and Data Mining Services: one year in Ra...
 
CrossRef Text and Data Mining
CrossRef Text and Data MiningCrossRef Text and Data Mining
CrossRef Text and Data Mining
 
CrossRef Text & Data Mining - UKSG 2015
CrossRef Text & Data Mining - UKSG 2015CrossRef Text & Data Mining - UKSG 2015
CrossRef Text & Data Mining - UKSG 2015
 
Who is using your content?
Who is using your content? Who is using your content?
Who is using your content?
 
Revelations about relations in connecting research: content types, data and i...
Revelations about relations in connecting research: content types, data and i...Revelations about relations in connecting research: content types, data and i...
Revelations about relations in connecting research: content types, data and i...
 
Roy "Accelerating ML/AI Based R&D through Text & Data Mining"
Roy "Accelerating ML/AI Based R&D through Text & Data Mining"Roy "Accelerating ML/AI Based R&D through Text & Data Mining"
Roy "Accelerating ML/AI Based R&D through Text & Data Mining"
 
FAIRy stories: tales from building the FAIR Research Commons
FAIRy stories: tales from building the FAIR Research CommonsFAIRy stories: tales from building the FAIR Research Commons
FAIRy stories: tales from building the FAIR Research Commons
 
Why we need oa infrastructure - STM Association Beyond Open Access Seminar
Why we need oa infrastructure - STM Association Beyond Open Access SeminarWhy we need oa infrastructure - STM Association Beyond Open Access Seminar
Why we need oa infrastructure - STM Association Beyond Open Access Seminar
 
VODAN Africa IN.pptx
VODAN Africa IN.pptxVODAN Africa IN.pptx
VODAN Africa IN.pptx
 
2013 CrossRef Workshops Text Data Mining Geoffrey Bilder
2013 CrossRef Workshops Text Data Mining Geoffrey Bilder2013 CrossRef Workshops Text Data Mining Geoffrey Bilder
2013 CrossRef Workshops Text Data Mining Geoffrey Bilder
 
Multi-agent interactions on the Web through Linked Data Notifications
Multi-agent interactions on the Web through Linked Data NotificationsMulti-agent interactions on the Web through Linked Data Notifications
Multi-agent interactions on the Web through Linked Data Notifications
 
Crossref Content Registration - LIVE Mumbai
Crossref Content Registration - LIVE MumbaiCrossref Content Registration - LIVE Mumbai
Crossref Content Registration - LIVE Mumbai
 
A comparative study between commercial and open source discovery tools
A comparative study between commercial and open source discovery toolsA comparative study between commercial and open source discovery tools
A comparative study between commercial and open source discovery tools
 
Show and tell program 04 2014-09-04
Show and tell program 04 2014-09-04Show and tell program 04 2014-09-04
Show and tell program 04 2014-09-04
 
Evaluation of Web Scale Discovery Services
Evaluation of Web Scale Discovery ServicesEvaluation of Web Scale Discovery Services
Evaluation of Web Scale Discovery Services
 
Webinar: Lucidworks + Thomson Reuters for Improved Investment Performance
Webinar: Lucidworks + Thomson Reuters for Improved Investment PerformanceWebinar: Lucidworks + Thomson Reuters for Improved Investment Performance
Webinar: Lucidworks + Thomson Reuters for Improved Investment Performance
 
Internet and open source concepts
Internet and open source conceptsInternet and open source concepts
Internet and open source concepts
 
UKSG 2018 Lightning Talk - Annotations as research objects: findable, indexab...
UKSG 2018 Lightning Talk - Annotations as research objects: findable, indexab...UKSG 2018 Lightning Talk - Annotations as research objects: findable, indexab...
UKSG 2018 Lightning Talk - Annotations as research objects: findable, indexab...
 
Show and tell program 04 2014-09-04
Show and tell program 04 2014-09-04Show and tell program 04 2014-09-04
Show and tell program 04 2014-09-04
 

More from Crossref

Crossref LIVE: The Benefits of Open Infrastructure (APAC time zones) - 29th O...
Crossref LIVE: The Benefits of Open Infrastructure (APAC time zones) - 29th O...Crossref LIVE: The Benefits of Open Infrastructure (APAC time zones) - 29th O...
Crossref LIVE: The Benefits of Open Infrastructure (APAC time zones) - 29th O...
Crossref
 
Crossref LIVE Chinese网络研讨会——Crossref简介 – 14 Oct 2021
Crossref LIVE Chinese网络研讨会——Crossref简介 – 14 Oct 2021  Crossref LIVE Chinese网络研讨会——Crossref简介 – 14 Oct 2021
Crossref LIVE Chinese网络研讨会——Crossref简介 – 14 Oct 2021
Crossref
 
Seminario web ‘Crossmark’, en español
Seminario web ‘Crossmark’, en español Seminario web ‘Crossmark’, en español
Seminario web ‘Crossmark’, en español
Crossref
 
Working with ROR as a Crossref member: what you need to know
Working with ROR as a Crossref member: what you need to knowWorking with ROR as a Crossref member: what you need to know
Working with ROR as a Crossref member: what you need to know
Crossref
 
Преимущества и варианты использования метаданных в Crossref / The Value and ...
Преимущества и варианты использования метаданных в Crossref /  The Value and ...Преимущества и варианты использования метаданных в Crossref /  The Value and ...
Преимущества и варианты использования метаданных в Crossref / The Value and ...
Crossref
 
Seminario web ‘Similarity Check’, en español
Seminario web ‘Similarity Check’, en españolSeminario web ‘Similarity Check’, en español
Seminario web ‘Similarity Check’, en español
Crossref
 
Crossref LIVE Indonesia: One Search Platform (Drs. Muhammad Syarif Bando pres...
Crossref LIVE Indonesia: One Search Platform (Drs. Muhammad Syarif Bando pres...Crossref LIVE Indonesia: One Search Platform (Drs. Muhammad Syarif Bando pres...
Crossref LIVE Indonesia: One Search Platform (Drs. Muhammad Syarif Bando pres...
Crossref
 
Crossref LIVE Indonesia: The Future of Indonesian Journal Policy (with Dr. Lu...
Crossref LIVE Indonesia: The Future of Indonesian Journal Policy (with Dr. Lu...Crossref LIVE Indonesia: The Future of Indonesian Journal Policy (with Dr. Lu...
Crossref LIVE Indonesia: The Future of Indonesian Journal Policy (with Dr. Lu...
Crossref
 
Crossref LIVE Indonesia: The Value and Use of Crossref Metadata, CRLIVE-ID 15...
Crossref LIVE Indonesia: The Value and Use of Crossref Metadata, CRLIVE-ID 15...Crossref LIVE Indonesia: The Value and Use of Crossref Metadata, CRLIVE-ID 15...
Crossref LIVE Indonesia: The Value and Use of Crossref Metadata, CRLIVE-ID 15...
Crossref
 
Crossref LIVE Indonesia: Content Registration at Crossref, CRLIVE-ID 14 July ...
Text and Data Mining

  • 2. What is text and data mining? Text Mining is an interdisciplinary field combining techniques from linguistics, computer science and statistics to build tools that can efficiently retrieve and extract information from digital text. http://blogs.plos.org/everyone/2013/04/17/announcing-the-plos-text-mining-collection/ It uses powerful computers to find links between drugs and side effects, or genes and diseases, that are hidden within the vast scientific literature. These are discoveries that a person scouring through papers one by one may never notice. http://www.theguardian.com/science/2012/may/23/text-mining-research-tool-forbidden
  • 3. What is the issue? • Researchers find it impractical to negotiate multiple bilateral agreements with hundreds of subscription-based publishers in order to authorize TDM of subscribed content. • Subscription-based publishers find it impractical to negotiate multiple bilateral agreements with thousands of researchers and institutions in order to authorize TDM of subscribed content. • All parties would benefit from support of standard APIs and data representations in order to enable TDM across both open access and subscription-based publishers.
  • 4. How to solve it? • Crossref REST API: designed to allow researchers to easily harvest full text documents from all participating publishers regardless of their business model (e.g. open access, subscription).
  • 5. Common API Summary • Content Negotiation (Required) • New Metadata (Required) • Full text URIs • License URIs • Rate Limiting Headers (optional)
  • 7. Step 1: A researcher identifies the articles they are interested in The search engines they use bring back results from lots of different publishers. They can also use CrossRef to search.
  • 8. The searches they run bring back results showing publications from a range of publishers, in different locations and using different business models. The challenge is to harvest all these articles in order to be able to mine them, without engaging in individual transactions with each publisher.
  • 9. How to do that? Each of those articles has a DOI, or digital object identifier. Each DOI is unique and identifies the paper. Researchers are familiar with DOIs and are used to working with them. Search engines will allow them to download DOIs as a list, so the researcher does not need to go to each paper to extract the DOI from it.
  • 10. 2. The researcher takes the DOIs that correspond to the articles they are interested in 10.5555/12345678 10.5556/12345679 10.1016/12345680 10.8080/12345681 10.1155/12345682 10.1100/12345683 10.5555/12345684 10.1007/12345685 10.1111/12345686 10.2406/12345687 10.3994/12345688 10.5006/12345689 Click to download
  • 11. 3. The researcher gives this list to the CrossRef Text and Data Mining API, which tells them: • Where the full text is located • What they are allowed to do with it
  • 12. What are they allowed to do with the content? This is communicated by licence information that publishers give to CrossRef. Some publishers ask researchers to agree to an additional licence to be able to use their content for mining. Researchers are able to log in to CrossRef TDM with their ORCID iD, where they can view and accept publisher licences all in one place. No multiple actions are needed. The publishers do not charge researchers for this, and CrossRef does not charge researchers for the service.
  • 13. 4. The researcher uses that information to go directly to each publisher via CrossRef. It is a central channel for them to visit thousands of publishers via one request or transaction, where they will be identified in one of a number of ways: • No identification (Open Access content) • IP recognition/log in credentials • IP recognition/log in credentials + CrossRef token (API key) from the TDM service
  • 14. Benefits • Streamlines researcher access to distributed full text for TDM • Enables machine-to-machine, automated access for recognized TDM (i.e. researchers won’t be locked out of publisher sites) • Enables article-level licensing info and easy mechanism for supplemental T&Cs for text and data mining (publishers discussing model license via STM)
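
Step 3 in the slides above (finding where the full text is located and what may be done with it) can be sketched in Python. This is a minimal illustration, not CrossRef's own client code. It assumes a metadata record shaped like the `message` object of a Crossref REST API works response, with a `link` list (full-text URIs) and a `license` list (licence URIs); the field names reflect my understanding of that API, and the sample record and its `publisher.example` URLs are hypothetical.

```python
from typing import Any, Dict, List


def extract_tdm_info(work: Dict[str, Any]) -> Dict[str, List[str]]:
    """Collect full-text URLs and licence URLs from one metadata record."""
    # Keep only links the publisher has flagged for text mining (assumed field).
    full_text = [
        link["URL"]
        for link in work.get("link", [])
        if link.get("intended-application") == "text-mining"
    ]
    # Every licence URI attached to the record.
    licenses = [lic["URL"] for lic in work.get("license", [])]
    return {"full_text": full_text, "licenses": licenses}


# Hypothetical record for one DOI from the example list in the slides.
sample_work = {
    "DOI": "10.5555/12345678",
    "link": [
        {
            "URL": "https://publisher.example/fulltext/12345678.xml",
            "content-type": "application/xml",
            "intended-application": "text-mining",
        },
        {
            "URL": "https://publisher.example/check/12345678.pdf",
            "content-type": "application/pdf",
            "intended-application": "similarity-checking",
        },
    ],
    "license": [{"URL": "https://creativecommons.org/licenses/by/4.0/"}],
}

info = extract_tdm_info(sample_work)
print(info["full_text"])
print(info["licenses"])
```

In a real harvest the record would come from a metadata lookup per DOI, and the researcher would check each licence URI against the licences they have accepted before requesting the full text from the publisher.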

Editor's Notes

  1. Why did CrossRef develop this service? Applies to OA content too. Let’s just illustrate these issues.
  2. The CrossRef Common API is the main aspect of this service and is designed to allow researchers to easily harvest full text documents from all participating publishers regardless of their business model (e.g. open access, subscription). It makes use of CrossRef DOI content negotiation to provide researchers with links to the full text of content located on the publisher’s site. The publisher remains responsible for actually delivering the full text of the content requested. Thus, open access publishers can simply deliver the requested content while subscription based publishers continue to support subscriptions using their existing access control systems.
  3. Wide range of papers from a wide range of publishers – spread of business models and geographical locations.
  4. Explain API = basically an interface that software uses to interact with other software.
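
The notes and slides mention content negotiation, an optional CrossRef token, and optional rate limiting headers. The sketch below shows how a mining script might prepare request headers and read rate limit information from a publisher's response. It is a sketch under stated assumptions: the header names `CR-Clickthrough-Client-Token`, `CR-TDM-Rate-Limit`, `CR-TDM-Rate-Limit-Remaining` and `CR-TDM-Rate-Limit-Reset` are recalled from Crossref's TDM documentation rather than taken from these slides, and the function names and the token value are hypothetical.

```python
from typing import Dict, Mapping, Optional

# Assumed header names; they are not stated in the slides.
TOKEN_HEADER = "CR-Clickthrough-Client-Token"
RATE_HEADERS = {
    "limit": "CR-TDM-Rate-Limit",
    "remaining": "CR-TDM-Rate-Limit-Remaining",
    "reset": "CR-TDM-Rate-Limit-Reset",
}


def build_request_headers(accept: str, token: Optional[str] = None) -> Dict[str, str]:
    """Headers for a full-text request: desired format plus optional token."""
    headers = {"Accept": accept}
    if token:
        headers[TOKEN_HEADER] = token
    return headers


def parse_rate_limit(response_headers: Mapping[str, str]) -> Dict[str, Optional[int]]:
    """Read the optional rate limit headers; missing ones become None."""
    # HTTP header names are case-insensitive, so compare in lower case.
    lowered = {name.lower(): value for name, value in response_headers.items()}
    result: Dict[str, Optional[int]] = {}
    for field, name in RATE_HEADERS.items():
        raw = lowered.get(name.lower())
        result[field] = int(raw) if raw is not None else None
    return result


request_headers = build_request_headers("application/xml", token="hypothetical-token")
limits = parse_rate_limit({"cr-tdm-rate-limit": "1500", "CR-TDM-Rate-Limit-Remaining": "76"})
print(request_headers)
print(limits)
```

Because the rate limiting headers are optional, the parser returns `None` for any that a publisher does not send, and a script can fall back to its own conservative delay in that case.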