OpenHarvester
Muhammad Javed, Ph.D.
Ontology Engineer

Tech Lead (Scholars@Cornell)

Cornell University Library
A Java prototype that processes the result set of pre-downloaded data
(from a database) and lets researchers claim their publications from a
ranked list.
WORK ZONE AHEAD
Next 10 Minutes
This is preliminary work
Three reasons why I did this preliminary work:
Reason 1: Symplectic Elements does not “search” for publications in
either CrossRef or Europe PubMed Central (supplementary sources).
Reason 2: Adrenaline in my veins to explore what data can be
harvested using open citation APIs.
Reason 2.1: Now that I can access the data, can I harvest publications
for a researcher?
Reason 3: To understand what data is required to successfully
find publications for a researcher.
Limitations:
1. Works on pre-downloaded data.
   Step 1: Search database and download result set.
   Step 2: Process result set and harvest researcher’s publications.
2. No name diacritic handling.
3. Name handling requires some more tweaks.
4. and more..
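One common way to approach the diacritic limitation is to fold accented characters before comparing names. The sketch below is an assumption about how that could be done, not part of the prototype; the class name NameNormalizer and the method fold are hypothetical. It uses the standard java.text.Normalizer to decompose characters and then strips combining marks.

```java
import java.text.Normalizer;
import java.util.Locale;

// Hypothetical helper, not taken from the prototype.
final class NameNormalizer {

    // Decompose accented characters (NFD), drop the combining marks,
    // then lower-case so that comparisons ignore accents and case.
    static String fold(String name) {
        String decomposed = Normalizer.normalize(name, Normalizer.Form.NFD);
        String withoutMarks = decomposed.replaceAll("\\p{M}+", "");
        return withoutMarks.toLowerCase(Locale.ROOT).trim();
    }

    public static void main(String[] args) {
        // Unicode escapes keep the source file ASCII-only.
        System.out.println(fold("Jos\u00e9 Mu\u00f1oz"));
    }
}
```

Letters that have no canonical decomposition (for example ø or ł) pass through unchanged, so a fuller solution would need an extra mapping table.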
[Workflow diagram]
User Profile, name variants: Dean Krafft, Dean B. Krafft, D. Krafft, Dean Blackmar Krafft, …
1. Name (String), e.g. Dean Krafft
2. Search name in the author list (Dean OR Krafft)
3. Result Set: set of publications
4. Ranking: ranked list of publications
5. List Review (from top)
6. Claim a Publication (into the list of my publications)
7. Update Profile
8. Re-Ranking (based on updated profile)
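The search and name-variant steps of this workflow can be sketched as follows. This is a minimal illustration under assumptions: the class AuthorListSearch and its methods are hypothetical names, and the matching rules (whole-token OR match for the search, token-wise equality for name variants) are a guess at the intent of the diagram rather than the prototype's actual logic.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Hypothetical helper, not taken from the prototype.
final class AuthorListSearch {

    // Split a name into lower-case letter-only tokens: "Dean B. Krafft" -> [dean, b, krafft].
    static List<String> tokens(String name) {
        List<String> out = new ArrayList<>();
        for (String t : name.toLowerCase(Locale.ROOT).split("[^\\p{L}]+")) {
            if (!t.isEmpty()) out.add(t);
        }
        return out;
    }

    // "Dean OR Krafft": true when any query token occurs as a whole token in any author name.
    static boolean matchesAnyToken(String query, List<String> authors) {
        List<String> queryTokens = tokens(query);
        for (String author : authors) {
            for (String token : tokens(author)) {
                if (queryTokens.contains(token)) return true;
            }
        }
        return false;
    }

    // True when some author has exactly the same tokens as one of the profile's name variants.
    static boolean matchesVariant(List<String> variants, List<String> authors) {
        for (String variant : variants) {
            for (String author : authors) {
                if (tokens(variant).equals(tokens(author))) return true;
            }
        }
        return false;
    }
}
```

The OR match is deliberately broad so that the result set is large, while the variant match is strict and could feed the ranking step.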
User Profile:
1. Name variants
2. Start/end year
3. Affiliations
4. Co-authors
5. Subject areas
6. Identifiers (personal)
7. Identifiers (publications)
8. and more…
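The profile fields listed above could be held in a simple Java class such as the one below. This is only a sketch: the class name UserProfile, the field names, and the coversYear helper are assumptions made for illustration, and the examples in the comments (ORCID iD, DOI) are guesses at what the identifiers might be.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical data holder mirroring the listed profile fields.
final class UserProfile {
    final List<String> nameVariants = new ArrayList<>();
    Integer startYear;  // null until known
    Integer endYear;    // null until known
    final List<String> affiliations = new ArrayList<>();
    final List<String> coAuthors = new ArrayList<>();
    final List<String> subjectAreas = new ArrayList<>();
    final List<String> personalIdentifiers = new ArrayList<>();     // assumption: e.g. an ORCID iD
    final List<String> publicationIdentifiers = new ArrayList<>();  // assumption: e.g. DOIs

    // True when the year falls inside the known range; an unknown bound is treated as open.
    boolean coversYear(int year) {
        return (startYear == null || year >= startYear)
            && (endYear == null || year <= endYear);
    }
}
```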
Two Step Process
Step 1: Download Data
java -jar CrossrefDataDownloader.jar Dean+Krafft CROSSREF
(arguments: search string, output base-folder)
Downloads the result set and saves the files in a folder named “Dean+Krafft”
Step 2: Process Result Set
Processes the result set files from the folder “Dean+Krafft”
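To make the two command-line arguments concrete, the sketch below shows how a search string and an output base-folder might be turned into a request URL and a target folder. It rests on assumptions: the slides do not say which endpoint CrossrefDataDownloader.jar calls or how the folders are nested, so the use of the public Crossref REST API parameters query.author and rows, the placement of the search-string folder inside the base-folder, and the names DownloadPlan, crossrefQuery and outputFolder are all illustrative. The code only builds the values and makes no network call.

```java
import java.net.URI;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper, not taken from CrossrefDataDownloader.jar.
final class DownloadPlan {

    // Assumption: query the Crossref REST API by author name.
    // The search string is expected to be URL-ready already, e.g. "Dean+Krafft".
    static URI crossrefQuery(String searchString, int rows) {
        return URI.create("https://api.crossref.org/works?query.author="
                + searchString + "&rows=" + rows);
    }

    // Assumption: result files go into <base-folder>/<search string>.
    static Path outputFolder(String baseFolder, String searchString) {
        return Paths.get(baseFolder, searchString);
    }

    public static void main(String[] args) {
        System.out.println(crossrefQuery("Dean+Krafft", 100));
        System.out.println(outputFolder("CROSSREF", "Dean+Krafft"));
    }
}
```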
Homepage View
Search Database View
Claim Publications View (1)
Claim Publications View (2)
Name Variant added
Co-authors added
Claimed Publications List
Claim Publications View (3)
Claimed publication list can
be downloaded in multiple
formats.
VIVO CONNECTOR
OpenHarvester
User Profile
Search Publications
Load Claimed Publications in VIVO
[Workflow diagram with extended user profile]
User Profile:
Name variants: Dean Krafft, Dean B. Krafft, D. Krafft, Dean Blackmar Krafft, …
Affiliation: Cornell University Library; Department of Computer Science; Cornell University
Co-Authors: Simeon Warner; Carl Lagoze; ….
Year Range: 1978 (start), 2016 (end)
Subject: Computer Science; Library & Info. Sci.
1. Name (String), e.g. Dean Krafft
2. Search name in the author list (Dean OR Krafft)
3. Result Set: list of publications
4. Ranking: ranked list of publications
5. List Review (from top)
6. Claim a Publication (into the list of my publications)
7. Update Profile
8. Re-Ranking (based on updated profile)
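The re-ranking step can be illustrated with a small scoring function over the profile fields shown in this diagram. This is a sketch under assumptions, not the prototype's ranking: the class ProfileRanker, its nested types, and the weights (2 per known co-author, 3 per known affiliation, 1 for a year inside the range) are all invented for illustration.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.Set;

// Hypothetical ranking sketch; weights are arbitrary assumptions.
final class ProfileRanker {

    static final class Publication {
        final String title;
        final int year;
        final List<String> authors;
        final List<String> affiliations;

        Publication(String title, int year, List<String> authors, List<String> affiliations) {
            this.title = title;
            this.year = year;
            this.authors = authors;
            this.affiliations = affiliations;
        }
    }

    static final class Profile {
        final Set<String> coAuthors;
        final Set<String> affiliations;
        final int startYear;
        final int endYear;

        Profile(Set<String> coAuthors, Set<String> affiliations, int startYear, int endYear) {
            this.coAuthors = coAuthors;
            this.affiliations = affiliations;
            this.startYear = startYear;
            this.endYear = endYear;
        }
    }

    // Higher score = more evidence that the publication belongs to the profile owner.
    static int score(Publication p, Profile profile) {
        int s = 0;
        for (String author : p.authors) {
            if (profile.coAuthors.contains(author)) s += 2;
        }
        for (String affiliation : p.affiliations) {
            if (profile.affiliations.contains(affiliation)) s += 3;
        }
        if (p.year >= profile.startYear && p.year <= profile.endYear) s += 1;
        return s;
    }

    // Returns a new list sorted by descending score; ties keep their original order.
    static List<Publication> rank(List<Publication> results, Profile profile) {
        List<Publication> ranked = new ArrayList<>(results);
        ranked.sort(Comparator.comparingInt((Publication p) -> score(p, profile)).reversed());
        return ranked;
    }
}
```

Calling rank again after each profile update (for example after a newly claimed publication adds a co-author) would give the re-ranked list.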

Open Harvester - Search publications for a researcher from CrossRef, PubMed and DBLP
