What can we learn from topic modeling on 350M documents? 
William Gunn 
Head of Academic Outreach 
Mendeley 
@mrgunn – https://orcid.org/0000-0002-3555-2054
Who am I? 
PhD Biomedical Science 
I've been active in online science communities since 1995 
Established the community program at Mendeley – 1700 advisors from 650 schools in 60 countries. 
Lead the outreach to librarian, academic research, and tech communities
Based in London, Mendeley is researchers, graduates and software developers from...
Two new approaches 
Embed a tool within the researcher workflow to capture data 
Capture new kinds of data – usage of research objects, not just citations of papers.
Mendeley extracts research data…
...and aggregates data in the cloud
Collecting rich signals from domain experts.
Rich user profile data
TEAM Project 
academic knowledge management solutions 
•Algorithms to determine the content similarity of academic papers 
•Performing text disambiguation and entity recognition to differentiate between and relate similar in-text entities and authors of research papers. 
•Developing semantic technologies and semantic web languages with the focus of metadata integration/validation 
•Investigate profiling and user analysis technologies, e.g. based on search logs and document interaction. 
•We will also improve folksonomies and, through that, ontologies of text. 
•Finally, tagging behaviour will be analysed to improve tag recommendations and strategies. 
•http://team-project.tugraz.at/blog/
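The first bullet above, content similarity between papers, can be illustrated with a minimal sketch. It assumes a plain TF-IDF weighting with cosine similarity, which is a common baseline and not necessarily what the TEAM project uses; the function names and the three sample abstracts are hypothetical.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase and keep runs of letters; a deliberately crude tokenizer."""
    return re.findall(r"[a-z]+", text.lower())


def tfidf_vectors(docs):
    """Return one sparse TF-IDF vector (dict of term -> weight) per document."""
    tokenized = [tokenize(d) for d in docs]
    n_docs = len(tokenized)
    # Document frequency: the number of documents each term appears in.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        tf = Counter(tokens)
        # Smoothed IDF, so a term found in every document keeps a nonzero weight.
        vectors.append({
            term: count * (math.log((1 + n_docs) / (1 + df[term])) + 1.0)
            for term, count in tf.items()
        })
    return vectors


def cosine(a, b):
    """Cosine similarity between two sparse vectors."""
    dot = sum(weight * b.get(term, 0.0) for term, weight in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


# Hypothetical abstracts, made up for illustration only.
abstracts = [
    "Enzymatic catalysis of the hydrolysis reaction in yeast cells",
    "Catalysis and reaction kinetics of an enzymatic hydrolysis process",
    "Topic models for recommending research papers to software developers",
]
vecs = tfidf_vectors(abstracts)
sim_bio = cosine(vecs[0], vecs[1])    # two biochemistry abstracts
sim_cross = cosine(vecs[0], vecs[2])  # biochemistry vs. computer science
```

The two biochemistry abstracts share several terms and so score higher than the cross-discipline pair, which shares none. A measure like this sees only shared word forms, not meaning.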
Semantics vs. Syntax 
•Language expresses semantics via syntax 
•Syntax is all a computer sees in a research article. 
•How do we get to semantics? 
•Topic Modeling!
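To make "topic modeling" concrete, here is a small sketch of one widely used technique, latent Dirichlet allocation (LDA) fitted by collapsed Gibbs sampling. The slides do not say which algorithm or settings Mendeley used, so the choice of LDA, the hyperparameters `alpha` and `beta`, the function name `lda_gibbs`, and the toy documents are all assumptions made for illustration.

```python
import random


def lda_gibbs(docs, n_topics, n_iters=200, alpha=0.1, beta=0.01, seed=0):
    """Collapsed Gibbs sampling for LDA on tokenized docs (lists of words)."""
    rng = random.Random(seed)
    vocab = sorted({w for doc in docs for w in doc})
    word_id = {w: i for i, w in enumerate(vocab)}
    n_vocab = len(vocab)

    # Count tables: tokens per (document, topic) and per (topic, word).
    doc_topic = [[0] * n_topics for _ in docs]
    topic_word = [[0] * n_vocab for _ in range(n_topics)]
    topic_total = [0] * n_topics
    assignments = []

    # Start from a random topic assignment for every token.
    for d, doc in enumerate(docs):
        z_doc = []
        for w in doc:
            k = rng.randrange(n_topics)
            z_doc.append(k)
            doc_topic[d][k] += 1
            topic_word[k][word_id[w]] += 1
            topic_total[k] += 1
        assignments.append(z_doc)

    for _ in range(n_iters):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                wid = word_id[w]
                k_old = assignments[d][i]
                # Take the token out of the counts before resampling it.
                doc_topic[d][k_old] -= 1
                topic_word[k_old][wid] -= 1
                topic_total[k_old] -= 1
                # Unnormalized probability of each topic given all other tokens.
                weights = [
                    (doc_topic[d][k] + alpha)
                    * (topic_word[k][wid] + beta)
                    / (topic_total[k] + beta * n_vocab)
                    for k in range(n_topics)
                ]
                k_new = rng.choices(range(n_topics), weights=weights)[0]
                assignments[d][i] = k_new
                doc_topic[d][k_new] += 1
                topic_word[k_new][wid] += 1
                topic_total[k_new] += 1

    # Smoothed topic proportions for each document.
    theta = [
        [(c + alpha) / (len(doc) + alpha * n_topics) for c in counts]
        for doc, counts in zip(docs, doc_topic)
    ]
    # The three most frequently assigned words for each topic.
    top_words = [
        [vocab[i] for i in sorted(range(n_vocab), key=lambda i: -row[i])[:3]]
        for row in topic_word
    ]
    return theta, top_words


# Hypothetical, pre-tokenized toy documents.
docs = [
    "enzyme reaction catalysis enzyme protein".split(),
    "protein catalysis reaction enzyme".split(),
    "network software algorithm network".split(),
    "algorithm software network learning".split(),
]
theta, top_words = lda_gibbs(docs, n_topics=2)
for k, words in enumerate(top_words):
    print(f"topic {k}: {words}")
```

The model receives only word counts (syntax) and returns, for each document, a mixture over topics that a reader can then interpret (semantics). The sampler is stochastic, so the topics found depend on the seed and on the corpus.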
Distribution of Topics 
[Bar chart; y-axis 0% to 35%; categories: Bio, Phys, Engineer, Comp Sci, Psych & Edu, Business, Law, Other]
Subcategories of Comp. Sci. 
[Bar chart; y-axis 0% to 20%; categories: AI, HCI, Info Sci, Software Eng, Networks]
Generated topics – Comp. Sci.
Generated Topics - Biology
Categorization As A Process 
Thing 
Process 
Reaction 
Catalysis 
Enzymatic
Categorization is imperfect
Categories change over time
Code Project 
Use case = mining research papers for facts to add to LOD repositories and light-weight ontologies. 
•Crowd-sourcing enabled semantic enrichment & integration techniques for integrating facts contained in unstructured information into the LOD cloud 
•Federated, provenance-enabled querying methods for fact discovery in LOD repositories 
•Web-based visual analysis interfaces to support human based analysis, integration and organisation of facts 
•Socio-economic factors – roles, revenue-models and value chains – realisable in the envisioned ecosystem. 
•http://code-research.eu/
Metrics as a discovery tool
Google Analytics for Research
Building a reproducibility dataset 
•Mendeley and Science Exchange have started the Reproducibility Initiative 
•working with Figshare & PLOS to host data & replication reports 
•building open datasets backing high-impact work 
•extending the “executable paper” concept to biomedical research
Make it porous & part of the web. 
All these examples show that people's main motivation for getting data (pictures, bookmarks, etc.) off their computers and onto the web is that it helps them find more of the same. 
Communities must be open if they are to thrive.
www.mendeley.com 
william.gunn@mendeley.com @mrgunn

More Related Content

What's hot

International Journal of Data Mining & Knowledge Management Process ( IJDKP )
International Journal of Data Mining & Knowledge Management Process ( IJDKP )International Journal of Data Mining & Knowledge Management Process ( IJDKP )
International Journal of Data Mining & Knowledge Management Process ( IJDKP )
IJDKP
 
International Journal of Data Mining & Knowledge Management Process ( IJDKP )
International Journal of Data Mining & Knowledge Management Process ( IJDKP )International Journal of Data Mining & Knowledge Management Process ( IJDKP )
International Journal of Data Mining & Knowledge Management Process ( IJDKP )
albert ca
 
International Journal of Data Mining & Knowledge Management Process(IJDKP)
International Journal of Data Mining & Knowledge Management Process(IJDKP)International Journal of Data Mining & Knowledge Management Process(IJDKP)
International Journal of Data Mining & Knowledge Management Process(IJDKP)
albert ca
 
Introduction to UC San Diego’s Integrated Digital Infrastructure
Introduction to UC San Diego’s Integrated Digital InfrastructureIntroduction to UC San Diego’s Integrated Digital Infrastructure
Introduction to UC San Diego’s Integrated Digital Infrastructure
Larry Smarr
 
AI-SDV 2021: Jay ven Eman - implementation-of-new-technology-within-a-big-pha...
AI-SDV 2021: Jay ven Eman - implementation-of-new-technology-within-a-big-pha...AI-SDV 2021: Jay ven Eman - implementation-of-new-technology-within-a-big-pha...
AI-SDV 2021: Jay ven Eman - implementation-of-new-technology-within-a-big-pha...
Dr. Haxel Consult
 
International Journal of Data Mining & Knowledge Management Process(IJDKP)
International Journal of Data Mining & Knowledge Management Process(IJDKP)International Journal of Data Mining & Knowledge Management Process(IJDKP)
International Journal of Data Mining & Knowledge Management Process(IJDKP)
albert ca
 
International Journal of Data Mining & Knowledge Management Process ( IJDKP )
International Journal of Data Mining & Knowledge Management Process ( IJDKP )International Journal of Data Mining & Knowledge Management Process ( IJDKP )
International Journal of Data Mining & Knowledge Management Process ( IJDKP )
IJDKP
 
Call for Papers - International Journal of Data Mining & Knowledge Management...
Call for Papers - International Journal of Data Mining & Knowledge Management...Call for Papers - International Journal of Data Mining & Knowledge Management...
Call for Papers - International Journal of Data Mining & Knowledge Management...
IJDKP
 
Project Topics in Data Mining
Project Topics in Data MiningProject Topics in Data Mining
Project Topics in Data Mining
Phdtopiccom
 
International Journal of Data Mining & Knowledge Management Process ( IJDKP )
International Journal of Data Mining & Knowledge Management Process ( IJDKP )International Journal of Data Mining & Knowledge Management Process ( IJDKP )
International Journal of Data Mining & Knowledge Management Process ( IJDKP )
IJDKP
 
Web Mining Project Ideas
Web Mining Project IdeasWeb Mining Project Ideas
Web Mining Project Ideas
Phdtopiccom
 
International Journal of Data Mining & Knowledge Management Process ( IJDKP )
International Journal of Data Mining & Knowledge Management Process ( IJDKP )International Journal of Data Mining & Knowledge Management Process ( IJDKP )
International Journal of Data Mining & Knowledge Management Process ( IJDKP )
IJDKP
 
Data Mining & Knowledge Management Process (IJDKP)
Data Mining & Knowledge Management Process (IJDKP)Data Mining & Knowledge Management Process (IJDKP)
Data Mining & Knowledge Management Process (IJDKP)
IJDKP
 
Demonstrating a Framework for KOS-based Recommendations Systems
Demonstrating a Framework for KOS-based Recommendations SystemsDemonstrating a Framework for KOS-based Recommendations Systems
Demonstrating a Framework for KOS-based Recommendations SystemsGESIS
 
International Journal of Data Mining & Knowledge Management Process ( IJDKP )
International Journal of Data Mining & Knowledge Management Process ( IJDKP )International Journal of Data Mining & Knowledge Management Process ( IJDKP )
International Journal of Data Mining & Knowledge Management Process ( IJDKP )
IJDKP
 
International Journal of Data Mining & Knowledge Management Process ( IJDKP )
International Journal of Data Mining & Knowledge Management Process ( IJDKP )International Journal of Data Mining & Knowledge Management Process ( IJDKP )
International Journal of Data Mining & Knowledge Management Process ( IJDKP )
IJDKP
 
International Journal of Data Mining & Knowledge Management Process ( IJDKP )
International Journal of Data Mining & Knowledge Management Process ( IJDKP )International Journal of Data Mining & Knowledge Management Process ( IJDKP )
International Journal of Data Mining & Knowledge Management Process ( IJDKP )
IJDKP
 
International Journal of Data Mining & Knowledge Management Process ( IJDKP )
International Journal of Data Mining & Knowledge Management Process ( IJDKP )International Journal of Data Mining & Knowledge Management Process ( IJDKP )
International Journal of Data Mining & Knowledge Management Process ( IJDKP )
IJDKP
 
International Journal of Data Mining & Knowledge Management Process ( IJDKP )
International Journal of Data Mining & Knowledge Management Process ( IJDKP )International Journal of Data Mining & Knowledge Management Process ( IJDKP )
International Journal of Data Mining & Knowledge Management Process ( IJDKP )
IJDKP
 
NISO/NFAIS Joint Virtual Conference: Connecting the Library to the Wider Wor...
NISO/NFAIS Joint Virtual Conference:  Connecting the Library to the Wider Wor...NISO/NFAIS Joint Virtual Conference:  Connecting the Library to the Wider Wor...
NISO/NFAIS Joint Virtual Conference: Connecting the Library to the Wider Wor...
National Information Standards Organization (NISO)
 

What's hot (20)

International Journal of Data Mining & Knowledge Management Process ( IJDKP )
International Journal of Data Mining & Knowledge Management Process ( IJDKP )International Journal of Data Mining & Knowledge Management Process ( IJDKP )
International Journal of Data Mining & Knowledge Management Process ( IJDKP )
 
International Journal of Data Mining & Knowledge Management Process ( IJDKP )
International Journal of Data Mining & Knowledge Management Process ( IJDKP )International Journal of Data Mining & Knowledge Management Process ( IJDKP )
International Journal of Data Mining & Knowledge Management Process ( IJDKP )
 
International Journal of Data Mining & Knowledge Management Process(IJDKP)
International Journal of Data Mining & Knowledge Management Process(IJDKP)International Journal of Data Mining & Knowledge Management Process(IJDKP)
International Journal of Data Mining & Knowledge Management Process(IJDKP)
 
Introduction to UC San Diego’s Integrated Digital Infrastructure
Introduction to UC San Diego’s Integrated Digital InfrastructureIntroduction to UC San Diego’s Integrated Digital Infrastructure
Introduction to UC San Diego’s Integrated Digital Infrastructure
 
AI-SDV 2021: Jay ven Eman - implementation-of-new-technology-within-a-big-pha...
AI-SDV 2021: Jay ven Eman - implementation-of-new-technology-within-a-big-pha...AI-SDV 2021: Jay ven Eman - implementation-of-new-technology-within-a-big-pha...
AI-SDV 2021: Jay ven Eman - implementation-of-new-technology-within-a-big-pha...
 
International Journal of Data Mining & Knowledge Management Process(IJDKP)
International Journal of Data Mining & Knowledge Management Process(IJDKP)International Journal of Data Mining & Knowledge Management Process(IJDKP)
International Journal of Data Mining & Knowledge Management Process(IJDKP)
 
International Journal of Data Mining & Knowledge Management Process ( IJDKP )
International Journal of Data Mining & Knowledge Management Process ( IJDKP )International Journal of Data Mining & Knowledge Management Process ( IJDKP )
International Journal of Data Mining & Knowledge Management Process ( IJDKP )
 
Call for Papers - International Journal of Data Mining & Knowledge Management...
Call for Papers - International Journal of Data Mining & Knowledge Management...Call for Papers - International Journal of Data Mining & Knowledge Management...
Call for Papers - International Journal of Data Mining & Knowledge Management...
 
Project Topics in Data Mining
Project Topics in Data MiningProject Topics in Data Mining
Project Topics in Data Mining
 
International Journal of Data Mining & Knowledge Management Process ( IJDKP )
International Journal of Data Mining & Knowledge Management Process ( IJDKP )International Journal of Data Mining & Knowledge Management Process ( IJDKP )
International Journal of Data Mining & Knowledge Management Process ( IJDKP )
 
Web Mining Project Ideas
Web Mining Project IdeasWeb Mining Project Ideas
Web Mining Project Ideas
 
International Journal of Data Mining & Knowledge Management Process ( IJDKP )
International Journal of Data Mining & Knowledge Management Process ( IJDKP )International Journal of Data Mining & Knowledge Management Process ( IJDKP )
International Journal of Data Mining & Knowledge Management Process ( IJDKP )
 
Data Mining & Knowledge Management Process (IJDKP)
Data Mining & Knowledge Management Process (IJDKP)Data Mining & Knowledge Management Process (IJDKP)
Data Mining & Knowledge Management Process (IJDKP)
 
Demonstrating a Framework for KOS-based Recommendations Systems
Demonstrating a Framework for KOS-based Recommendations SystemsDemonstrating a Framework for KOS-based Recommendations Systems
Demonstrating a Framework for KOS-based Recommendations Systems
 
International Journal of Data Mining & Knowledge Management Process ( IJDKP )
International Journal of Data Mining & Knowledge Management Process ( IJDKP )International Journal of Data Mining & Knowledge Management Process ( IJDKP )
International Journal of Data Mining & Knowledge Management Process ( IJDKP )
 
International Journal of Data Mining & Knowledge Management Process ( IJDKP )
International Journal of Data Mining & Knowledge Management Process ( IJDKP )International Journal of Data Mining & Knowledge Management Process ( IJDKP )
International Journal of Data Mining & Knowledge Management Process ( IJDKP )
 
International Journal of Data Mining & Knowledge Management Process ( IJDKP )
International Journal of Data Mining & Knowledge Management Process ( IJDKP )International Journal of Data Mining & Knowledge Management Process ( IJDKP )
International Journal of Data Mining & Knowledge Management Process ( IJDKP )
 
International Journal of Data Mining & Knowledge Management Process ( IJDKP )
International Journal of Data Mining & Knowledge Management Process ( IJDKP )International Journal of Data Mining & Knowledge Management Process ( IJDKP )
International Journal of Data Mining & Knowledge Management Process ( IJDKP )
 
International Journal of Data Mining & Knowledge Management Process ( IJDKP )
International Journal of Data Mining & Knowledge Management Process ( IJDKP )International Journal of Data Mining & Knowledge Management Process ( IJDKP )
International Journal of Data Mining & Knowledge Management Process ( IJDKP )
 
NISO/NFAIS Joint Virtual Conference: Connecting the Library to the Wider Wor...
NISO/NFAIS Joint Virtual Conference:  Connecting the Library to the Wider Wor...NISO/NFAIS Joint Virtual Conference:  Connecting the Library to the Wider Wor...
NISO/NFAIS Joint Virtual Conference: Connecting the Library to the Wider Wor...
 

Similar to VIVO 2013 Topic Modeling Entity Extraction

Sci Know Mine 2013: What can we learn from topic modeling on 350M academic do...
Sci Know Mine 2013: What can we learn from topic modeling on 350M academic do...Sci Know Mine 2013: What can we learn from topic modeling on 350M academic do...
Sci Know Mine 2013: What can we learn from topic modeling on 350M academic do...
William Gunn
 
Research in Intelligent Systems and Data Science at the Knowledge Media Insti...
Research in Intelligent Systems and Data Science at the Knowledge Media Insti...Research in Intelligent Systems and Data Science at the Knowledge Media Insti...
Research in Intelligent Systems and Data Science at the Knowledge Media Insti...
Enrico Motta
 
Linking Heterogeneous Scholarly Data Sources in an Interoperable Setting: the...
Linking Heterogeneous Scholarly Data Sources in an Interoperable Setting: the...Linking Heterogeneous Scholarly Data Sources in an Interoperable Setting: the...
Linking Heterogeneous Scholarly Data Sources in an Interoperable Setting: the...
Platforma Otwartej Nauki
 
THOR Workshop - Introduction
THOR Workshop - IntroductionTHOR Workshop - Introduction
THOR Workshop - Introduction
Maaike Duine
 
Why would a publisher care about open data?
Why would a publisher care about open data?Why would a publisher care about open data?
Why would a publisher care about open data?
Anita de Waard
 
Data-X-Sparse-v2
Data-X-Sparse-v2Data-X-Sparse-v2
Data-X-Sparse-v2
Ikhlaq Sidhu
 
Managing 'Big Data' in the social sciences: the contribution of an analytico-...
Managing 'Big Data' in the social sciences: the contribution of an analytico-...Managing 'Big Data' in the social sciences: the contribution of an analytico-...
Managing 'Big Data' in the social sciences: the contribution of an analytico-...
CILIP MDG
 
A Reuse-based Lightweight Method for Developing Linked Data Ontologies and Vo...
A Reuse-based Lightweight Method for Developing Linked Data Ontologies and Vo...A Reuse-based Lightweight Method for Developing Linked Data Ontologies and Vo...
A Reuse-based Lightweight Method for Developing Linked Data Ontologies and Vo...María Poveda Villalón
 
Lightning Talk Session - Connecting Altmetric (K. Capretta)
Lightning Talk Session - Connecting Altmetric (K. Capretta)Lightning Talk Session - Connecting Altmetric (K. Capretta)
Lightning Talk Session - Connecting Altmetric (K. Capretta)
ORCID, Inc
 
SemTecBiz 2012: Corporate Semantic Web
SemTecBiz 2012: Corporate Semantic WebSemTecBiz 2012: Corporate Semantic Web
SemTecBiz 2012: Corporate Semantic Web
Adrian Paschke
 
A Clean Slate?
A Clean Slate?A Clean Slate?
A Clean Slate?
Herbert Van de Sompel
 
Data-X-v3.1
Data-X-v3.1Data-X-v3.1
Data-X-v3.1
Ikhlaq Sidhu
 
Information entanglement
Information entanglementInformation entanglement
Information entanglement
Willard Van De Bogart
 
Research Objects: more than the sum of the parts
Research Objects: more than the sum of the partsResearch Objects: more than the sum of the parts
Research Objects: more than the sum of the parts
Carole Goble
 
PIDs, Data and Software: How Libraries Can Support Researchers in an Evolving...
PIDs, Data and Software: How Libraries Can Support Researchers in an Evolving...PIDs, Data and Software: How Libraries Can Support Researchers in an Evolving...
PIDs, Data and Software: How Libraries Can Support Researchers in an Evolving...
Sarah Anna Stewart
 
Intro to Data Science for Non-Data Scientists
Intro to Data Science for Non-Data ScientistsIntro to Data Science for Non-Data Scientists
Intro to Data Science for Non-Data Scientists
Sri Ambati
 
Linked Open Data_mlanet13
Linked Open Data_mlanet13Linked Open Data_mlanet13
Linked Open Data_mlanet13
Kristi Holmes
 
Session 0.0 poster minutes madness
Session 0.0   poster minutes madnessSession 0.0   poster minutes madness
Session 0.0 poster minutes madness
semanticsconference
 
Towards the Intelligent Internet of Everything
Towards the Intelligent Internet of EverythingTowards the Intelligent Internet of Everything
Towards the Intelligent Internet of Everything
RECAP Project
 
The Future of Research Communications and e-Scholarship: Are we there yet?
The Future of Research Communications and e-Scholarship: Are we there yet?The Future of Research Communications and e-Scholarship: Are we there yet?
The Future of Research Communications and e-Scholarship: Are we there yet?
National Information Standards Organization (NISO)
 

Similar to VIVO 2013 Topic Modeling Entity Extraction (20)

Sci Know Mine 2013: What can we learn from topic modeling on 350M academic do...
Sci Know Mine 2013: What can we learn from topic modeling on 350M academic do...Sci Know Mine 2013: What can we learn from topic modeling on 350M academic do...
Sci Know Mine 2013: What can we learn from topic modeling on 350M academic do...
 
Research in Intelligent Systems and Data Science at the Knowledge Media Insti...
Research in Intelligent Systems and Data Science at the Knowledge Media Insti...Research in Intelligent Systems and Data Science at the Knowledge Media Insti...
Research in Intelligent Systems and Data Science at the Knowledge Media Insti...
 
Linking Heterogeneous Scholarly Data Sources in an Interoperable Setting: the...
Linking Heterogeneous Scholarly Data Sources in an Interoperable Setting: the...Linking Heterogeneous Scholarly Data Sources in an Interoperable Setting: the...
Linking Heterogeneous Scholarly Data Sources in an Interoperable Setting: the...
 
THOR Workshop - Introduction
THOR Workshop - IntroductionTHOR Workshop - Introduction
THOR Workshop - Introduction
 
Why would a publisher care about open data?
Why would a publisher care about open data?Why would a publisher care about open data?
Why would a publisher care about open data?
 
Data-X-Sparse-v2
Data-X-Sparse-v2Data-X-Sparse-v2
Data-X-Sparse-v2
 
Managing 'Big Data' in the social sciences: the contribution of an analytico-...
Managing 'Big Data' in the social sciences: the contribution of an analytico-...Managing 'Big Data' in the social sciences: the contribution of an analytico-...
Managing 'Big Data' in the social sciences: the contribution of an analytico-...
 
A Reuse-based Lightweight Method for Developing Linked Data Ontologies and Vo...
A Reuse-based Lightweight Method for Developing Linked Data Ontologies and Vo...A Reuse-based Lightweight Method for Developing Linked Data Ontologies and Vo...
A Reuse-based Lightweight Method for Developing Linked Data Ontologies and Vo...
 
Lightning Talk Session - Connecting Altmetric (K. Capretta)
Lightning Talk Session - Connecting Altmetric (K. Capretta)Lightning Talk Session - Connecting Altmetric (K. Capretta)
Lightning Talk Session - Connecting Altmetric (K. Capretta)
 
SemTecBiz 2012: Corporate Semantic Web
SemTecBiz 2012: Corporate Semantic WebSemTecBiz 2012: Corporate Semantic Web
SemTecBiz 2012: Corporate Semantic Web
 
A Clean Slate?
A Clean Slate?A Clean Slate?
A Clean Slate?
 
Data-X-v3.1
Data-X-v3.1Data-X-v3.1
Data-X-v3.1
 
Information entanglement
Information entanglementInformation entanglement
Information entanglement
 
Research Objects: more than the sum of the parts
Research Objects: more than the sum of the partsResearch Objects: more than the sum of the parts
Research Objects: more than the sum of the parts
 
PIDs, Data and Software: How Libraries Can Support Researchers in an Evolving...
PIDs, Data and Software: How Libraries Can Support Researchers in an Evolving...PIDs, Data and Software: How Libraries Can Support Researchers in an Evolving...
PIDs, Data and Software: How Libraries Can Support Researchers in an Evolving...
 
Intro to Data Science for Non-Data Scientists
Intro to Data Science for Non-Data ScientistsIntro to Data Science for Non-Data Scientists
Intro to Data Science for Non-Data Scientists
 
Linked Open Data_mlanet13
Linked Open Data_mlanet13Linked Open Data_mlanet13
Linked Open Data_mlanet13
 
Session 0.0 poster minutes madness
Session 0.0   poster minutes madnessSession 0.0   poster minutes madness
Session 0.0 poster minutes madness
 
Towards the Intelligent Internet of Everything
Towards the Intelligent Internet of EverythingTowards the Intelligent Internet of Everything
Towards the Intelligent Internet of Everything
 
The Future of Research Communications and e-Scholarship: Are we there yet?
The Future of Research Communications and e-Scholarship: Are we there yet?The Future of Research Communications and e-Scholarship: Are we there yet?
The Future of Research Communications and e-Scholarship: Are we there yet?
 

More from William Gunn

AAAS 2014: How the Web Changes Collaboration
AAAS 2014: How the Web Changes CollaborationAAAS 2014: How the Web Changes Collaboration
AAAS 2014: How the Web Changes Collaboration
William Gunn
 
LISA VII: The Scientific and Technical Foundation for Altmetrics in the Unite...
LISA VII: The Scientific and Technical Foundation for Altmetrics in the Unite...LISA VII: The Scientific and Technical Foundation for Altmetrics in the Unite...
LISA VII: The Scientific and Technical Foundation for Altmetrics in the Unite...William Gunn
 
The Scientific and Technical Foundation for Altmetrics in the United States
The Scientific and Technical Foundation for Altmetrics in the United StatesThe Scientific and Technical Foundation for Altmetrics in the United States
The Scientific and Technical Foundation for Altmetrics in the United StatesWilliam Gunn
 
AGU2012: Creating a Collaborative Network for Scientists
AGU2012: Creating a Collaborative Network for ScientistsAGU2012: Creating a Collaborative Network for Scientists
AGU2012: Creating a Collaborative Network for ScientistsWilliam Gunn
 
Academia to Entrepreneur: Why and How to Leave Academia Behind
Academia to Entrepreneur: Why and How to Leave Academia BehindAcademia to Entrepreneur: Why and How to Leave Academia Behind
Academia to Entrepreneur: Why and How to Leave Academia Behind
William Gunn
 
Social metrics for Research: Quantity and Quality
Social metrics for Research: Quantity and QualitySocial metrics for Research: Quantity and Quality
Social metrics for Research: Quantity and QualityWilliam Gunn
 
ASIST 2013 Panel: Altmetrics at Mendeley
ASIST 2013 Panel: Altmetrics at MendeleyASIST 2013 Panel: Altmetrics at Mendeley
ASIST 2013 Panel: Altmetrics at MendeleyWilliam Gunn
 
Code4lib 2012: Building Research Applications with Mendeley
Code4lib 2012: Building Research Applications with MendeleyCode4lib 2012: Building Research Applications with Mendeley
Code4lib 2012: Building Research Applications with Mendeley
William Gunn
 
Beyond Academia: Communicating your Work in Academia and Beyond
Beyond Academia: Communicating your Work in Academia and Beyond Beyond Academia: Communicating your Work in Academia and Beyond
Beyond Academia: Communicating your Work in Academia and Beyond
William Gunn
 
Charleston 2013: The Social Side of Research
Charleston 2013: The Social Side of ResearchCharleston 2013: The Social Side of Research
Charleston 2013: The Social Side of ResearchWilliam Gunn
 
Science Online 2013: Data Visualization Using R
Science Online 2013: Data Visualization Using RScience Online 2013: Data Visualization Using R
Science Online 2013: Data Visualization Using R
William Gunn
 
ESIP FED Spring 2012: Evolving Networks of Expertise
ESIP FED Spring 2012: Evolving Networks of ExpertiseESIP FED Spring 2012: Evolving Networks of Expertise
ESIP FED Spring 2012: Evolving Networks of ExpertiseWilliam Gunn
 
Charleston 2012: Altmetrics: Analyzing the Value in Scholarly Content
Charleston 2012: Altmetrics: Analyzing the Value in Scholarly ContentCharleston 2012: Altmetrics: Analyzing the Value in Scholarly Content
Charleston 2012: Altmetrics: Analyzing the Value in Scholarly ContentWilliam Gunn
 
VIVO 2010 2010 Paper
VIVO 2010 2010 PaperVIVO 2010 2010 Paper
VIVO 2010 2010 Paper
William Gunn
 
Mendeley Open Repositories 2011 Paper
Mendeley Open Repositories 2011 PaperMendeley Open Repositories 2011 Paper
Mendeley Open Repositories 2011 PaperWilliam Gunn
 
Beyond the PDF 2011 Paper
Beyond the PDF 2011 PaperBeyond the PDF 2011 Paper
Beyond the PDF 2011 PaperWilliam Gunn
 
Connecting Researchers with Information - and Unlocking It!
Connecting Researchers with Information - and Unlocking It!Connecting Researchers with Information - and Unlocking It!
Connecting Researchers with Information - and Unlocking It!William Gunn
 
Sci Tech Forum LA 2013: New Directions in Scholarly Communication
Sci Tech Forum LA 2013: New Directions in Scholarly CommunicationSci Tech Forum LA 2013: New Directions in Scholarly Communication
Sci Tech Forum LA 2013: New Directions in Scholarly CommunicationWilliam Gunn
 
Open Science Summit 2011: It's Time We Changed How Science is Done
Open Science Summit 2011: It's Time We Changed How Science is DoneOpen Science Summit 2011: It's Time We Changed How Science is Done
Open Science Summit 2011: It's Time We Changed How Science is Done
William Gunn
 
VIVO 2011 Paper
VIVO 2011 PaperVIVO 2011 Paper
VIVO 2011 Paper
William Gunn
 

More from William Gunn (20)

AAAS 2014: How the Web Changes Collaboration
AAAS 2014: How the Web Changes CollaborationAAAS 2014: How the Web Changes Collaboration
AAAS 2014: How the Web Changes Collaboration
 
LISA VII: The Scientific and Technical Foundation for Altmetrics in the Unite...
LISA VII: The Scientific and Technical Foundation for Altmetrics in the Unite...LISA VII: The Scientific and Technical Foundation for Altmetrics in the Unite...
LISA VII: The Scientific and Technical Foundation for Altmetrics in the Unite...
 
The Scientific and Technical Foundation for Altmetrics in the United States
The Scientific and Technical Foundation for Altmetrics in the United StatesThe Scientific and Technical Foundation for Altmetrics in the United States
The Scientific and Technical Foundation for Altmetrics in the United States
 
AGU2012: Creating a Collaborative Network for Scientists
AGU2012: Creating a Collaborative Network for ScientistsAGU2012: Creating a Collaborative Network for Scientists
AGU2012: Creating a Collaborative Network for Scientists
 
Academia to Entrepreneur: Why and How to Leave Academia Behind
Academia to Entrepreneur: Why and How to Leave Academia BehindAcademia to Entrepreneur: Why and How to Leave Academia Behind
Academia to Entrepreneur: Why and How to Leave Academia Behind
 
Social metrics for Research: Quantity and Quality
Social metrics for Research: Quantity and QualitySocial metrics for Research: Quantity and Quality
Social metrics for Research: Quantity and Quality
 
ASIST 2013 Panel: Altmetrics at Mendeley
ASIST 2013 Panel: Altmetrics at MendeleyASIST 2013 Panel: Altmetrics at Mendeley
ASIST 2013 Panel: Altmetrics at Mendeley
 
Code4lib 2012: Building Research Applications with Mendeley
Code4lib 2012: Building Research Applications with MendeleyCode4lib 2012: Building Research Applications with Mendeley
Code4lib 2012: Building Research Applications with Mendeley
 



VIVO 2013 Topic Modeling Entity Extraction

  • 1. What can we learn from topic modeling on 350M documents? William Gunn, Head of Academic Outreach, Mendeley. @mrgunn – https://orcid.org/0000-0002-3555-2054
  • 2. Who am I? PhD Biomedical Science. I've been active in online science communities since 1995. Established the community program at Mendeley – 1700 advisors from 650 schools in 60 countries. Lead the outreach to librarian, academic research, and tech communities.
  • 3. Based in London, Mendeley is researchers, graduates and software developers from...
  • 4. Two new approaches: Embed a tool within the researcher workflow to capture data. Capture new kinds of data – usage of research objects, not just citations of papers.
  • 5. Mendeley extracts research data… and aggregates data in the cloud. Collecting rich signals from domain experts.
  • 7. TEAM Project: academic knowledge management solutions
      • Algorithms to determine the content similarity of academic papers
      • Performing text disambiguation and entity recognition to differentiate between and relate similar in-text entities and authors of research papers
      • Developing semantic technologies and semantic web languages with the focus of metadata integration/validation
      • Investigate profiling and user analysis technologies, e.g. based on search logs and document interaction
      • We will also improve folksonomies and through that, ontologies of text
      • Finally, tagging behaviour will be analysed to improve tag recommendations and strategies
      • http://team-project.tugraz.at/blog/
  • 8. Semantics vs. Syntax
      • Language expresses semantics via syntax
      • Syntax is all a computer sees in a research article
      • How do we get to semantics?
      • Topic Modeling!
  • 9. Distribution of Topics (bar chart, scale 0% to 35%; categories: Bio, Phys, Engineer, Comp Sci, Psych & Edu, Business, Law, Other)
  • 10. Subcategories of Comp. Sci. (bar chart, scale 0% to 20%; categories: AI, HCI, Info Sci, Software Eng, Networks)
  • 12. Generated topics – Comp. Sci.
  • 14–15. Categorization As A Process: Thing, Process, Reaction, Catalysis, Enzymatic
  • 18. Code Project. Use case = mining research papers for facts to add to LOD repositories and light-weight ontologies.
      • Crowd-sourcing enabled semantic enrichment & integration techniques for integrating facts contained in unstructured information into the LOD cloud
      • Federated, provenance-enabled querying methods for fact discovery in LOD repositories
      • Web-based visual analysis interfaces to support human based analysis, integration and organisation of facts
      • Socio-economic factors – roles, revenue-models and value chains – realisable in the envisioned ecosystem
      • http://code-research.eu/
  • 22. Metrics as a discovery tool
  • 24. Building a reproducibility dataset
      • Mendeley and Science Exchange have started the Reproducibility Initiative
      • working with Figshare & PLOS to host data & replication reports
      • building open datasets backing high-impact work
      • extending the “executable paper” concept to biomedical research
  • 25. Make it porous & part of the web. All these examples show that the main motivation for people to get data (pictures, bookmarks, etc.) off their computers and onto the web is that it helps them find more of the same. Communities must be open if they are to thrive.
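
Slide 8 names topic modeling as the way to get from syntax to semantics, but the deck does not say which algorithm or implementation was used. As a hedged illustration only, the sketch below assumes latent Dirichlet allocation (LDA) fitted by collapsed Gibbs sampling on a toy corpus. The function name `lda_gibbs`, the hyperparameter values, and the example documents are all hypothetical and are not taken from the talk.

```python
import random
from collections import Counter


def lda_gibbs(docs, num_topics=2, alpha=0.1, beta=0.01, iterations=200, seed=0):
    """Toy collapsed Gibbs sampler for LDA (assumed algorithm, not the talk's).

    docs is a list of token lists. Returns (theta, topic_word), where theta[d]
    is the estimated topic mixture of document d and topic_word[k] counts the
    words currently assigned to topic k.
    """
    rng = random.Random(seed)
    vocab_size = len({w for doc in docs for w in doc})

    doc_topic = [[0] * num_topics for _ in docs]          # tokens per (doc, topic)
    topic_word = [Counter() for _ in range(num_topics)]   # tokens per (topic, word)
    topic_total = [0] * num_topics                        # tokens per topic
    assignments = []                                      # topic of every token

    # Start from a random topic for every token.
    for d, doc in enumerate(docs):
        zs = []
        for w in doc:
            z = rng.randrange(num_topics)
            zs.append(z)
            doc_topic[d][z] += 1
            topic_word[z][w] += 1
            topic_total[z] += 1
        assignments.append(zs)

    for _ in range(iterations):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                # Take the token out of the counts before resampling it.
                z = assignments[d][i]
                doc_topic[d][z] -= 1
                topic_word[z][w] -= 1
                topic_total[z] -= 1

                # Unnormalised conditional probability of each topic.
                weights = [
                    (doc_topic[d][k] + alpha)
                    * (topic_word[k][w] + beta)
                    / (topic_total[k] + beta * vocab_size)
                    for k in range(num_topics)
                ]
                z = rng.choices(range(num_topics), weights=weights)[0]

                assignments[d][i] = z
                doc_topic[d][z] += 1
                topic_word[z][w] += 1
                topic_total[z] += 1

    # Smoothed per-document topic proportions.
    theta = [
        [(c + alpha) / (len(doc) + alpha * num_topics) for c in counts]
        for doc, counts in zip(docs, doc_topic)
    ]
    return theta, topic_word


# Hypothetical toy corpus: the model only ever sees these tokens (syntax).
docs = [
    "enzyme catalysis reaction enzyme reaction".split(),
    "reaction catalysis enzyme process".split(),
    "network software algorithm network".split(),
    "software algorithm network engineering".split(),
]
theta, topic_word = lda_gibbs(docs)
for d, mixture in enumerate(theta):
    print(d, [round(p, 2) for p in mixture])
```

Each row printed is one document's estimated mixture over the two assumed topics. A corpus on the scale described in the talk would need a distributed or online implementation rather than a single-process sampler like this one.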