Content Mining:
Technology and Policy Developments
@jenny_molloy EuropePMC AGM
What is content?
What is mining?
1982
“Automatically generating logical representations of
text passages... by means of an analysis of the
coherence structure of the passages.”
Jerry R. Hobbs, Donald E. Walker, and Robert A. Amsler. 1982. Natural language access to structured text. In Proceedings of the 9th
conference on Computational linguistics - Volume 1 (COLING '82), Ján Horecký (Ed.), Vol. 1. Academia Praha, Czechoslovakia, 127-132.
DOI=10.3115/991813.991833 http://dx.doi.org/10.3115/991813.991833
2008
“The use of automated methods for exploiting
the enormous amount of knowledge available in
the biomedical literature.”
Cohen, K. Bretonnel; Hunter, Lawrence (2008). "Getting Started in Text Mining". PLoS Computational
Biology 4 (1): e20. doi:10.1371/journal.pcbi.0040020. PMC 2217579. PMID 18225946.
Mining Examples
Building bacterial supertrees
Mining chemical reactions
Better genome annotation
Only ~4% of phylogenetic analyses
make underlying data available.
Supertrees
Content Mining enables AUTOMATED
extraction from daily literature and
conversion to NeXML:
- Machine-readable
- Open
- Reusable
RAW data would be optimal!
PLUTo: Ross Mounce & Peter Murray-Rust
Chemistry
AMI reads and recognises chemical
structures.
It can even create reaction animations.
Natural language processing
can be used to analyse
chemical methods. These are
FACTS but the paper itself may
be copyrighted.
Annotation
Many applications:
- Find primers
- Enhance positive controls
- Find novel sequence information
- More detailed and accurate annotation
Potential to improve
quality and efficiency
of genomic research.
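A minimal sketch of the "find primers" idea, assuming the article is already available as plain text. The function name, the 15-base minimum and the sample sentence are illustrative assumptions, not the method of any tool cited in this talk; real pipelines also need to handle spaced, lowercase or line-wrapped sequences.

```python
import re


def find_candidate_sequences(text, min_length=15):
    """Return contiguous uppercase DNA-like strings (A, C, G, T only).

    Hypothetical helper: min_length is an assumed cutoff to skip short
    words such as "TAG" that happen to use only DNA letters.
    """
    pattern = re.compile(r"\b[ACGT]{%d,}\b" % min_length)
    return [match.group(0) for match in pattern.finditer(text)]


# Made-up sample sentence in the style of a methods section.
sample = (
    "Amplification used the forward primer 5'-AGAGTTTGATCCTGGCTCAG-3' "
    "and the reverse primer 5'-GGTTACCTTGTTACGACTT-3'. "
    "DATA and TAG are not sequences."
)

for sequence in find_candidate_sequences(sample):
    print(len(sequence), sequence)
```

Candidates found this way would still need checking against a reference genome before being used for annotation.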
Legal Considerations
Copyright
Database rights
Contract law
2011
2014
From 2014
UK Law
Workshops, hackdays, presentations, collaborations,
discussions with librarians and publishers.
Putting new rights into action.
In Europe
2013
Shortly after
2013
2015
Research commissioned through H2020... any EU Directive >5 years away.
Ireland already considering following UK - plus other member states?
EuropePMC
Already provide
content mining facilities
and are willing to
experiment.
Please continue to support these
transformative technologies
and help researchers take
advantage of new legal rights.
Thank you very much
for your attention!
Any questions?
Peter Murray-Rust
Ross Mounce
Richard Smith-Unna
Steph Unna
Jenny Molloy
Mark MacGillivray
With thanks to:
Charles Oppenheim
Michelle Brook
Follow
@TheContentMine
More info on
contentmine.org
Find the code on
github.com/ContentMine
Funded by:
All images are licensed under CC-BY unless otherwise stated
What is Content?
Phylogenetic tree from Figure 1 in Chen Z, Schiffman M, Herrero R, DeSalle R, Anastos K, et al. (2011) Evolution and Taxonomic
Classification of Human Papillomavirus 16 (HPV16)-Related Variant Genomes: HPV31, HPV33, HPV35, HPV52, HPV58 and HPV67. PLoS ONE 6(5):
e20183. doi:10.1371/journal.pone.0020183
Graph from He F, Fromion V, Westerhoff HV. (Im)Perfect robustness and adaptation of metabolic networks subject to metabolic and gene-expression
regulation: marrying control engineering with metabolic control analysis. BMC Syst Biol. 2013;7:131. doi:10.1186/1752-0509-7-131. PubMed PMID:
24261908; PubMed Central PMCID: PMC4222491.
Table 1 from Young GR, Mavrommatis B, Kassiotis G. Microarray analysis reveals global modulation of endogenous retroelement transcription by
microbes. Retrovirology. 2014;11:59. doi:10.1186/1742-4690-11-59. PubMed PMID: 25063042; PubMed Central PMCID: PMC4222864.
Text from Laidlaw CT, Condon JM, Belk MC. Viability Costs of Reproduction and Behavioral Compensation in Western Mosquitofish (Gambusia affinis).
PLoS One. 2014;9(11):e110524. doi:10.1371/journal.pone.0110524. PubMed PMID: 25365426; PubMed Central PMCID: PMC4217728.
Cell microscopy image from Pettinato G, Vanden Berg-Foels WS, Zhang N, Wen X. ROCK Inhibitor Is Not Required for Embryoid Body Formation from
Singularized Human Embryonic Stem Cells. PLoS One. 2014;9(11):e100742. doi:10.1371/journal.pone.0100742. PubMed PMID: 25365581; PubMed
Central PMCID: PMC4217711.
Supertrees:
Lang JM, Darling AE, Eisen JA. Phylogeny of bacterial and archaeal genomes using conserved genes: supertrees and supermatrices. PLoS One.
2013;8(4):e62510. doi:10.1371/journal.pone.0062510. PubMed PMID: 23638103; PubMed Central PMCID: PMC3636077.
McDowell A, Nagy I, Magyari M, Barnard E, Patrick S. The opportunistic pathogen Propionibacterium acnes: insights into typing, human disease, clonal
diversification and CAMP factor evolution. PLoS One. 2013;8(9):e70897. doi:10.1371/journal.pone.0070897. PubMed PMID: 24058439; PubMed Central
PMCID: PMC3772855.
Chemistry:
Diagram from Klejnstrup ML, Frandsen RJ, Holm DK, Nielsen MT, Mortensen UH, Larsen TO, Nielsen JB. Genetics of Polyketide Metabolism in
Aspergillus nidulans. Metabolites. 2012;2(1):100-133. doi:10.3390/metabo2010100. PubMed PMID: 24957370; PubMed Central PMCID: PMC3901194.
Methods text from Greshock, T. J., Grubbs, A. W., Jiao, P., Wicklow, D. T., Gloer, J. B., & Williams, R. M. (2008). Isolation, Structure Elucidation, and
Biomimetic Total Synthesis of Versicolamide B, and the Isolation of Antipodal (−) Stephacidin A and (+) Notoamide B from Aspergillus versicolor NRRL
35600. Angewandte Chemie International Edition, 47(19), 3573-3577.
Annotation:
Stubben, C. J., & Challacombe, J. F. (2014). Mining locus tags in PubMed Central to improve microbial gene annotation. BMC bioinformatics, 15(1), 43.
Figure from Haeussler, M., Gerner, M., & Bergman, C. M. (2011). Annotating genes and genomes with DNA sequences extracted from biomedical
articles. Bioinformatics, 27(7), 980-986.
