MRC Cognition and Brain
Sciences Unit, Cambridge,
UK, 2018-11-20
Open Scientific Knowledge
Peter Murray-Rust
TheContentMine and
Dept of Chemistry , Univ of Cambridge
A new knowledgebase beyond journals
Images from ContentMine CC BY and Wikimedia CC BY-SA
pm286@cam.ac.uk
peter@contentmine.org
Tux and GNU: Open and Free Heroes
This is a story of liberation. You can be part of it,
and it will make life easier for you and citizens everywhere.
Tux (Linux), GNU (FSF): might be controversial
Our story is in 3 ACTS
• We’ll show you WHY we need OPEN
• Then a DEMO of SOFTWARE: getpapers and AMI
• Building COMMUNITY. We need YOU/US
Structure of the presentation
Rapidly reading the literature and supporting systematic reviews
Sustainable Open?
We need volunteers and a sustainable organization
Cannot be bought commercially, 501(c)(3), OpenLock
SSI, NumFOCUS
Rik Smith-Unna
PlantSciences Cambridge
ContentMine
CoKo
WorldBrain
getpapers and quickscrape
(2x digital music industry!)
ContentMine is OpenLocked Non-Profit http://contentmine.org
The Right to Read is the Right to Mine
The problem: publishers control the infrastructure
Sucking money out of the system
And destroying science in the Global South…
Completely unregulated industry
Megapub451*
*In Fahrenheit 451, firemen burned books; in the 21st century, publishers restrict knowledge
cc by-nc-sa license LabHack and Alliance Earth
1 APC = 1900 USD
1 bioreactor = 25 USD
1 Raspberry Pi = 55 USD
1 submission to bioRxiv = free (10 USD hidden)
“A PCR machine in the UK is around £6000 but in Zimbabwe about $33000 - try convincing someone to pay APCs when they have to try and save for that.”
CITIZENS!
Zimbabwe. LabHack team from
Harare Institute of Technology.
Scientific knowledge should be totally free
OKFN
GNU
TUX/Linux
ContentMine
What’s “surveillance capitalism”?
No surveillance capitalism
Innovative reuse of content. No ©
@Senficon (Julia Reda): Text & Data mining in times of #copyright maximalism:
"Elsevier stopped me doing my research"
http://onsnetwork.org/chartgerink/2015/11/16/elsevier-stopped-me-doing-my-research/ … #opencon #TDM
Elsevier stopped me doing my research
Chris Hartgerink
I am a statistician interested in detecting potentially problematic research such as data fabrication,
which results in unreliable findings and can harm policy-making, confound funding decisions, and
hampers research progress.
To this end, I am content mining results reported in the psychology literature. Content mining the
literature is a valuable avenue of investigating research questions with innovative methods. For
example, our research group has written an automated program to mine research papers for errors in
the reported results and found that 1/8 papers (of 30,000) contains at least one result that could
directly influence the substantive conclusion [1].
In new research, I am trying to extract test results, figures, tables, and other information reported in
papers throughout the majority of the psychology literature. As such, I need the research papers
published in psychology that I can mine for these data. To this end, I started ‘bulk’ downloading research
papers from, for instance, Sciencedirect. I was doing this for scholarly purposes and took into account
potential server load by limiting the amount of papers I downloaded per minute to 9. I had no intention
to redistribute the downloaded materials, had legal access to them because my university pays a
subscription, and I only wanted to extract facts from these papers.
Full disclosure, I downloaded approximately 30GB of data from Sciencedirect in approximately 10 days.
This boils down to a server load of 0.0021GB/[min], 0.125GB/h, 3GB/day.
Approximately two weeks after I started downloading psychology research papers, Elsevier notified my
university that this was a violation of the access contract, that this could be considered stealing of
content, and that they wanted it to stop. My librarian explicitly instructed me to stop downloading
(which I did immediately), otherwise Elsevier would cut all access to Sciencedirect for my university.
I am now not able to mine a substantial part of the literature, and because of this Elsevier is directly
hampering me in my research.
[1] Nuijten, M. B., Hartgerink, C. H. J., van Assen, M. A. L. M., Epskamp, S., & Wicherts, J. M. (2015). The
prevalence of statistical reporting errors in psychology (1985–2013). Behavior Research Methods, 1–22.
doi: 10.3758/s13428-015-0664-2
Chris Hartgerink’s blog post
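The download rates quoted in the blog post follow directly from the two figures it gives (about 30 GB in about 10 days). A minimal arithmetic check, using only those two numbers; the variable names are illustrative:

```python
# Figures taken from the blog post: roughly 30 GB downloaded in roughly 10 days.
total_gb = 30
days = 10

gb_per_day = total_gb / days        # 3.0 GB/day
gb_per_hour = gb_per_day / 24       # 0.125 GB/h
gb_per_minute = gb_per_hour / 60    # about 0.0021 GB/min

print(f"{gb_per_day} GB/day, {gb_per_hour} GB/h, {gb_per_minute:.4f} GB/min")
```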
It costs 10 USD to mount an article on (bio)arXiv…
So why 2000 USD for a megapub451 article?
I can charge whatever I like!! No regulator!
academics pay – it’s not their money – and they get glory
APCs and Journals MUST GO!
arXiv / bioRxiv / chemRxiv: 10$
Commercial publisher: 1800$ (Review, Production, Hosting, Corporate Branding, Marketing, Philanthropy, Shareholder Profit)
Scientific knowledge saves lives
But closed is costing us dearly …
…so closed access means people die…
…The software will demonstrate how we can search in future …
I’m from Congo, where Ebola comes from. The Liberia outbreak was predicted 30 years ago in a paywalled paper
Semantic Fulltext
• EuropePMC coherent OpenAccess
• getpapers: query, download (through API).
• AMI filters, checks[1], transforms facts in papers.
• sequences, species, genera, genes, dictionaries
[0] All operations shown run in total of <3 minutes.
[1] Dictionaries and lookup.
[2] Usable from home by anyone
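To give a feel for the "query, download (through API)" step, here is a minimal sketch that builds a search URL for the public Europe PMC REST service. It is not getpapers' own code: the helper name `build_epmc_query` and its parameters are hypothetical, and the block only constructs the URL rather than fetching anything.

```python
from urllib.parse import urlencode

# Public Europe PMC REST search endpoint.
EPMC_SEARCH = "https://www.ebi.ac.uk/europepmc/webservices/rest/search"


def build_epmc_query(term, limit=100, open_access_only=True):
    """Return a Europe PMC search URL for `term`.

    Hypothetical helper for illustration; getpapers wraps this kind of
    request (plus paging and full-text download) behind its command line.
    """
    query = term
    if open_access_only:
        # Restrict results to Open Access articles.
        query = f"({term}) AND OPEN_ACCESS:y"
    params = {
        "query": query,
        "format": "json",
        "pageSize": min(limit, 1000),  # the service caps page size at 1000
        "resultType": "lite",
    }
    return f"{EPMC_SEARCH}?{urlencode(params)}"


url = build_epmc_query("neuroimaging", limit=100)
print(url)
```

The resulting URL can be opened in a browser or fetched with any HTTP client to see the JSON hit list.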
Zika endemic areas
Wikimedia CC-BY-SA
Open Components
• All the literature – free FULLTEXT everywhere
• Universal dictionary
• Open software – modular
• FRICTIONLESS – no gatekeepers
• CC BY, CC0, BSD/MIT/Apache/GNU, PREPRINTS!!
Crossref
EuropePMC
Wikidata
getpapers
AMI
We can change all that!
We can do everything ourselves! Look … demo
https://www.wikidata.org/wiki/Wikidata:WikiFactMine
ContentMine thanks the WikimediaFoundation for support
15 million articles, over 200 dictionaries
All the world’s 5 million FAIR Open Scientific articles (* 0.1 MB = 0.5 TB), indexed by ContentMine. Disk: 30 GBP. Raspberry Pi 3: 50 GBP.
CC BY, PeterMR
Disk
Raspberry PI
Power
*** getpapers runs FAST! Downloads 50 papers / sec => 3000 / min => 180,000 / hour
*** AMI-search:
Dictionaries based on anything in Wikidata (50
million items!) or your own.
We show country, brainparts, funders, disease…
looking for feedback, volunteers, examples
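The storage and throughput figures on these slides can be checked with a few lines of arithmetic. This sketch takes the slides' own numbers as given (5 million articles, 0.1 MB each, 50 papers per second); the time to fetch the whole corpus is derived here and is not a measured result.

```python
# Inputs taken from the slides.
ARTICLES = 5_000_000
MB_PER_ARTICLE = 0.1
PAPERS_PER_SEC = 50

# Storage, in decimal units (1 TB = 1,000,000 MB).
total_tb = ARTICLES * MB_PER_ARTICLE / 1_000_000

# Throughput at a sustained 50 papers per second.
per_minute = PAPERS_PER_SEC * 60
per_hour = per_minute * 60

# Derived estimate: hours needed to download every article at that rate.
hours_for_all = ARTICLES / per_hour

print(f"{total_tb:.1f} TB, {per_minute} / min, {per_hour} / hour, {hours_for_all:.1f} h")
```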
OpenNotebookScience
Jean-Claude Bradley presented with BlueObelisk
by Egon Willighagen
DEMO!!
(a) What is “neuroimaging”?
getpapers -q "neuroimaging" -x -k 100 -o neuro;
ami-search-cooccur neuro country disease funders
(b) What does the MRC unit do?
getpapers -q "MRC Cognition and Brain Sciences Unit" -x -k 2000 -o cbsu;
ami-search-cooccur cbsu country brainparts braincognition funders animaltesting
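The idea behind a dictionary co-occurrence search can be sketched in a few lines: each dictionary is a list of terms, and we count the documents in which a term from one dictionary appears together with a term from another. This is not AMI's implementation. The function name, the single-word matching, and the three toy sentences are all assumptions made for illustration.

```python
from collections import Counter
from itertools import product


def cooccurrence(docs, dict_a, dict_b):
    """Count documents where a term from dict_a and a term from dict_b both occur.

    Simplification: terms are single lowercase words matched against
    whitespace-separated tokens; real dictionaries hold multi-word terms.
    """
    counts = Counter()
    for text in docs:
        words = set(text.lower().split())
        hits_a = [term for term in dict_a if term in words]
        hits_b = [term for term in dict_b if term in words]
        # Every (a, b) pair found in this document counts once.
        counts.update(product(hits_a, hits_b))
    return counts


# Toy documents invented for this example.
docs = [
    "Zika cases reported in Brazil",
    "Funding for Zika research in Brazil and Colombia",
    "Ebola outbreak in Liberia",
]
country = ["brazil", "colombia", "liberia"]
disease = ["zika", "ebola"]

table = cooccurrence(docs, country, disease)
for (c, d), n in sorted(table.items()):
    print(c, d, n)
```

Swapping in other dictionaries (funders, brain parts) only changes the two term lists, which is why dictionaries are the natural place for volunteers to contribute.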
ECR communities we work with
• Open MOOC (Jon Tennant)
• Open Knowledge Maps (Peter Kraker)
• Unpaywall (Heather Piwowar)
• WorldBrain (Oli Sauter)
• And ContentMine Fellows
• Alexandra Bannach-Brown (Edinburgh, Bond)
(neuroscience and animal experiments)
• And …
AMI-Bio Proposal to Mozilla
We invite you to submit a full application for AMI-bio: Citizen search and use of the biomedical literature - Request ID number MF-1811-05957.
Please submit your application by 11/30/2018.
Guanyang Zhang
• Biology, Arizona
• “My ContentMine Fellowship project will focus on mining weevil-plant associations from literature records.”
• “Motivation. Comprising ~70,000 described and 220,000 estimated species, weevils (Curculionoidea) are one of the most diverse plant-feeding insect lineages and constitute nearly 5% of all known animals.”
• “Knowledge of host plant associations is critical for pest management, conservation, and comparative biological research. This knowledge is, however, scattered in 300 years of historical literature and difficult to access.”
• Weevil-plant association network graph made with Google Fusion Table. Each blue circle is a weevil tribe and each yellow circle a plant genus. The size of a circle represents the number of associations.
Neo Christopher Chung
• Warsaw, Computational Biology
• Wants to find out geographic and temporal differences in the use of genomic software tools
ContentMine Workshops on Mining
Chris Kittel, CM, at MozFest 2015
Stefan Kasberger, CM
Julia Reda, Pirate MEP, running ContentMine
software to liberate science 2016-04-16
Lars Willighagen
• 15 years old, NL
• Wants: extract data about conifers (relations to chemicals, height etc.)
• Outcome: database with webpage containing conifer properties
• Table Facts Visualiser DEMO
• Card DEMO
• Word Cloud
• “I applied to this fellowship to learn new things and combine the ContentMine with two previous projects I never got to finish, and I got really excited by the idea and the ContentMine at large.”
bioRxiv in
Citizen Health Search (CHS)
A proposal to Wellcome Trust (Open Research in Health call) with ContentMine, Cochrane and UCL-EPPI (CCU)
CHS puts semantic search on the desktop of the searcher. We index all the visible medical literature, normalize, section and index against a bank of user-chosen dictionaries.
CHS takes input from EPMC, bioRxiv and emerging community sources such as Crossref and Unpaywall, and outputs to Zenodo, Wikidata and CM-Science Source.
Citizen Dashboard
Question/s
• “How can I help?”
– Create dictionaries
– Document your voyage
– Spread the word
– Advocate
– Meet at the pub for hacking?
– Code (especially downstream - visualisation)
Anyone seriously interested in automatic extraction of data from tables and plots?
http://www.budapestopenaccessinitiative.org/read
… an unprecedented public good. …
… completely free and unrestricted access to [peer-reviewed literature] by all scientists, scholars, teachers, students, and other curious minds. …
… Removing access barriers to this literature will accelerate research, enrich education, share the learning of the rich with the poor and the poor with the rich, make this literature as useful as it can be, and lay the foundation for uniting humanity in a common intellectual conversation and quest for knowledge.
(Budapest Open Access Initiative, 2002)

More Related Content

What's hot

Automatic Extraction of Knowledge from Biomedical literature
Automatic Extraction of Knowledge from Biomedical literatureAutomatic Extraction of Knowledge from Biomedical literature
Automatic Extraction of Knowledge from Biomedical literaturepetermurrayrust
 
Biovision2017 Accessing the scientific literature
Biovision2017 Accessing the scientific literatureBiovision2017 Accessing the scientific literature
Biovision2017 Accessing the scientific literaturepetermurrayrust
 
Automatic Extraction of Knowledge from Biomedical literature
Automatic Extraction of Knowledge from Biomedical literature Automatic Extraction of Knowledge from Biomedical literature
Automatic Extraction of Knowledge from Biomedical literature TheContentMine
 
Cochrane workshop 2016
Cochrane workshop 2016Cochrane workshop 2016
Cochrane workshop 2016TheContentMine
 
Climate Change and Human Migration
Climate Change and Human MigrationClimate Change and Human Migration
Climate Change and Human Migrationpetermurrayrust
 
High throughput mining of the scholarly literature; talk at NIH
High throughput mining of the scholarly literature; talk at NIHHigh throughput mining of the scholarly literature; talk at NIH
High throughput mining of the scholarly literature; talk at NIHpetermurrayrust
 
Paradise Lost and The Right to Read is the Right to Mine
Paradise Lost and The Right to Read is the Right to MineParadise Lost and The Right to Read is the Right to Mine
Paradise Lost and The Right to Read is the Right to Minepetermurrayrust
 
Content Mining of Science in Cambridge
Content Mining of Science in CambridgeContent Mining of Science in Cambridge
Content Mining of Science in CambridgeTheContentMine
 
Content Mining of Science in Europe
Content Mining of Science in EuropeContent Mining of Science in Europe
Content Mining of Science in Europepetermurrayrust
 
Disrupting the Publisher-Academic Complex
Disrupting the Publisher-Academic ComplexDisrupting the Publisher-Academic Complex
Disrupting the Publisher-Academic Complexpetermurrayrust
 
HKU Data Curation MLIM7350 Class 8
HKU Data Curation MLIM7350 Class 8HKU Data Curation MLIM7350 Class 8
HKU Data Curation MLIM7350 Class 8Scott Edmunds
 
The Future of Research (Science and Technology)
The Future of Research (Science and Technology)The Future of Research (Science and Technology)
The Future of Research (Science and Technology)Duncan Hull
 
Early Career Reseachers in Science. Start Early, Be Open , Be Brave
Early Career Reseachers in Science. Start Early, Be Open , Be BraveEarly Career Reseachers in Science. Start Early, Be Open , Be Brave
Early Career Reseachers in Science. Start Early, Be Open , Be Bravepetermurrayrust
 
Can machines understand the scientific literature?
Can machines understand the scientific literature?Can machines understand the scientific literature?
Can machines understand the scientific literature?petermurrayrust
 
ContentMining in Neuroscience
ContentMining in NeuroscienceContentMining in Neuroscience
ContentMining in NeuroscienceTheContentMine
 
MESUR: Making sense and use of usage data
MESUR: Making sense and use of usage dataMESUR: Making sense and use of usage data
MESUR: Making sense and use of usage dataHerbert Van de Sompel
 
Data Science, Data Curation, and Human-Data Interaction
Data Science, Data Curation, and Human-Data InteractionData Science, Data Curation, and Human-Data Interaction
Data Science, Data Curation, and Human-Data InteractionUniversity of Washington
 
A Global Commons for Scientific Data: Molecules and Wikidata
A Global Commons for Scientific Data: Molecules and WikidataA Global Commons for Scientific Data: Molecules and Wikidata
A Global Commons for Scientific Data: Molecules and Wikidatapetermurrayrust
 
Scott Edmunds, ReCon 2015: Beyond Dead Trees, Publishing Digital Research Obj...
Scott Edmunds, ReCon 2015: Beyond Dead Trees, Publishing Digital Research Obj...Scott Edmunds, ReCon 2015: Beyond Dead Trees, Publishing Digital Research Obj...
Scott Edmunds, ReCon 2015: Beyond Dead Trees, Publishing Digital Research Obj...GigaScience, BGI Hong Kong
 

What's hot (20)

Automatic Extraction of Knowledge from Biomedical literature
Automatic Extraction of Knowledge from Biomedical literatureAutomatic Extraction of Knowledge from Biomedical literature
Automatic Extraction of Knowledge from Biomedical literature
 
Biovision2017 Accessing the scientific literature
Biovision2017 Accessing the scientific literatureBiovision2017 Accessing the scientific literature
Biovision2017 Accessing the scientific literature
 
Automatic Extraction of Knowledge from Biomedical literature
Automatic Extraction of Knowledge from Biomedical literature Automatic Extraction of Knowledge from Biomedical literature
Automatic Extraction of Knowledge from Biomedical literature
 
Cochrane workshop 2016
Cochrane workshop 2016Cochrane workshop 2016
Cochrane workshop 2016
 
Climate Change and Human Migration
Climate Change and Human MigrationClimate Change and Human Migration
Climate Change and Human Migration
 
High throughput mining of the scholarly literature; talk at NIH
High throughput mining of the scholarly literature; talk at NIHHigh throughput mining of the scholarly literature; talk at NIH
High throughput mining of the scholarly literature; talk at NIH
 
Paradise Lost and The Right to Read is the Right to Mine
Paradise Lost and The Right to Read is the Right to MineParadise Lost and The Right to Read is the Right to Mine
Paradise Lost and The Right to Read is the Right to Mine
 
Content Mining of Science in Cambridge
Content Mining of Science in CambridgeContent Mining of Science in Cambridge
Content Mining of Science in Cambridge
 
Content Mining of Science in Europe
Content Mining of Science in EuropeContent Mining of Science in Europe
Content Mining of Science in Europe
 
Disrupting the Publisher-Academic Complex
Disrupting the Publisher-Academic ComplexDisrupting the Publisher-Academic Complex
Disrupting the Publisher-Academic Complex
 
HKU Data Curation MLIM7350 Class 8
HKU Data Curation MLIM7350 Class 8HKU Data Curation MLIM7350 Class 8
HKU Data Curation MLIM7350 Class 8
 
The Future of Research (Science and Technology)
The Future of Research (Science and Technology)The Future of Research (Science and Technology)
The Future of Research (Science and Technology)
 
The Era of Open
The Era of OpenThe Era of Open
The Era of Open
 
Early Career Reseachers in Science. Start Early, Be Open , Be Brave
Early Career Reseachers in Science. Start Early, Be Open , Be BraveEarly Career Reseachers in Science. Start Early, Be Open , Be Brave
Early Career Reseachers in Science. Start Early, Be Open , Be Brave
 
Can machines understand the scientific literature?
Can machines understand the scientific literature?Can machines understand the scientific literature?
Can machines understand the scientific literature?
 
ContentMining in Neuroscience
ContentMining in NeuroscienceContentMining in Neuroscience
ContentMining in Neuroscience
 
MESUR: Making sense and use of usage data
MESUR: Making sense and use of usage dataMESUR: Making sense and use of usage data
MESUR: Making sense and use of usage data
 
Data Science, Data Curation, and Human-Data Interaction
Data Science, Data Curation, and Human-Data InteractionData Science, Data Curation, and Human-Data Interaction
Data Science, Data Curation, and Human-Data Interaction
 
A Global Commons for Scientific Data: Molecules and Wikidata
A Global Commons for Scientific Data: Molecules and WikidataA Global Commons for Scientific Data: Molecules and Wikidata
A Global Commons for Scientific Data: Molecules and Wikidata
 
Scott Edmunds, ReCon 2015: Beyond Dead Trees, Publishing Digital Research Obj...
Scott Edmunds, ReCon 2015: Beyond Dead Trees, Publishing Digital Research Obj...Scott Edmunds, ReCon 2015: Beyond Dead Trees, Publishing Digital Research Obj...
Scott Edmunds, ReCon 2015: Beyond Dead Trees, Publishing Digital Research Obj...
 

Similar to Open Access to Scientific Knowledge

ContentMine: Mining the Scientific Literature
ContentMine: Mining the Scientific LiteratureContentMine: Mining the Scientific Literature
ContentMine: Mining the Scientific Literaturepetermurrayrust
 
Early Career Reseachers and Open Healthcare
Early Career Reseachers and Open HealthcareEarly Career Reseachers and Open Healthcare
Early Career Reseachers and Open Healthcarepetermurrayrust
 
The culture of researchData
The culture of researchDataThe culture of researchData
The culture of researchDatapetermurrayrust
 
Amanuens.is HUmans and machines annotating scholarly literature
Amanuens.is HUmans and machines annotating scholarly literatureAmanuens.is HUmans and machines annotating scholarly literature
Amanuens.is HUmans and machines annotating scholarly literaturepetermurrayrust
 
Amanuens.is HUmans and machines annotating scholarly literature
Amanuens.is HUmans and machines annotating scholarly literature Amanuens.is HUmans and machines annotating scholarly literature
Amanuens.is HUmans and machines annotating scholarly literature TheContentMine
 
Open software and knowledge for MIOSS
Open software and knowledge for MIOSS Open software and knowledge for MIOSS
Open software and knowledge for MIOSS TheContentMine
 
Open software and knowledge for MIOSS
Open software and knowledge for MIOSSOpen software and knowledge for MIOSS
Open software and knowledge for MIOSSpetermurrayrust
 
Automatic Extraction of Science and Medicine from the scholarly literature
Automatic Extraction of Science and  Medicine from the scholarly literatureAutomatic Extraction of Science and  Medicine from the scholarly literature
Automatic Extraction of Science and Medicine from the scholarly literaturepetermurrayrust
 
Automatic Extraction of Science and Medicine from the scholarly literature
Automatic Extraction of Science and Medicine from the scholarly literatureAutomatic Extraction of Science and Medicine from the scholarly literature
Automatic Extraction of Science and Medicine from the scholarly literatureTheContentMine
 
Digital Scholarship: Enlightenment or Devastated Landscape?
Digital Scholarship: Enlightenment or Devastated Landscape? Digital Scholarship: Enlightenment or Devastated Landscape?
Digital Scholarship: Enlightenment or Devastated Landscape? TheContentMine
 
The culture of researchData
The culture of researchData The culture of researchData
The culture of researchData TheContentMine
 
The Culture of Research Data, by Peter Murray-Rust
The Culture of Research Data, by Peter Murray-RustThe Culture of Research Data, by Peter Murray-Rust
The Culture of Research Data, by Peter Murray-RustLEARN Project
 
Open access impact
Open access impactOpen access impact
Open access impactIryna Kuchma
 
Open data and Open Science
Open data and Open ScienceOpen data and Open Science
Open data and Open Sciencepetermurrayrust
 
Digital research: Collections, data, tools and methods
Digital research: Collections, data, tools and methods Digital research: Collections, data, tools and methods
Digital research: Collections, data, tools and methods Stella Wisdom
 
EMBL OA Week: FAIR or unfair? Principled publishing for more Open & Democrati...
EMBL OA Week: FAIR or unfair? Principled publishing for more Open & Democrati...EMBL OA Week: FAIR or unfair? Principled publishing for more Open & Democrati...
EMBL OA Week: FAIR or unfair? Principled publishing for more Open & Democrati...GigaScience, BGI Hong Kong
 
Open Data in a Global Ecosystem
Open Data in a Global EcosystemOpen Data in a Global Ecosystem
Open Data in a Global EcosystemPhilip Bourne
 

Similar to Open Access to Scientific Knowledge (20)

ContentMine: Mining the Scientific Literature
ContentMine: Mining the Scientific LiteratureContentMine: Mining the Scientific Literature
ContentMine: Mining the Scientific Literature
 
Early Career Reseachers and Open Healthcare
Early Career Reseachers and Open HealthcareEarly Career Reseachers and Open Healthcare
Early Career Reseachers and Open Healthcare
 
The culture of researchData
The culture of researchDataThe culture of researchData
The culture of researchData
 
Amanuens.is HUmans and machines annotating scholarly literature
Amanuens.is HUmans and machines annotating scholarly literatureAmanuens.is HUmans and machines annotating scholarly literature
Amanuens.is HUmans and machines annotating scholarly literature
 
Amanuens.is HUmans and machines annotating scholarly literature
Amanuens.is HUmans and machines annotating scholarly literature Amanuens.is HUmans and machines annotating scholarly literature
Amanuens.is HUmans and machines annotating scholarly literature
 
Open software and knowledge for MIOSS
Open software and knowledge for MIOSS Open software and knowledge for MIOSS
Open software and knowledge for MIOSS
 
Open software and knowledge for MIOSS
Open software and knowledge for MIOSSOpen software and knowledge for MIOSS
Open software and knowledge for MIOSS
 
Automatic Extraction of Science and Medicine from the scholarly literature
Automatic Extraction of Science and  Medicine from the scholarly literatureAutomatic Extraction of Science and  Medicine from the scholarly literature
Automatic Extraction of Science and Medicine from the scholarly literature
 
Automatic Extraction of Science and Medicine from the scholarly literature
Automatic Extraction of Science and Medicine from the scholarly literatureAutomatic Extraction of Science and Medicine from the scholarly literature
Automatic Extraction of Science and Medicine from the scholarly literature
 
Digital Scholarship
Digital ScholarshipDigital Scholarship
Digital Scholarship
 
Digital Scholarship: Enlightenment or Devastated Landscape?
Digital Scholarship: Enlightenment or Devastated Landscape? Digital Scholarship: Enlightenment or Devastated Landscape?
Digital Scholarship: Enlightenment or Devastated Landscape?
 
The culture of researchData
The culture of researchData The culture of researchData
The culture of researchData
 
The Culture of Research Data, by Peter Murray-Rust
The Culture of Research Data, by Peter Murray-RustThe Culture of Research Data, by Peter Murray-Rust
The Culture of Research Data, by Peter Murray-Rust
 
Plosslides
PlosslidesPlosslides
Plosslides
 
PLOS slides
PLOS slidesPLOS slides
PLOS slides
 
Open access impact
Open access impactOpen access impact
Open access impact
 
Open data and Open Science
Open data and Open ScienceOpen data and Open Science
Open data and Open Science
 
Digital research: Collections, data, tools and methods
Digital research: Collections, data, tools and methods Digital research: Collections, data, tools and methods
Digital research: Collections, data, tools and methods
 
EMBL OA Week: FAIR or unfair? Principled publishing for more Open & Democrati...
EMBL OA Week: FAIR or unfair? Principled publishing for more Open & Democrati...EMBL OA Week: FAIR or unfair? Principled publishing for more Open & Democrati...
EMBL OA Week: FAIR or unfair? Principled publishing for more Open & Democrati...
 
Open Data in a Global Ecosystem
Open Data in a Global EcosystemOpen Data in a Global Ecosystem
Open Data in a Global Ecosystem
 

More from petermurrayrust

Omdi2021 Ontologies for (Materials) Science in the Digital Age
Omdi2021 Ontologies for (Materials) Science in the Digital AgeOmdi2021 Ontologies for (Materials) Science in the Digital Age
Omdi2021 Ontologies for (Materials) Science in the Digital Agepetermurrayrust
 
Open Science Principles and Practice
Open Science Principles and PracticeOpen Science Principles and Practice
Open Science Principles and Practicepetermurrayrust
 
Open Virus Indian Presentation
Open Virus Indian PresentationOpen Virus Indian Presentation
Open Virus Indian Presentationpetermurrayrust
 
OpenVirus at OpenPublishingFest
OpenVirus at OpenPublishingFestOpenVirus at OpenPublishingFest
OpenVirus at OpenPublishingFestpetermurrayrust
 
Open Virus Indian Presentation
Open Virus Indian PresentationOpen Virus Indian Presentation
Open Virus Indian Presentationpetermurrayrust
 
Automatic mining of data from materials science literature
Automatic mining of data from materials science literatureAutomatic mining of data from materials science literature
Automatic mining of data from materials science literaturepetermurrayrust
 
openVirus - tools for discovering literature on viruses
openVirus - tools for discovering literature on virusesopenVirus - tools for discovering literature on viruses
openVirus - tools for discovering literature on virusespetermurrayrust
 
XML for science; its huge potential; but are pubiishers preventing it?
XML for science; its huge potential; but are pubiishers preventing it?XML for science; its huge potential; but are pubiishers preventing it?
XML for science; its huge potential; but are pubiishers preventing it?petermurrayrust
 
Scientific search for everyone
Scientific search for everyoneScientific search for everyone
Scientific search for everyonepetermurrayrust
 
Openplant2018 Poster; Semantic searching
Openplant2018 Poster; Semantic searchingOpenplant2018 Poster; Semantic searching
Openplant2018 Poster; Semantic searchingpetermurrayrust
 
Extracting science from the archive
Extracting science from the archiveExtracting science from the archive
Extracting science from the archivepetermurrayrust
 
WikiFactMine: Ontology for Everybody and Everything
WikiFactMine: Ontology for Everybody and EverythingWikiFactMine: Ontology for Everybody and Everything
WikiFactMine: Ontology for Everybody and Everythingpetermurrayrust
 
Young people in an Age of Knowledge Neocolonialism
Young people in an Age of Knowledge NeocolonialismYoung people in an Age of Knowledge Neocolonialism
Young people in an Age of Knowledge Neocolonialismpetermurrayrust
 
WikiFactMine for Plant Chemistry
WikiFactMine for Plant ChemistryWikiFactMine for Plant Chemistry
WikiFactMine for Plant Chemistrypetermurrayrust
 
Can machines understand the scientific literature
Can machines understand the scientific literatureCan machines understand the scientific literature
Can machines understand the scientific literaturepetermurrayrust
 
Asking the scientific literature to tell us about metabolism
Asking the scientific literature to tell us about metabolismAsking the scientific literature to tell us about metabolism
Asking the scientific literature to tell us about metabolismpetermurrayrust
 
Asking the scientific literature to tell us about metabolism
Asking the scientific literature to tell us about metabolismAsking the scientific literature to tell us about metabolism
Asking the scientific literature to tell us about metabolismpetermurrayrust
 

More from petermurrayrust (17)

Omdi2021 Ontologies for (Materials) Science in the Digital Age
Omdi2021 Ontologies for (Materials) Science in the Digital AgeOmdi2021 Ontologies for (Materials) Science in the Digital Age
Omdi2021 Ontologies for (Materials) Science in the Digital Age
 
Open Science Principles and Practice
Open Science Principles and PracticeOpen Science Principles and Practice
Open Science Principles and Practice
 
Open Virus Indian Presentation
Open Virus Indian PresentationOpen Virus Indian Presentation
Open Virus Indian Presentation
 
OpenVirus at OpenPublishingFest
OpenVirus at OpenPublishingFestOpenVirus at OpenPublishingFest
OpenVirus at OpenPublishingFest
 
Open Virus Indian Presentation
Open Virus Indian PresentationOpen Virus Indian Presentation
Open Virus Indian Presentation
 
Automatic mining of data from materials science literature
Automatic mining of data from materials science literatureAutomatic mining of data from materials science literature
Automatic mining of data from materials science literature
 
openVirus - tools for discovering literature on viruses
openVirus - tools for discovering literature on virusesopenVirus - tools for discovering literature on viruses
Open Access to Scientific Knowledge

  • 1. MRC Cognition and Brain Sciences Unit, Cambridge, UK, 2018-11-20. Open Scientific Knowledge. Peter Murray-Rust, TheContentMine and Dept of Chemistry, Univ of Cambridge. A new knowledgebase beyond journals. Images from ContentMine CC BY and Wikimedia CC BY-SA. pm286@cam.ac.uk, peter@contentmine.org
  • 2. Tux and GNU: Open and Free Heroes This is a story of liberation You can be part of it And it will make life easier for you and citizens everywhere TUX Linux GNU FSF Might be controversial
  • 3. Structure of the presentation. Our story is in 3 ACTS. We’ll show you WHY we need OPEN. Then a DEMO of SOFTWARE: getpapers and AMI. Building COMMUNITY: we need YOU/US. Rapidly reading the literature and supporting systematic reviews
  • 4. Sustainable Open? We need volunteers And a sustainable organization Cannot be bought commercially, 501(c)3, OpenLock SSI, Numfocus
  • 6. (2x digital music industry!) ContentMine is OpenLocked Non-Profit http://contentmine.org The Right to Read is the Right to Mine
  • 7. The problem: publishers control the infrastructure Sucking money out of the system And destroying science in the Global South… *In Fahrenheit 451 firemen burned books; in C21st publishers restrict knowledge Completely unregulated industry Megapub451*
  • 8. cc by-nc-sa license LabHack and Alliance Earth 1 APC = 1900 USD 1 bioreactor = 25 USD 1 Raspberry PI 55 USD 1 submission to bioRxiv Free (10 USD hidden) “a PCR machine in the UK is around £6000 but in Zimbabwe about $33000 - try convincing someone to pay APCs when they have to try and save for that.” CITIZENS! Zimbabwe. LabHack team from Harare Institute of Technology.
  • 9. Scientific knowledge should be totally free OKFN GNU TUX/Linux ContentMine What’s “surveillance capitalism”? No Surveillance capitalism Innovative reuse Of content. No ©
  • 10. @Senficon (Julia Reda): Text & Data mining in times of #copyright maximalism: "Elsevier stopped me doing my research" http://onsnetwork.org/chartgerink/2015/11/16/elsevier-stopped-me-doing-my-research/ … #opencon #TDM Elsevier stopped me doing my research. Chris Hartgerink
  • 11. I am a statistician interested in detecting potentially problematic research such as data fabrication, which results in unreliable findings and can harm policy-making, confound funding decisions, and hampers research progress. To this end, I am content mining results reported in the psychology literature. Content mining the literature is a valuable avenue of investigating research questions with innovative methods. For example, our research group has written an automated program to mine research papers for errors in the reported results and found that 1/8 papers (of 30,000) contains at least one result that could directly influence the substantive conclusion [1]. In new research, I am trying to extract test results, figures, tables, and other information reported in papers throughout the majority of the psychology literature. As such, I need the research papers published in psychology that I can mine for these data. To this end, I started ‘bulk’ downloading research papers from, for instance, Sciencedirect. I was doing this for scholarly purposes and took into account potential server load by limiting the amount of papers I downloaded per minute to 9. I had no intention to redistribute the downloaded materials, had legal access to them because my university pays a subscription, and I only wanted to extract facts from these papers. Full disclosure, I downloaded approximately 30GB of data from Sciencedirect in approximately 10 days. This boils down to a server load of 0.0021GB/[min], 0.125GB/h, 3GB/day. Approximately two weeks after I started downloading psychology research papers, Elsevier notified my university that this was a violation of the access contract, that this could be considered stealing of content, and that they wanted it to stop. My librarian explicitly instructed me to stop downloading (which I did immediately), otherwise Elsevier would cut all access to Sciencedirect for my university. 
I am now not able to mine a substantial part of the literature, and because of this Elsevier is directly hampering me in my research. [1] Nuijten, M. B., Hartgerink, C. H. J., van Assen, M. A. L. M., Epskamp, S., & Wicherts, J. M. (2015). The prevalence of statistical reporting errors in psychology (1985–2013). Behavior Research Methods, 1–22. doi: 10.3758/s13428-015-0664-2 Chris Hartgerink’s blog post
  • 12. It costs 10 USD to mount an article on (bio)arXiv… So why 2000 USD for a megapub451 article?
  • 13. I can charge whatever I like!! No regulator! academics pay – it’s not their money – and they get glory
  • 14. APCs and Journals MUST GO! arXiv bioRxiv chemRxiv 10$ Commercial publisher 1800$ Review Production Hosting Corporate Branding Marketing philanthropy Shareholder Profit
  • 15. Scientific knowledge saves lives But closed is costing us dearly …
  • 16.
  • 17. …so closed access means people die… …The software will demonstrate how we can search in future … I’m from Congo where Ebola comes from. The Liberia outbreak was predicted 30 years ago in a paywalled paper
  • 18. Semantic Fulltext • EuropePMC coherent OpenAccess • getpapers: query, download (through API). • AMI filters, checks[1], transforms facts in papers. • sequences, species, genera, genes, dictionaries [0] All operations shown run in total of <3 minutes. [1] Dictionaries and lookup. [2] Usable from home by anyone. Zika endemic areas, Wikimedia CC-BY-SA
  • 19. Open Components • All the literature – free FULLTEXT everywhere • Universal dictionary • Open software – modular • FRICTIONLESS – no gatekeepers • CC BY, CC0, BSD/MIT/Apache/GNU,
  • 20. PREPRINTS!! Crossref EuropePMC Wikidata getpapers AMI We can change all that! We can do everything ourselves! Look … demo
  • 21. https://www.wikidata.org/wiki/Wikidata:WikiFactMine ContentMine thanks the WikimediaFoundation for support 15 million articles, over 200 dictionaries
  • 22. All the world’s 5 million FAIR Open Scientific articles (* 0.1 MB = 0.5 TB), indexed by ContentMine. Disk: 30 GBP. Raspberry Pi 3: 50 GBP. CC BY, PeterMR. Disk, Raspberry Pi, Power
  • 23. *** getpapers runs FAST! Downloads 50 papers / sec => 3000 / min => 200,000 /hour *** AMI-search: Dictionaries based on anything in Wikidata (50 million items!) or your own. We show country, brainparts, funders, disease… looking for feedback, volunteers, examples
  • 24. OpenNotebookScience Jean-Claude Bradley presented with BlueObelisk by Egon Willighagen
  • 25. DEMO!! (a) What is “neuroimaging”? getpapers -q "neuroimaging" -x -k 100 -o neuro; ami-search-cooccur neuro country disease funders (b) What does the MRC unit do? getpapers -q "MRC Cognition and Brain Studies Unit" -x -k 2000 -o cbsu; ami-search-cooccur neuro country brainparts braincognition funders animaltesting
  • 26.
  • 27. ECR communities we work with • Open MOOC (Jon Tennant) • OpenKnowledge Maps (Peter Kraker) • Unpaywall (Heather Piwowar) • WorldBrain (Oli Sauter) • And ContentMine Fellows • Alexandra Bannach-Brown (Edinburgh, Bond) (neuroscience and animal experiments) • And …
  • 28. AMI-Bio Proposal to Mozilla. We invite you to submit a full application for AMI-bio: Citizen search and use of the biomedical literature - Request ID number MF-1811-05957. Please submit your application by 11/30/2018.
  • 29. Guanyang Zhang • Biology, Arizona • “My ContentMine Fellowship project will focus on mining weevil-plant associations from literature records.” • “Motivation. Comprising ~70,000 described and 220,000 estimated species, weevils (Curculionoidea) are one of the most diverse plant-feeding insect lineages and constitute nearly 5% of all known animals.” • “Knowledge of host plant associations is critical for pest management, conservation, and comparative biological research. This knowledge is, however, scattered in 300 years of historical literature and difficult to access.” • Weevil-plant association network graph made with Google Fusion Table. Each blue circle is a weevil tribe and yellow circle a plant genus. The size of a circle represents the number of associations.
  • 30. Neo Christopher Chung • Warsaw, Computational Biology • Wants to find out geographic and temporal differences in the use of genomic software tools
  • 31. ContentMine Workshops on Mining. Chris Kittel, CM, at Mozfest 2015. Stefan Kasberger, CM
  • 32. Julia Reda, Pirate MEP, running ContentMine software to liberate science 2016-04-16
  • 33. Lars Willighagen • 15 years old, NL • Wants: extract data about conifers (relations to chemicals, height etc.) • Outcome: database with webpage containing conifer properties • Table Facts Visualiser DEMO • Card DEMO • Word Cloud • “I applied to this fellowship to learn new things and combine the ContentMine with two previous projects I never got to finish, and I got really excited by the idea and the ContentMine at large.”
  • 34. bioRxiv in Citizen Health Search (CHS). A proposal to Wellcome Trust (Open Research in Health call) with ContentMine, Cochrane and UCL-EPPI (CCU). CHS puts semantic search on the desktop of the searcher. We index all the visible Medical literature, normalize, section and index against a bank of user-chosen dictionaries. CHS takes input from EPMC, bioRxiv and emerging community sources such as Crossref, Unpaywall and outputs to Zenodo, Wikidata and CM-Science Source. Citizen Dashboard
  • 35. Question/s • “How can I help?” – Create dictionaries – Document your voyage – Spread the word – Advocate – Meet at the pub for hacking? – Code (especially downstream - visualisation) • Anyone seriously interested in automatic extraction of data from tables and plots?
  • 36. http://www.budapestopenaccessinitiative.org/read … an unprecedented public good. … … completely free and unrestricted access to [peer-reviewed literature] by all scientists, scholars, teachers, students, and other curious minds. … …Removing access barriers to this literature will accelerate research, enrich education, share the learning of the rich with the poor and the poor with the rich, make this literature as useful as it can be, and lay the foundation for uniting humanity in a common intellectual conversation and quest for knowledge. (Budapest Open Access Initiative, 2003)
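
Several slides quote back-of-envelope figures: the download rate in Chris Hartgerink's blog post (slide 11), the disk space for 5 million articles (slide 22), and the getpapers download speed (slide 23). The sketch below simply redoes that arithmetic. It assumes decimal units (1 TB = 1,000,000,000 kB) and treats the slides' round numbers as exact; the variable names are illustrative only.

```python
# Back-of-envelope checks of figures quoted on slides 11, 22 and 23.
# All inputs are the round numbers given on the slides.

# Slide 11: roughly 30 GB downloaded over roughly 10 days.
gb_per_day = 30 / 10
gb_per_hour = gb_per_day / 24
gb_per_min = gb_per_hour / 60

# Slide 22: 5 million articles at 0.1 MB (100 kB) each.
# Assumption: decimal units, so 1 TB = 1_000_000_000 kB.
total_kb = 5_000_000 * 100
total_tb = total_kb / 1_000_000_000

# Slide 23: 50 papers per second.
papers_per_min = 50 * 60
papers_per_hour = papers_per_min * 60

print(f"slide 11: {gb_per_day:.0f} GB/day, {gb_per_hour:.3f} GB/h, {gb_per_min:.4f} GB/min")
print(f"slide 22: {total_tb} TB")
print(f"slide 23: {papers_per_min} papers/min, {papers_per_hour} papers/hour")
```

The slide 11 and slide 22 figures come out as quoted. For slide 23, 3000 papers per minute gives 180,000 per hour, so the 200,000 on the slide reads as a rounded-up figure.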
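
Slides 18, 23 and 25 describe searching downloaded papers against dictionaries (country, disease, funders and so on) and looking at which terms co-occur. The following is a minimal sketch of that idea only, not AMI's implementation: the two tiny dictionaries, the sample sentences, and the function names `find_terms` and `cooccurrence` are all hypothetical, and matching is plain lower-cased whole-word lookup.

```python
import re
from collections import Counter
from itertools import product

# Hypothetical toy dictionaries; real dictionaries would be far larger.
DICTIONARIES = {
    "country": {"liberia", "congo", "brazil"},
    "disease": {"ebola", "zika"},
}


def find_terms(text, terms):
    """Return the dictionary terms that occur as whole words in the text."""
    words = set(re.findall(r"[a-z]+", text.lower()))
    return terms & words


def cooccurrence(docs, dict_a, dict_b):
    """Count documents in which a term from dict_a appears with a term from dict_b."""
    counts = Counter()
    for text in docs:
        hits_a = find_terms(text, DICTIONARIES[dict_a])
        hits_b = find_terms(text, DICTIONARIES[dict_b])
        # Each (term_a, term_b) pair is counted once per document.
        counts.update(product(sorted(hits_a), sorted(hits_b)))
    return counts


# Made-up sentences standing in for downloaded full text.
docs = [
    "Ebola virus antibodies were detected in Liberia.",
    "Zika is endemic in parts of Brazil.",
    "An Ebola outbreak in Congo and Liberia.",
]

table = cooccurrence(docs, "country", "disease")
for (country, disease), n in table.most_common():
    print(country, disease, n)
```

Counting each pair once per document keeps a long paper that repeats a term from dominating the table; a per-mention count would be an equally reasonable choice.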