Future Avenues for Open Data




         Corey Chivers
  PhD Candidate, McGill University
          Data Hacker
          @cjbayesian
      bayesianbiologist.com
Open Data in Science
●   Science is supposed to be a bastion of
    openness (data and otherwise).
●   Publication incentives get in the way of this.
●   However, the future looks bright!
Full text journal articles
●   Study scientific products (articles) as data.
●   Insights beyond the metadata
●   ~200,000 p-values reported in ~30,000
    ecology journal articles.
                          ... To quantify the technical level of any theory presented in the articles, we
                          counted equations, inequalities, and other mathematical expressions (hereafter
                          referred to simply as “equations”) in the main text and any printed appendixes. We
                          divided this count by the number of pages to give a measure of equation density,
                          which ranged from 0 to 7.29 equations per page (mean ± SEM: 0.43 ± 0.04) and was
                          uncorrelated with the length of the article (r647 = 0.056, P = 0.151). To assess
                          impact, we obtained citation data for these articles from the Science Citation Index
                          Expanded on the Thomson Reuters Web of Science in May 2011, excluding any
                          self-citations (i.e., citing papers for which one or more of the author surnames
                          matched one or more of the author surnames for the cited paper). The number of
                          citations varied widely, ranging from 0 to 374 with a mean ± SEM of 44.80 ± 1.98
                          citations (excluding self-citations). Controlling for a significant positive effect
                          of paper length (Table 1, All citations), the use of equations has a striking
                          influence on this measure of impact. Equation density negatively affects citation
                          rates, leading on average to 22% fewer citations for each additional equation per
                          page (Table 1, All citations). We might expect this effect to be driven largely by a
                          reduction in nontheoretical citations. To investigate this hypothesis, we searched
                          for the term “model*” (excluding some common empirical uses such as “experimental
                          model*”) in the title or abstract of the citing articles and used the presence of
                          this term as a proxy for whether the citing paper was a theoretical one. This search
                          identified 6,229 (22.2%) of the 28,068 citing articles as “theoretical.” We validated
                          our proxy by examining a randomly selected subset of 200 citing articles, which
                          showed that 84.5% were correctly classified as theoretical or nontheoretical. As
                          expected, the negative effect of equation density is strongest for nontheoretical
                          papers, which provide 27% fewer citations for each additional equation per page
                          (Table 1, Nontheoretical citations). Articles less than 10 pages long with up to 0.5
                          equations per page are just as well ...
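Treating full-text articles as data, as in the p-value count above, can be sketched in a few lines. This is a minimal illustration, not the method used for the talk: the regular expression, the function name `extract_p_values`, and the second sentence of the sample text are all assumptions made for the example. It is written in Python for brevity, though the same idea carries over to R.

```python
import re

# Matches reports such as "P = 0.151", "p<0.05", or "P ≤ .001".
# The pattern is an illustrative assumption; it only handles values
# written as a decimal fraction below 1 and ignores forms like "P = 1.0".
P_VALUE = re.compile(r"\b[Pp]\s*([<>=≤≥])\s*(0?\.\d+)")


def extract_p_values(text):
    """Return (comparator, value) pairs for each p-value found in text."""
    return [(op, float(num)) for op, num in P_VALUE.findall(text)]


# First sentence adapted from the excerpt above; the second is made up.
sample = (
    "Equation density was uncorrelated with the length of the article "
    "(r647 = 0.056, P = 0.151). A second, made-up result: p < .05."
)

print(extract_p_values(sample))
```

Run over the full text of many articles, a function like this would yield the kind of table (one row per reported p-value) that the slide describes. A real pipeline would also need to handle PDF extraction noise and alternative notations.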
●   R is an open source data analysis
    & statistics language.
●   Powerful plotting and statistics built in.
●   Huge community of developers and
    statisticians providing customized packages to
    do just about every data
    crunching/analysis/ML task under the sun.
Community Meetups
●   Getting data hackers together
●   Bridging academia and the Real World
●   Sharing tools and data
●   Collaborating together to bring the awesome