Future Avenues for Open Data




         Corey Chivers
  PhD Candidate, McGill University
          Data Hacker
          @cjbayesian
      bayesianbiologist.com
Open Data in Science
●   Science is supposed to be a bastion of
    openness (data and otherwise).
●   Publication incentives get in the way of this.
●   However, the future looks bright!
Full text journal articles
●   Study scientific products (articles) as data.
●   Insights beyond the metadata
●   ~200,000 p-values reported in ~30,000
    ecology journal articles.
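A minimal sketch of how reported p-values might be pulled out of article full text. The regular expression, the function name extract_p_values, and the sample sentence are illustrative assumptions, not the method used to obtain the counts above; real articles report p-values in many more formats than this pattern covers.

```python
import re

# Hypothetical pattern (an assumption): matches forms such as
# "P = 0.151", "p<0.05", or "P ≤ .001".
P_VALUE_RE = re.compile(r"\b[Pp]\s*([<>=≤≥])\s*(0?\.\d+|[01](?:\.0+)?)")


def extract_p_values(text):
    """Return a list of (comparator, value) pairs for each p-value found."""
    return [(m.group(1), float(m.group(2))) for m in P_VALUE_RE.finditer(text)]


# Made-up sample sentence in the style of a results section.
sample = "Density was uncorrelated with length (P = 0.151); a second test gave p < .001."
print(extract_p_values(sample))  # [('=', 0.151), ('<', 0.001)]
```

Running a function like this over every article in a corpus and pooling the results is one way to treat the articles themselves as a data set.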
                          ... To quantify the technical level of any theory presented in the articles, we
                          counted equations, inequalities, and other mathematical expressions (hereafter
                          referred to simply as “equations”) in the main text and any printed appendixes. We
                          divided this count by the number of pages to give a measure of equation density,
                          which ranged from 0 to 7.29 equations per page (mean ± SEM: 0.43 ± 0.04) and was
                          uncorrelated with the length of the article (r₆₄₇ = 0.056, P = 0.151). To assess
                          impact, we obtained citation data for these articles from the Science Citation Index
                          Expanded on the Thomson Reuters Web of Science in May 2011, excluding any
                          self-citations (i.e., citing papers for which one or more of the author surnames
                          matched one or more of the author surnames for the cited paper). The number of
                          citations varied widely, ranging from 0 to 374 with a mean ± SEM of 44.80 ± 1.98
                          citations (excluding self-citations). Controlling for a significant positive effect
                          of paper length (Table 1, All citations), the use of equations has a striking
                          influence on this measure of impact. Equation density negatively affects citation
                          rates, leading on average to 22% fewer citations for each additional equation per
                          page (Table 1, All citations). We might expect this effect to be driven largely by a
                          reduction in nontheoretical citations. To investigate this hypothesis, we searched
                          for the term “model*” (excluding some common empirical uses such as “experimental
                          model*”) in the title or abstract of the citing articles and used the presence of
                          this term as a proxy for whether the citing paper was a theoretical one. This search
                          identified 6,229 (22.2%) of the 28,068 citing articles as “theoretical.” We validated
                          our proxy by examining a randomly selected subset of 200 citing articles, which
                          showed that 84.5% were correctly classified as theoretical or nontheoretical. As
                          expected, the negative effect of equation density is strongest for nontheoretical
                          papers, which provide 27% fewer citations for each additional equation per page
                          (Table 1, Nontheoretical citations). Articles less than 10 pages long with up to 0.5
                          equations per page are just as well ...
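The quantities in the excerpt above can be reproduced with a few lines of arithmetic. This is a rough sketch: the function names are hypothetical, and it assumes the reported "22% fewer citations for each additional equation per page" compounds multiplicatively, which the excerpt does not state explicitly.

```python
def equation_density(n_equations, n_pages):
    """Equations per page, as defined in the excerpt (count divided by pages)."""
    return n_equations / n_pages


def relative_citations(extra_density, drop_per_unit=0.22):
    """Expected citations relative to baseline for a given increase in density.

    Assumption: each additional equation per page multiplies citations
    by (1 - drop_per_unit).
    """
    return (1.0 - drop_per_unit) ** extra_density


# Share of citing articles flagged as "theoretical" by the "model*" proxy.
theoretical_share = 6229 / 28068

print(equation_density(5, 10))               # 0.5 equations per page
print(round(relative_citations(2), 4))       # 0.6084
print(round(100 * theoretical_share, 1))     # 22.2
```

Under that multiplicative assumption, an article with two more equations per page than another would be expected to receive roughly 61% as many citations, other things being equal.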
●   R is an open source data analysis
    & statistics language.
●   Powerful plotting and statistics built in.
●   Huge community of developers and
    statisticians providing customized packages to
    do just about every data
    crunching/analysis/ML task under the sun.
Community Meetups
●   Getting data hackers together
●   Bridging academia and the Real World
●   Sharing tools and data
●   Collaborating together to bring the awesome
Open Data in Science. Corey Chivers
