Bioinformatics!
What is it good for?
Stephen Newhouse & Paul Agapow
Festival of Genomics London 2019
BioinformaticsLondon
https://www.meetup.com/Bioinformatics-London/events/
Nice training set, where’s your data?
The right tool for the job
Thresholds are there for a reason
Swallowed a thesaurus
(didn’t swallow the dictionary)
Army of One
Real Scientists
Army of One (take 2)
The Gripe Curve
Army of One (take 3)
Perhaps we could run ComBat?
Low variance
Multi-omics
Reviewer #2
Optimism
Somehow, this is my fault
The most differentially expressed genes
SEPT7
SEPT2
MARCH1
DEC1
etc.
Precision medicine
Alphabet spaghetti code
With my abundant free time
blast -max_target_seqs
Now you have two problems
Solutions?
Maybe we should have higher standards
Maybe we should give up
THANKS!
✘ Nathan Lau
✘ “Dr Bioinformatician”
✘ “Dr A.I.”
✘ Fabian Klötzl
✘ Kevin G
✘ Iddo Friedberg
✘ Stephen Ross
✘ Katherine James
✘ Richard Emes
✘ Ming Tang
✘ Ian Holmes
✘ Frederick Ross
✘ Ben van Zwanenberg
✘ And anonymous contributors ...

Bioinformatics! (What is it good for?)

Editor's Notes

  1. PAUL: Good morning and welcome to the Festival of Genomics. Every year we gather here to see talks and workshops about the genomic revolution, about how arcane molecular technologies and advanced computation are driving the greatest revolution in healthcare ever. How we are teasing apart the tangled web of disease, how sequencing is allowing quicker, targeted diagnosis. Every day, we are witness to a deluge of biomedical wonders. STEPHEN: Unless you actually work in the field and you know the reality - You know how things actually work, how the “analytical sausage” is made - You know about bad software, failing hardware, ignorance of statistics, poorly understood technology, half-baked experiments and rushed study designs. You know that behind every groundbreaking Nature paper, there’s a poor harassed bioinformatician in a dim basement room complaining to their PI that those results don’t mean what they think they mean, and hoping the PI doesn't open up the raw results in an Excel file and ask why their favourite gene is missing from the top 10 results...
  2. STEPHEN: So, who are we? We are two working, card-carrying bioinformaticians. I am Stephen Newhouse [INSERT SHORT BIO] PAUL: And I am Paul Agapow. Until recently I was the lead of Translational Bioinformatics at the Data Science Institute at Imperial College, before joining a large pharmaceutical company in December. I can’t tell you which one, but I can say that it rhymes with ‘AstraZeneca’. STEPHEN: Together with Nathan Lau of QMUL, we are the organisers of Bioinformatics London, a regular meetup for bioinformaticians, computational biologists and genomicists in the capital. At our meetings we usually share announcements of jobs or funding opportunities, have a talk, and then adjourn to the pub where we complain endlessly about our jobs and have a laugh... It's like AA, but for informaticians... If this sounds attractive to you, why not come along? If you’d like to give a talk, even better! PAUL: So today we’re talking back to the rest of the Festival, recounting the small indignities and bad science suffered by bioinformaticians every day. We’ll be asking why bioinformatics is so broken. We’ll be talking about experiments that should not have been done and could not have been done, but were done anyway. For simplicity, we’ll tell all anecdotes and stories in the first person, although they were gathered from our own experience, from the membership of BioinformaticsLondon and from across the web. We'll complain, criticise and maybe even offer a solution or two. Why is Bioinformatics such a mess? Can we fix it?
  3. PAUL: Like much of the world, we seem obsessed by the phrase “big data”, and by the idea that “biomedical big data” will help decipher complex diseases and point the way to patient stratification and precision therapies. All this, despite the fact that “biomedical big data” largely doesn’t exist. Typical clinical trial and transcriptomic datasets have fewer than 100 subjects. While huge GWAS experiments get a lot of press, typically they are in the low thousands, as are most real world datasets of interest. Deep learning typically requires thousands, if not tens of thousands of samples. So does GWAS, although one recent publication alleges that for complex neurological diseases, millions of subjects may be required. Still, I’ve been asked many times to combine or merge trials or datasets to “boost power”, despite the fact that adding subjects gathered from a different population, under a different protocol, using different measurements and metrics actually lowers the power of the dataset.
  4. STEPHEN: For a decade, a lot of bioinformatics was driven by illegible, unmaintainable, cryptic Perl scripts, a write-once, read-never language that prides itself on being opaque and never coming within sight of a software engineer or proper coding practice, or version control... All done in a rush to get the work done quickly and published. PAUL: Now we use R.
  5. PAUL: A microarray analysis revealed no significant results, with no genes showing significantly different expression. Despite this, the PI (a clinician) insisted we examine the “top hit”. It was a testis-specific, Y-encoded gene. The cohort was all female.
  6. STEPHEN: Frequently, clinician scientists and PIs have a tendency to hear words and latch onto them, making them a new catchphrase for anything related to bioinformatics and any kind of “advanced” data analysis - seemingly to make themselves sound "informed"... PAUL: Can I have a Docker? In the cloud? What if we used AI?
  7. PAUL: As a bioinformatician, I’m the go-to expert for sequencing, systems biology, genome analysis, phylogenetics, proteomics, microbiomics, lipidomics, high performance computing, machine learning, systems administration, programming, web development, databasing, dev-ops, laying cables, formatting hard drives and finding out why your email isn’t working.
  8. STEPHEN: For a paper on novel ways to interpret and visualise data, another co-author suggested I shouldn’t be an author on the paper because “there would be more bioinformaticians than real scientists in the authorship list”... Sometimes we get treated worse than PhD students!
  9. PAUL: The PI had a tendency to pose vast, sweeping technical problems and then look at me and say “This is a job for the INFORMATICS TEAM”. There was only one of me and despite my other faults, I have yet to develop multiple personality disorder.
  10. STEPHEN: Let’s do AI, you know, let's build a deep learning / Bayesian / hierarchical model... and you know all this jargon is just going to give you exactly the same results as a simple correlation, t-test or network analysis, don't you? Because in the end we are just comparing means between groups... and plus, we often don’t have the numbers... or the compute (GPUs).
  11. PAUL: I was once forwarded a job ad for a sole staff bioinformatician for a busy hospital department. The JD was the result of 6 universities, trusts and departments, listing 73 “key responsibilities”, which is about 30 minutes per key responsibility per week. The named duties ranged from sequencing, database development, research, writing papers and presenting them at conferences, answering user requests, developing software, installing and networking computers, and training, all the way down to keeping track of reagent levels and making sure there was milk in the fridge.
  12. STEPHEN: For a project I was on, the sample processing procedure had been changed part-way through the study, resulting in a massive batch effect that simply couldn’t be corrected for. When I brought this to the attention of the PI clinicians, they replied "What do you know about this, you don't have any Nature papers." For reference, neither did they. What I had was years of experience working with (bad) data... they weren’t all Nature papers (because of batch effects).
  13. PAUL: Hey, friendly bioinformatics guy! I have RNAseq from one case... can you tell me which genes are differentially expressed? Can we do analysis on it?
  14. PAUL: Hey, friendly bioinformatics guy! We've got a really interesting multi-omic dataset. Why don’t we analyse it? STEPHEN: How many samples? PAUL: 10. STEPHEN: Uh, what sort of background? PAUL: Oh, they’re just random patients - all mixed race, super interesting, right? STEPHEN: How did you get the samples? PAUL: Buccal swabs. STEPHEN: This could be difficult... PAUL: Why not use deep learning and AI? STEPHEN: [sigh]... I could stick it in Docker?
  15. STEPHEN: Don’t let journals and peer-reviewers off the hook. We’ve had papers rejected because they don’t use the latest tech and the latest sexy new software... Sometimes reviewers fail to recognise that the approach they're suggesting is wholly inappropriate for the dataset, or might be economically out of reach of the scientists conducting still-worthy science.
  16. PAUL: Famously, the methodology researcher John Ioannidis has published on mis- and over-interpretation of biomedical research, highlighting researcher bias, unrepresentative samples, underpowered studies, cherry-picking, p-hacking, poor control groups, inappropriate experimental design, multiple hypothesis testing, post hoc analysis and mishandling of outliers. He estimates that 60% of research is incorrect. I heard this and thought “only 60%?” Most scientific research involves researchers painting the bullseye on the wall *after* spraying it with bullets.
  17. STEPHEN: You didn’t consult with me at the start of the experimental design to advise on the samples and/or duplicates needed for significant results. You didn’t ask me what was required to do this kind of analysis. You didn’t budget for any analysis or any of my time. You stored all your data in Excel or - even worse - Word - or even worse - PDFs. And yet, here you are, blaming me for not being able to prove your hypothesis correct.
  18. STEPHEN: While we’re talking about Excel, let’s commemorate those genes that are mangled in more than 5% of publications.
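The mangling in question is Excel silently turning gene symbols such as SEPT7, MARCH1 and DEC1 into dates. As a minimal sketch, one could screen a gene list for at-risk symbols before it goes anywhere near a spreadsheet. The function name `excel_risky_symbols` and the pattern are illustrative assumptions: the pattern covers only the three symbol families named on the slide, not everything Excel might convert.

```python
import re

# Assumption: only the SEPT, MARCH and DEC families from the slide are checked.
# Excel may reinterpret other symbols too; extend the pattern as needed.
DATE_LIKE = re.compile(r"^(SEPT|MARCH|DEC)\d{1,2}$", re.IGNORECASE)


def excel_risky_symbols(symbols):
    """Return the symbols that look like month-plus-day and may be read as dates."""
    return [s for s in symbols if DATE_LIKE.match(s.strip())]


if __name__ == "__main__":
    genes = ["SEPT7", "TP53", "MARCH1", "DEC1", "BRCA1"]
    print(excel_risky_symbols(genes))
```

Flagged symbols can then be checked by hand, or the file can be imported with the gene column explicitly typed as text.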
  19. PAUL: Precision medicine can be defined as repeatedly tweaking your patient cohort, dropping subjects in-and-out, until analysis yields a significant result.
  20. PAUL: Let’s not exclude bioinformatics software authors, who are almost always bioinformaticians themselves. Science and academia are corrosive to proper programming and software engineering. They are largely unable to pay for professional programmers and engineers, resulting in complex software, platforms and systems being constructed by people who literally learnt their skills from a book. Once, to try and understand the results I was getting from a popular bioinformatics program, I started reading the source code. Each file was several thousand lines long, containing large vestigial lumps of code from previous versions: func_1, func_2, func_4, func_5a. Variables were called h, hh, hh2, foo and - delightfully - thing. Of course, standard pillars of professional software engineering like version control, refactoring, profiling and unit testing have no value in an environment where only results matter and not the maintainability of those results. How many programs have been validated on publication with a thorough test suite that demonstrates that the program works? (Rather than the results doing little violence to our expectations.) When those programs are updated, how many are retested against the test suite to see that they still work?
  21. STEPHEN: Here is a slightly scandalous piece of advice about academia: you don’t want to be seen as competent. You don’t want to become known as the person who can do a thing. Thus it was a grave error of judgment when I helped the grad student in the lab next door by setting up remote access and showing them how to run a program. This escalated to doing the same for the other students, then their PI, all while slowly becoming known as “the guy who will do X for you, always, and never says No (but should)”.
  22. STEPHEN: How is it possible that people still don’t know how to use blast, one of the oldest and most commonly used bioinformatics programs in existence? (Also affects -num_descriptions and -num_alignments.)
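The slide title points at `-max_target_seqs`, which is often assumed to return simply the best N hits. One defensive habit, sketched here as an assumption and not as the speakers' recommendation, is to ask BLAST for more hits than you need and pick the top hit per query yourself. The sketch assumes default tabular output (`-outfmt 6`), where the query ID, subject ID and bit score are columns 1, 2 and 12. The function name `best_hits` is hypothetical.

```python
def best_hits(tabular_lines):
    """Map each query ID to its (subject ID, bit score) with the highest bit score.

    Assumes default BLAST tabular columns: qseqid, sseqid, pident, length,
    mismatch, gapopen, qstart, qend, sstart, send, evalue, bitscore.
    """
    best = {}
    for line in tabular_lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment lines
        fields = line.split("\t")
        query, subject = fields[0], fields[1]
        bitscore = float(fields[11])
        # Keep the first hit seen for a query unless a later one scores higher.
        if query not in best or bitscore > best[query][1]:
            best[query] = (subject, bitscore)
    return best


if __name__ == "__main__":
    example = [
        "q1\tsA\t90.0\t100\t10\t0\t1\t100\t1\t100\t1e-20\t150",
        "q1\tsB\t99.0\t100\t1\t0\t1\t100\t1\t100\t1e-50\t200",
        "q2\tsC\t95.0\t80\t4\t0\t1\t80\t1\t80\t1e-30\t120",
    ]
    for query, (subject, score) in best_hits(example).items():
        print(query, subject, score)
```

Selecting on bit score after the search makes the choice of "best" explicit and easy to audit.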
  23. PAUL: When confronted with a complex biomedical problem, there’s a peculiar optimism in the idea “let's make a database”. The irony is increased even more if the database was populated with dubious or untrustworthy data and then that database is used to annotate and identify new additions to the database, resulting in a runaway train. And the irony hits maximum if the bioinformatician that constructed the db is then charged with making queries on the database.
  24. STEPHEN: Isn’t this all ridiculous? Data Analysis is basic, foundational to biomedicine and biotech. Why is it often done so badly? Why are there so many papers with weak or incorrect results, irreproducible results? Why is there so much bad software?
  25. PAUL: We have built a system that rewards rapid and frequent publication, that incentivizes novel and startling findings. We’ve put a bunch of educated and intelligent people into that system, told them that their success and career progression depends on chasing those measures. Then we’re disappointed when people chase those measures and not valuable things like careful and thorough checking of analysis, reproducibility, documentation and verification. Shouldn’t we have higher standards? Shouldn’t funders, research councils and departments insist on higher standards for the research they’re overseeing? Shouldn’t we have higher standards for ourselves?
  26. PAUL: Maybe we should give up. Maybe we should just stop being bioinformaticians. I mean this in several ways. First, there was a time when bioinformatics and biology and biomedicine were almost entirely practiced within the bounds of the university, the hospital and the research institute. People had little or no career flexibility and so just put up with a lot of bad workplaces and bad bosses. This is no longer the case. There is a healthy, expanding commercial sphere for bioinformatics, companies and startups where you can work on interesting and useful problems. The craze for data science has provided another avenue for frustrated bioinformaticians. Those places are not free of the problems we’ve outlined, but there are choices. The academy is not the only show in town. Second, maybe we should stop calling ourselves “bioinformaticians” and stop doing “bioinformatics”. The terms have become so abused and so hopelessly broad as to be meaningless, as to potentially encompass anyone who does something with a computer related to biology in some way: all the way from a Masters student who can use the web version of BLAST, through mathematical ecologists, website designers, genomicists, research software engineers and biostatisticians, to computational chemists. The words are not doing us any good, so let’s walk away from the words. Call yourself anything but a bioinformatician and see if your life gets better. Finally, let's take that to the ultimate end. Let’s kill bioinformatics. Much as economies have moved from being largely farming and industry to being largely services and “thought work”, biomedical science has moved from lab coats and hospital gowns to being dominated by analysis. We’ve commoditized the manual “wet” side of science and almost everyone does “computer work”. Everyone is a bioinformatician. So maybe we should stop doing bioinformatics and do science instead.
STEPHEN: Solutions? Educate the PIs/clinicians/basic scientists; N > 100, analysis plan, timelines; trust us when we say it will take a week; set up a career track for us; expectation management; unplanned work management; involve us from the beginning: treat us like a statistician.