Considerations in bioinformatics
analyses and study design
Elana J Fertig
Johns Hopkins University
Why study design and bioinformatics
pipelines?
Let’s team up to avoid painful power
calculation discussions
• https://www.youtube.com/watch?v=PbODigCZqL8
Is there a boundary between standard
bioinformatics and AI/data science?
Data science and statistics are a continuum
that must work together for best analyses
How does bioinformatics work?
When to contact the bioinformatician?
“To call in the statistician after the experiment is done is no more
than asking him to perform a post-mortem examination: he may
be able to say what the experiment died of.” Ronald Fisher
Why? Am I wasting your time? Never.
Are you really just a control freak? Ok, maybe.
The GIGO principle of computer science:
Garbage In Garbage Out
Best analyses come from good data cleaning
and study design
Considerations for study design
• Sample preparation impacts which technologies you can use
• Biological hypothesis should drive technology selection
• “Off-label” use of technologies impacts technology protocols (e.g.,
TCR sequence, virus, or splice variant detection from bulk or
scRNA-seq)
• Design the study to anticipate technical artifacts that may impact
data quality (e.g., library, sequencing run, batch, technician, date
of processing, age of sample, etc.).
Measure twice, cut once
Core coordination minimizes off-label analysis costs
Off-label informatics
tools work on raw data
Published bladder cancer microarray data set
Leek et al. 2010
Even large consortia datasets like TCGA have
batch effects
Fortin et al., 2014
Design studies to avoid confounding technical
artifacts and biological covariates
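One way to act on this at the bench is to spread each biological group evenly across batches before any sample is processed. The sketch below is a minimal illustration, not a prescribed protocol: the function name `assign_batches`, the sample IDs, and the arm labels are all hypothetical, and it assumes a single biological covariate to balance.

```python
import random
from collections import defaultdict

def assign_batches(samples, n_batches, seed=0):
    """Assign samples to batches so each condition is spread across batches.

    samples: list of (sample_id, condition) pairs (hypothetical layout).
    Returns a dict mapping sample_id -> batch index.
    """
    rng = random.Random(seed)  # fixed seed so the layout can be regenerated
    by_condition = defaultdict(list)
    for sample_id, condition in samples:
        by_condition[condition].append(sample_id)

    assignment = {}
    offset = 0
    for condition in sorted(by_condition):
        ids = by_condition[condition]
        rng.shuffle(ids)  # randomize order within each condition
        for i, sample_id in enumerate(ids):
            # deal samples round-robin across batches
            assignment[sample_id] = (i + offset) % n_batches
        offset += len(ids)  # continue where the previous condition stopped
    return assignment

# Toy example: 6 samples per trial arm, 3 processing batches.
samples = [(f"S{i:02d}", "ArmA" if i < 6 else "ArmB") for i in range(12)]
layout = assign_batches(samples, n_batches=3)
```

With this toy input every batch receives two samples from each arm, so batch and arm are not confounded. Real designs with several covariates (site, individual, time point) usually need more care than this round-robin scheme.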
Batch effects change the correlation structure
between genes
Leek et al. 2010
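A toy calculation can make this concrete. The sketch below uses made-up numbers for two hypothetical genes that are uncorrelated within each batch; adding the same additive shift to both genes in the second batch is an assumption chosen for simplicity, not a model taken from the cited paper.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation of two equal-length lists."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)

# Two genes measured on four samples in batch 1 (made-up values).
gene_x_b1 = [0.0, 1.0, 0.0, 1.0]
gene_y_b1 = [0.0, 0.0, 1.0, 1.0]

# Batch 2: same biology, but every measurement reads 5 units higher.
shift = 5.0
gene_x_b2 = [v + shift for v in gene_x_b1]
gene_y_b2 = [v + shift for v in gene_y_b1]

within_batch = pearson(gene_x_b1, gene_y_b1)
pooled = pearson(gene_x_b1 + gene_x_b2, gene_y_b1 + gene_y_b2)
print(f"within batch: {within_batch:.2f}, pooled across batches: {pooled:.2f}")
```

Within a batch the two genes show no correlation, yet pooling the batches produces a strong apparent correlation that comes entirely from the shared shift.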
Study design and data cleaning are the most
critical parts of any analysis
We can mathematically correct for known
batch effects in data with good study designs
We can correct for batch effects if we know
they are there
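As a minimal sketch of what such a correction can look like, the example below removes an additive batch shift from one gene by centering each batch on its own mean. It assumes the batch labels are known and the design is balanced; the function name `center_by_batch` and the numbers are hypothetical, and established batch-correction methods are considerably more sophisticated.

```python
from statistics import mean

def center_by_batch(values, batches):
    """Remove additive batch shifts from one gene's expression values.

    values: expression of a single gene per sample (toy numbers).
    batches: known batch label per sample, in the same order.
    """
    grand = mean(values)
    batch_means = {
        b: mean(v for v, vb in zip(values, batches) if vb == b)
        for b in set(batches)
    }
    # subtract each sample's batch mean, then restore the overall level
    return [v - batch_means[b] + grand for v, b in zip(values, batches)]

# Toy balanced design: each batch holds one control and one treated sample,
# and batch "B2" reads 2 units higher than batch "B1".
values = [5.0, 7.0, 7.0, 9.0]
batches = ["B1", "B1", "B2", "B2"]
condition = ["ctrl", "treated", "ctrl", "treated"]
corrected = center_by_batch(values, batches)
```

After centering, the two control samples agree with each other and the treated samples still sit 2 units above them. If each batch had contained only one condition, the same centering would have erased the treatment difference along with the batch shift, which is why the correction depends on a good design.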
Recognizing confounded designs
• Trial Arm A in one batch and trial Arm B in another
• Pre-treatment in one batch and post-treatment in another
• Responders in one batch and non-responders in another
• Designs can get complicated. E.g., what do you do if you have
multiple tissue sites from multiple individuals and you want to
compare both site and individual differences?
We love to help
during design!
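A quick check for the simple cases above is to tabulate which biological groups appear in each batch. The sketch below is an illustration under assumed inputs: the helper names and the run and arm labels are hypothetical, and it only flags batches that contain a single group, not the more complicated multi-factor designs.

```python
from collections import defaultdict

def groups_per_batch(batches, groups):
    """Map each batch label to the set of biological groups it contains."""
    table = defaultdict(set)
    for batch, group in zip(batches, groups):
        table[batch].add(group)
    return dict(table)

def single_group_batches(batches, groups):
    """Batches holding only one group: batch and biology cannot be separated there."""
    table = groups_per_batch(batches, groups)
    return sorted(b for b, g in table.items() if len(g) == 1)

# Confounded: Arm A entirely in run1, Arm B entirely in run2.
bad_batches = ["run1"] * 3 + ["run2"] * 3
bad_arms = ["A"] * 3 + ["B"] * 3

# Not confounded: both arms appear in both runs.
good_batches = ["run1", "run2"] * 3
good_arms = ["A", "A", "B", "B", "A", "B"]

flagged_bad = single_group_batches(bad_batches, bad_arms)
flagged_good = single_group_batches(good_batches, good_arms)
```

The first layout flags both runs, while the second flags none. Running a check like this on the sample sheet before processing is much cheaper than discovering the problem after sequencing.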
What should we do?
Leek et al. 2010
Bioinformatics as a team sport and best
practices
• Early consultation for sample
preparation, technology selection,
and study design
• Interactive collaboration during
data preprocessing and cleaning
• Reproducible scripts to include as
manuscript supplements or online
to document analysis steps
• Open source software for
dissemination of any new
algorithms employed in analysis
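For the reproducible-scripts point, one lightweight habit is to have every script record its random seed, parameters, and interpreter version alongside its results. The sketch below is only an illustration: `run_analysis` is a hypothetical stand-in for a real analysis step, and the fields in the record are an assumed minimum rather than a standard.

```python
import json
import platform
import random

def run_analysis(seed, n_draws=3):
    """Stand-in for an analysis step that uses random numbers."""
    rng = random.Random(seed)
    return [rng.random() for _ in range(n_draws)]

def provenance(seed, params):
    """Collect what a reader needs to rerun the script."""
    return {
        "python_version": platform.python_version(),
        "seed": seed,
        "params": params,
    }

record = provenance(seed=42, params={"n_draws": 3})
result = run_analysis(seed=record["seed"], **record["params"])

# Store the record next to the results, e.g. in a manuscript supplement.
print(json.dumps(record, indent=2, sort_keys=True))
```

Rerunning with the same recorded seed and parameters reproduces the same draws, so the record doubles as documentation of the analysis step.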
Summary
• It is never too early to contact your friendly neighborhood
bioinformatician, and we can consult on:
• Sample preservation
• Technology selection
• Study design
• Analysis plan and preprocessing
• Data parasiting
• Coordinated collaboration in the data generation process and with
the sequencing core minimizes costs and maximizes data quality

Bioinformatics workflows and study design
