The Story of My Research 
developing a bottom-up computational approach to 
investigate microbial diversity 
Qingpeng Zhang 
Department of Computer Science and Engineering 
Michigan State University 
Supervisor: Dr. Titus Brown
odyssey? 
[Timeline figure, 2008-2014: start study/research metagenomics; khmer development; diversity analysis on k-mer level; digital normalization; Osedax symbionts; diversity analysis on read level (IGS); GPGC soil sample. Caption: developing a bottom-up computational approach to investigate microbial diversity]
2008: metagenomics
“Big Data!”
2009: microbial diversity
[Diagram: Microbial diversity. Labels: assembly; binning/annotation; reference; similarity-based; composition-based]
How many things are there in the sample? - alpha diversity
How different are the samples? - beta diversity
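The two questions above can be made concrete with a small sketch. The slides do not say which diversity measures were used, so the choices here (richness and the Shannon index for alpha diversity, Bray-Curtis dissimilarity for beta diversity) are assumptions for illustration, as are the function names and the toy samples.

```python
import math
from collections import Counter


def richness(counts):
    """Alpha diversity as richness: how many distinct items were observed."""
    return sum(1 for c in counts.values() if c > 0)


def shannon(counts):
    """Alpha diversity as the Shannon index H = -sum(p * ln p)."""
    total = sum(counts.values())
    return -sum(
        (c / total) * math.log(c / total) for c in counts.values() if c > 0
    )


def bray_curtis(a, b):
    """Beta diversity as Bray-Curtis dissimilarity between two samples.

    0.0 means identical abundance profiles, 1.0 means nothing is shared.
    Assumes both samples are non-empty.
    """
    keys = set(a) | set(b)
    shared = sum(min(a.get(k, 0), b.get(k, 0)) for k in keys)
    return 1 - 2 * shared / (sum(a.values()) + sum(b.values()))


# Hypothetical abundance tables (item -> count) for two samples.
sample_a = Counter({"taxon1": 5, "taxon2": 3, "taxon3": 2})
sample_b = Counter({"taxon1": 4, "taxon4": 6})

print(richness(sample_a), shannon(sample_a))
print(bray_curtis(sample_a, sample_b))
```

The "items" counted here could be species, OTUs, k-mers, or reads; the measures themselves do not depend on that choice.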
"Nothing works, everything sucks."
NO!
2009: k-mer counting
2010-now: GPGC
How many things are there in the sample? - alpha diversity
How does agricultural soil differ from native soil? - beta diversity
2010-now: khmer
• My contributions:
• algorithm design and analysis: exploring the underlying mathematics and the choice of optimal parameters
• contributing code, including unique k-mer counting, overlap k-mer counting, optimal parameter choice, and other features related to my specific research project
• benchmarking, testing, and actually using it
• exploring applications such as error trimming, filtering low-abundance reads, and digital normalization; suggesting features
• working on the khmer manuscript
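As a rough illustration of the counting tasks listed above, here is a minimal sketch using exact counting with Python's standard library. It is not khmer's code or API. The helper names and toy reads are hypothetical, and reading "overlap k-mer counting" as "k-mers shared between two data sets" is an assumption. Reverse complements are ignored for brevity.

```python
from collections import Counter


def kmers(seq, k):
    """Yield every k-mer (substring of length k) in seq."""
    for i in range(len(seq) - k + 1):
        yield seq[i:i + k]


def count_kmers(reads, k):
    """Count how often each k-mer occurs across a collection of reads."""
    counts = Counter()
    for read in reads:
        counts.update(kmers(read, k))
    return counts


# Hypothetical toy data sets.
reads_a = ["ATGGCA", "TGGCAT"]
reads_b = ["GGCATT"]
k = 4

counts_a = count_kmers(reads_a, k)
counts_b = count_kmers(reads_b, k)

unique_a = len(counts_a)                      # distinct k-mers in data set A
shared = len(set(counts_a) & set(counts_b))   # k-mers present in both sets

print(unique_a, shared)
```

Exact counting like this keeps every distinct k-mer in memory, which is the practical limit that motivates more compact counting structures on large metagenomic data sets.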
2010-2012: diversity analysis on k-mer level
2011-2012: diginorm
median k-mer frequency to represent the sequencing coverage of the read
useful for diversity analysis
Digital normalization
removing redundant reads
useful for assembly
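The idea on this slide can be sketched in a few lines: estimate a read's coverage as the median count of its k-mers among the reads kept so far, and drop the read as redundant once that estimate reaches a cutoff. This is a simplified sketch with exact in-memory counts, not the khmer implementation. The function names, the cutoff value, and the toy reads are illustrative assumptions.

```python
from collections import Counter
from statistics import median


def kmers(seq, k):
    """Return all k-mers (substrings of length k) of seq."""
    return [seq[i:i + k] for i in range(len(seq) - k + 1)]


def median_kmer_count(read, counts, k):
    """Estimate the coverage of a read as the median count of its k-mers."""
    return median(counts[km] for km in kmers(read, k))


def normalize(reads, k, cutoff):
    """Keep a read only while its estimated coverage is below the cutoff."""
    counts = Counter()
    kept = []
    for read in reads:
        if len(read) < k:
            continue  # too short to contain any k-mer
        if median_kmer_count(read, counts, k) < cutoff:
            counts.update(kmers(read, k))  # only kept reads add to the counts
            kept.append(read)
    return kept


# Hypothetical data: one read sequenced five times, another seen once.
reads = ["ATGGCATT"] * 5 + ["CCGTAAGC"]
kept = normalize(reads, k=4, cutoff=2)
print(kept)
```

Using the median rather than the mean makes the estimate less sensitive to a few k-mers whose counts are distorted by sequencing errors or repeats.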
2012-2013: symbionts
My contributions:
• diginorm/assembly/binning/annotation
• genome completeness estimation
• 94% complete Rs1
• 66-89% complete Rs2
• some transcriptome analysis
• other bioinformatics support
2012-now: diversity analysis on read level
IGS (informative genomic segment) can represent the novel information of a genome
We can use all the data, not only the data we understand!
[Figure: two groups of example segments. First group: AFGH, FGHI, ABCE, AABC, ABCD, AAAB. Second group: ABCD, FGHI, ABCE, AFGH, AABC, AAAB]
Improve the pipeline 
khmer diginorm error correction
Sorcerer II Global Ocean Sampling Expedition
2010-now: GPGC
Future work
• Finish the IGS-based diversity analysis paper
• Refine the pipeline and adjust the statistical method to fit IGSs
• More real data sets
  • MetaHIT (Metagenomics of the Human Intestinal Tract) (in progress)
  • HMP (Human Microbiome Project) (in progress)
  • GPGC (soil) (in progress)
  • Ballast water virome (in progress)
• Finish a review of the methods and applications of k-mer counting in bioinformatics (will also be part of my dissertation)
• Expand the application of IGS
  • sequencing depth/effort estimation, genome size estimation
  • read binning/classification based on coverage profile across samples
  • relate IGS to phylogenetic information and function
  • extract IGSs (reads) according to different coverage profiles (shared by all
Acknowledgement 
● Dr. Titus Brown 
● Lab members of GED 
● Elijah Lowe 
● Jiarong Guo 
● Camille Scott 
● Michael Crusoe 
● Luiz Irber 
● Dr. Sherine Awad 
● Former members of GED 
● Dr. Adina Howe 
● Eric McDonald 
● Dr. Jason Pell 
● Dr. Likit Preeyanon 
● RDP 
● Dr. Jim Cole 
● Jordan Fish
