www.P2EP.org
THE CURRENT STATUS OF THE BLUEBERRY GENOME
Robert Reid
rreid2@uncc.edu
Department of Bioinformatics & Genomics
University of North Carolina Charlotte
BLUEPRINTS FOR BLUEBERRY
BLUEBERRY PROJECT TIMELINE
2009~
• 76 bp Illumina GA II sequencing
• 3 kb and some 20 kb 454 pyrosequencing
• 36 bp Illumina sequencing
2011
• 454 pyrosequencing
• 8 kb and 20 kb paired-end inserts
2013
• Illumina HiSeq (5 lanes)
• Illumina Nextera paired-end sequencing
• Vaccinium.org website (WSU)
2014
• MaSuRCA and GARM assembly
• BAC libraries (UF), BAC-end sequencing (NCSU)
2015
• SSPACE (modified) assembly
• Gene annotation, RNA-Seq (Gupta et al., 2015)
• Repeat annotations, map alignments
Some Assembly Numbers
Much room for improvement still
**Estimated genome size = 608 Mb (Costich et al., 1993)
MARKER ALIGNMENT TO SCAFFOLDS
Linkage map             # of markers   # of scaffolds   Size (bp)
*Tetraploid - Draper    689            358              121,530,818
*Tetraploid - Jewel     576            328              112,427,224
Interspecific hybrid    322            190              74,069,152
Diploid                 318            153              56,781,319
Cranberry               138            40               15,934,975
696 scaffolds were assigned to at least one linkage group; their total size was 214 Mb.
*Earlier version of the map markers than what was published
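The per-map scaffold counts in the table add up to more than 696 because a scaffold can carry markers from several linkage maps. A minimal sketch of how the unique count and total anchored size could be derived, assuming each map's marker alignments have already been reduced to a set of scaffold IDs (all IDs and lengths below are made-up toy values, not project data):

```python
from typing import Dict, Set, Tuple


def anchored_scaffolds(
    map_hits: Dict[str, Set[str]], scaffold_lengths: Dict[str, int]
) -> Tuple[Set[str], int]:
    """Return the scaffolds anchored by at least one map and their summed length."""
    # Union across maps, so a scaffold shared by several maps is counted once.
    unique = set().union(*map_hits.values())
    total_bp = sum(scaffold_lengths[s] for s in unique)
    return unique, total_bp


# Hypothetical toy input: scaffold IDs hit by markers from each linkage map.
map_hits = {
    "Draper": {"scaf1", "scaf2", "scaf3"},
    "Jewel": {"scaf2", "scaf3", "scaf4"},
    "Diploid": {"scaf1", "scaf5"},
}
scaffold_lengths = {
    "scaf1": 500_000,
    "scaf2": 300_000,
    "scaf3": 250_000,
    "scaf4": 120_000,
    "scaf5": 80_000,
}

unique, total_bp = anchored_scaffolds(map_hits, scaffold_lengths)
per_map_sum = sum(len(hits) for hits in map_hits.values())
print(f"{len(unique)} unique scaffolds ({per_map_sum} summed per map), {total_bp:,} bp")
```

In the toy data, eight per-map assignments collapse to five unique scaffolds, the same effect that separates the per-map counts in the table from the 696 unique scaffolds.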
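GENOME COMPLETENESS
[Figure: BUSCO² (1645 core genes) chart with categories missing, duplicate, complete, and fragments. Values shown: 48%, 22%, 18%, 12% (the category for each value is not recoverable from the extracted text).]
[Figure: CEGMA¹ (458 core genes) bar chart, 0% to 100%, match vs. no match, for three assemblies: Newbler (454 reads), Nextera hybrid assembly, and Nextera plus BAC-end sequencing. Extracted data labels, run together: 356356350.]
² http://busco.ezlab.org/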
ANNOTATING PATHWAYS VIA GAGE
GAGE identifies 84 KEGG pathways from gene predictions.
Top pathways found:
1. Pyruvate metabolism
2. Beta-alanine metabolism
3. Ribosome biogenesis
4. RNA polymerase
5. Pyrimidine metabolism
Workflow: Predicted genes → Predicted proteins → Align to RefSeq → Map to KEGG → Identify most abundant KEGG pathways
• Predicted genes: RNA-Seq; gene prediction tools (Augustus, GeneMark, SNAP; 100,000 genes)
• Predicted proteins: TransDecoder
• Align to RefSeq: BLASTP to grape or potato NCBI RefSeq
• Map to KEGG: GAGE/pathview
• Identify most abundant KEGG pathways: GAGE
Luo et al., GAGE: Generally Applicable Gene Set Enrichment for Pathways Analysis. BMC Bioinformatics, 2009, 10:161
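The last two steps of the workflow above (map aligned proteins to KEGG, then rank pathways) can be illustrated with a minimal sketch. It assumes the BLASTP results have already been reduced to (query, RefSeq hit, bit score) tuples and that a RefSeq-to-KEGG lookup table is available. The gene names, accessions, and the lookup table are hypothetical. The ranking is a plain count of mapped genes per pathway, not the GAGE statistic.

```python
from collections import Counter
from typing import Dict, Iterable, List, Tuple


def best_hits(blast_rows: Iterable[Tuple[str, str, float]]) -> Dict[str, str]:
    """Keep the highest-bit-score RefSeq hit for each predicted protein."""
    best: Dict[str, Tuple[str, float]] = {}
    for query, subject, bitscore in blast_rows:
        if query not in best or bitscore > best[query][1]:
            best[query] = (subject, bitscore)
    return {query: subject for query, (subject, _) in best.items()}


def pathway_counts(
    hits: Dict[str, str], protein_to_pathways: Dict[str, List[str]]
) -> Counter:
    """Count how many predicted proteins map to each KEGG pathway."""
    counts: Counter = Counter()
    for subject in hits.values():
        for pathway in protein_to_pathways.get(subject, ()):
            counts[pathway] += 1
    return counts


# Hypothetical BLASTP results: (blueberry protein, grape RefSeq hit, bit score).
rows = [
    ("bb_g1", "XP_A", 250.0),
    ("bb_g1", "XP_B", 90.0),
    ("bb_g2", "XP_B", 180.0),
    ("bb_g3", "XP_C", 75.0),
    ("bb_g4", "XP_A", 300.0),
]
# Hypothetical lookup from RefSeq protein to KEGG grape (vvi) pathway IDs.
protein_to_pathways = {
    "XP_A": ["vvi00620"],              # pyruvate metabolism
    "XP_B": ["vvi00620", "vvi00410"],  # pyruvate and beta-alanine metabolism
    "XP_C": [],                        # no pathway annotation
}

hits = best_hits(rows)
counts = pathway_counts(hits, protein_to_pathways)
for pathway, n in counts.most_common():
    print(pathway, n)
```

Taking only the best hit per protein keeps one gene from inflating several pathways through weak secondary alignments. Whether the original analysis filtered hits this way is not stated on the slide.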
MAPPED TO GRAPE
vvi04141 Protein processing in endoplasmic reticulum
vvi00510 N-Glycan biosynthesis
vvi00350 Tyrosine metabolism
vvi00561 Glycerolipid metabolism
vvi03020 RNA polymerase
vvi04120 Ubiquitin mediated proteolysis
vvi03022 Basal transcription factors
vvi00950 Isoquinoline alkaloid biosynthesis
vvi00030 Pentose phosphate pathway
vvi00730 Thiamine metabolism
vvi00960 Tropane, piperidine and pyridine alkaloid biosynthesis
vvi00500 Starch and sucrose metabolism
vvi03420 Nucleotide excision repair
vvi00196 Photosynthesis - antenna proteins
vvi03060 Protein export
vvi00565 Ether lipid metabolism
vvi03430 Mismatch repair
vvi00770 Pantothenate and CoA biosynthesis
vvi00071 Fatty acid degradation
vvi00380 Tryptophan metabolism
vvi00520 Amino sugar and nucleotide sugar metabolism
vvi00600 Sphingolipid metabolism
vvi00040 Pentose and glucuronate interconversions
MAPPING VIA PATHVIEW*
Workflow: Predicted genes → Predicted proteins → Align to RefSeq → Map to KEGG → Overlay onto KEGG pathway
• Predicted genes: RNA-Seq (1M isoforms); gene prediction tools (Augustus, GeneMark, SNAP; 100,000 genes)
• Predicted proteins: TransDecoder
• Align to RefSeq: BLASTP to grape or potato
• Map to KEGG: pathview
• Overlay onto KEGG pathway: pathview
*Luo W, Brouwer C. Pathview: an R/Bioconductor package for pathway-based data integration and visualization. Bioinformatics, 2013
FLAVONOID PATHWAY
2 AVAILABLE ONLINE ANNOTATION RESOURCES
https://www.vaccinium.org (Dorrie Maine - WSU)
http://bioviz.org/igb/index.html (Anne Lorraine)
FUTURE STEPS FOR GENOMICS
• Improving the genome
• Massimo Iorizzo
• Hamid Ashrafi
• Jeannie Rowland
• Improve contiguity
• Resolve repeat regions
• Fill in the gaps
• Improving the linkage maps
• Higher density map
• More anchoring points
• Optical map
• To improve scaffold / contig ordering
ACKNOWLEDGEMENTS
• Allan Brown - CGIAR
• Ying-Chen Lin - NC State
• Mary Ann Lila - NC State
• Ra’ad Gharaibeh - UNCC
• Rachel Walstead - UNCC
• Gregario Lingchanco - UNCC
• Cory R Brouwer - UNCC
• Jeannie Rowland - USDA-ARS
• Dorrie Maine - WSU
• Garron Wright - DHMRI
• Mark Burk - DHMRI
• James Olmstead - U of Florida
• And many more

Update on the assembly and annotation of the blueberry genome


Editor's Notes

  1. P2EP started approximately 3 years ago and is a collaboration between universities and industry studying the nutritional qualities of blueberries, broccoli, oats, strawberries, pineapple and bananas. The program also has a major educational component: last summer we had 40 graduate, undergraduate and even a couple of high school students actively participating in nutritional research.
  2. The blueberry genome project at NC State has been under way for several years. Here I want to show when and what we have done on this project. Before 2011, we already had some Illumina short reads, which were only 36 to 76 base pairs long, and we also had long reads from 454 pyrosequencing. I came to the United States and joined this project in 2012. During these three years, we have done one more round of Illumina sequencing and created several versions of the assembly with different assemblers. Before this April, the Newbler assembly was the primary assembly we used for this project. After we got the BAC-end sequences from the BAC libraries, we were able to re-assemble the scaffolds, joining small pieces together, which is really difficult when you only have fragmented scaffolds. We now have a more recent SSPACE assembly that contains much larger scaffolds than the previous one.
  3. Of all the species in which anthocyanin biosynthesis has been studied well, grapevine (Vitis vinifera) is suggested to be the most closely related species (this was just published early this year), so here we used grape protein sequences as references to search the blueberry genome. The first step was to look up grape protein sequences in the KEGG (Kyoto Encyclopedia of Genes and Genomes) pathway database, then BLAST them against the blueberry scaffold data, which is stored on the Genome Database for Vaccinium website. From the BLAST results, we can tell which scaffolds contain putative genes and the start and stop points of those genes, and we also obtain the blueberry protein sequences.
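The last step in the note above, reading scaffold names and start/stop coordinates off the BLAST results, can be sketched as follows. This is a minimal illustration that assumes the search was run as a protein-versus-nucleotide BLAST with the standard 12-column tabular output (qseqid, sseqid, pident, length, mismatch, gapopen, qstart, qend, sstart, send, evalue, bitscore). The notes do not say which BLAST program or output format was used. The query names, scaffold names, numbers, and the E-value cutoff are all hypothetical.

```python
from typing import Dict, List, NamedTuple, Tuple


class Locus(NamedTuple):
    query: str      # grape protein used as the query
    scaffold: str   # blueberry scaffold containing the putative gene
    start: int      # leftmost coordinate on the scaffold
    end: int        # rightmost coordinate on the scaffold
    strand: str     # "-" when BLAST reports sstart > send
    evalue: float


def putative_loci(tabular_text: str, max_evalue: float = 1e-10) -> List[Locus]:
    """Return the best-scoring scaffold hit per query from BLAST tabular output."""
    best: Dict[str, Tuple[float, Locus]] = {}
    for line in tabular_text.splitlines():
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and comment headers
        cols = line.split("\t")
        query, scaffold = cols[0], cols[1]
        sstart, send = int(cols[8]), int(cols[9])
        evalue, bitscore = float(cols[10]), float(cols[11])
        if evalue > max_evalue:
            continue  # assumed cutoff; tune for the real data
        locus = Locus(
            query,
            scaffold,
            min(sstart, send),
            max(sstart, send),
            "+" if sstart <= send else "-",
            evalue,
        )
        if query not in best or bitscore > best[query][0]:
            best[query] = (bitscore, locus)
    return [locus for _, locus in best.values()]


# Hypothetical hits for three grape anthocyanin-pathway proteins.
example = (
    "VvCHS\tscaffold00012\t82.1\t390\t70\t0\t1\t390\t15200\t16369\t1e-150\t540\n"
    "VvCHS\tscaffold00440\t60.0\t120\t48\t0\t5\t124\t900\t1259\t2e-20\t95.0\n"
    "VvDFR\tscaffold00077\t78.5\t330\t71\t0\t3\t332\t48990\t48001\t3e-120\t430\n"
    "VvANS\tscaffold00301\t35.0\t60\t39\t0\t10\t69\t100\t279\t0.5\t30.1\n"
)

loci = putative_loci(example)
for locus in loci:
    print(locus.query, locus.scaffold, locus.start, locus.end, locus.strand)
```

A single best hit only approximates a gene's extent, since a multi-exon gene usually produces several separate alignments on the same scaffold. A fuller version would merge nearby hits per query before reporting start and stop points.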