Solanaceae Genomics Resource

Brett R. Whitty, C. Robin Buell, Michigan State University,
Department of Plant Biology, East Lansing MI 48824

Collectively, the Solanaceae (a family which includes Potato, Tomato, Tobacco, Pepper, Eggplant and Petunia) are a valuable component of U.S. agriculture. The major Solanaceae crop species share both sequence identity and gene order, thereby providing the basis for leveraging genomic resources across taxa. Transcriptome and genome sequencing projects have been initiated for the major crop species; however, none of the three genome initiatives (potato, tomato, tobacco) has yet released a high-quality, finished, complete genome sequence to the public. Thus, it is essential that all of the partial Solanaceae transcript and genome sequence be integrated at the family level and linked to other model dicot species to provide contextual information on the putative function of Solanaceae homologs. In this project, we are working to identify putative orthologs, paralogs, and lineage-specific genes within the Solanaceae to facilitate intra- and inter-species comparisons. We also identify homologs of Solanaceae species within three dicot species (Arabidopsis, Poplar, Grapevine) to permit leveraging resources from these model species to the Solanaceae. We are working to generate comparative analyses, alignments, views, and displays of the Solanaceae. Overall, we provide a robust and integrated comparative genomics resource that permits broad and deep data-mining of Solanaceae sequences by the community.

This project was initiated January 1, 2008, and we continue to update project data quarterly and to develop additional resources and tools for the Solanaceae community.
It is supported by the National Research Initiative (NRI) Plant Genome Program of the USDA National Institute of Food and Agriculture (NIFA) (2008-35300-18671).
All project data is made available through our web site: http://solanaceae.plantbiology.msu.edu
Project email address: sgr@plantbiology.msu.edu

Model Dicot Comparative Genome Databases

Alignments of Solanaceae transcript assemblies against model dicot (Arabidopsis, Grapevine, Poplar) genomic and polypeptide sequence are available for display and search in a Gbrowse database.

Potato, Tomato and Tobacco Draft Genomes

Our analyses and databases include all public data releases from the three genome sequencing efforts in the Solanaceae. We obtain data as it is released to GenBank from the International Tomato Genome Sequencing Project, which includes gene models annotated by the project members, and the International Potato Genome Sequencing Consortium (of which we are members), which to date has released assemblies with no gene annotations. Our annotation and analysis pipeline provides gene models for genes present on these assemblies, supplementary to any previously annotated gene models present in the public data.

A Community Resource

As a component of our project we aim to provide a web portal that, in addition to presenting results from our comparative analyses, acts as a unified repository for genomic and transcriptomic data and related bioinformatic resources for the Solanaceae, and thereby improves the accessibility of this data to the Solanaceae community.

Annotation/Analysis Pipelines

We retrieve all publicly available Solanaceae genomic sequences from GenBank, and the sequences are run through the GMOD MAKER gene annotation pipeline to provide a common set of evidence-supported gene model predictions; these supplement the models previously annotated (if any) on the public assemblies. Our transcriptomic analyses are performed on transcript assemblies generated by PlantGDB (PUTs).

Some of the analyses we perform on genomic and transcriptomic sequence include:
• Ortholog/paralog prediction by best hit and OrthoMCL clustering
• SSR identification in transcript and genome sequence, and generation of primers (using Primer3)
• Identification of putative SNPs in transcript assemblies
• Alignment of PlantGDB-assembled Solanaceae transcripts (PUTs) to the genomic sequence using exonerate
• Alignment of UniProt's SwissProt & UniRef protein databases to the genomic sequence using exonerate
• BLASTP of Solanaceae gene models against model dicot proteomes (Arabidopsis, Grapevine, Poplar)
• InterProScan search on the models to identify functional domains
• Repeat feature prediction (using RepeatMasker)
• ncRNA feature prediction (using tRNAscan-SE and RNAmmer)

Integrated and Accessible Data

Available sequence data, analysis results, and tools for species in the Solanaceae are presented in centralized views on the project site to aid users in applying these resources in their research. At the genome level, our species overview page consolidates available sequence data, genome information and resources, and lists available analysis results and tools. At the transcript level, our gene overview page presents a summary of gene information and analyses, such as BLAST results, computationally predicted SNPs, SSRs, orthology/paralogy, and links transcripts to other site resources including our genome browsers.
  
Solanaceae Comparative Genome Database

Our database contains annotation and comparative data for all public Solanaceae genomic sequence assemblies. We currently use the GMOD Generic Genome Browser (Gbrowse) to facilitate the web-based display and searching of our annotation and comparative analyses.

Potato Genome Sequencing Consortium Potato Draft Genome Browser

As members of the Potato Genome Sequencing Consortium we are hosting the public Potato genome browser. Presently, the doubled monoploid Solanum phureja DM1-3 516R44 (CIP801092) v3.2 genome assembly and annotation are online. Visit http://potatogenome.net for details on this draft genome release. In the genome browser all aligned Solanaceae transcript assemblies are linked to the full set of resources associated with those assemblies provided by the Solanaceae Genomics Resource site.

Upcoming Features for 2010/2011

We expect that the finished Potato and Tomato genomes will be released to the public sequence databases in the coming months. At that time, we will integrate the complete genomes into our existing resources, and will make available additional tools and analysis results; one of these new tools will be a genome synteny viewer.
We have produced a significant amount of RNA-Seq data from our participation in the Potato Genome Sequencing Consortium (PGSC, http://potatogenome.net) and the Solanaceae Coordinated Agricultural Project (SolCAP, http://solcap.msu.edu), and when publicly released it will be incorporated into the Solanaceae Genomics Resource databases and tools. This data will greatly expand our existing SNP database tool, and we will provide new tools for the query and display of expression data.

2010-09-03.Whitty_B.Solanaceae_Genomics_Resource.poster
