Ernesto Picardi – Bioinformatics and comparative genomics: new experimental and computational strategies for the production and analysis of NGS data aimed at developing innovative processes and products for human health, the environment, and the agri-food sector

Published in: Health & Medicine, Technology

  1. Bioinformatics and comparative genomics: new experimental and computational strategies for the production and analysis of NGS data aimed at developing innovative processes and products for human health, the environment, and the agri-food sector.
     Ernesto Picardi, University of Bari, IBBE-CNR
     http://www.pesolelab.it/
     ernesto.picardi@uniba.it
     BiP-Day 5/12/2013
  2. Next-Generation Sequencing
     Several NGS platforms are now available on the market, and they are progressively improving in terms of throughput, cost efficiency, and time efficiency.
     - Ion Proton
     - Roche / 454 Genome Sequencer FLX titanium (800 bp, 800 Mb / run)
     - PacBio
     - Illumina / Solexa Genetic Analyzer HiSeq 2000 (150x2 bp, 600 Gb / run)
     - Applied Biosystems SOLiD 4 System™ (100x2 bp, 400 Gb / run)
  3. Worldwide distribution of NGS platforms
     >2558 total machines (>1000 US, >300 China, .., >50 Italy)
     Source: omicsmap.com
  4. Our first experience
     RNA-Seq reads:
     - Roche 454 (GS-20): >120000 contigs (assembled reads from 4 tissues)
     - Illumina/Solexa: > 200 M reads (from 2 tissues)
     - SOLiD: > 320 M reads (from 4 tissues)
     Jaillon et al. 2007 Nature
  5. Available computing facilities at UNIBA/IBBE-CNR
     HP Proliant Server
     - 256 GB RAM
     - 40 cores
     - 2 RAIDs (24 + 36 TB)
     HP Proliant Server
     - 2 TB RAM
     - 80 cores
     - 2 RAIDs (36 + 48 TB)
     - 4 GPUs
     Tape library
     3 Apple Xserve
     - 16x3 GB RAM
     - RAID 3 TB
     External Partners
  6. Available NGS platforms at UNIBA/IBBE
     Illumina MiSeq
     - > 50 M reads/run
     - Paired 2x300
     Illumina HiSeq1500
     - > 300 GB/run
     - Paired 2x100
     • DNA sequencing (DNA-Seq)
       - Genome resequencing (SNPs, CNV, GWAS)
       - De novo sequencing
       - Identification of genome structural variants (cancer genome)
       - Epigenomics (chromatin state and genome methylation)
       - Metagenomics (microbiota analysis of clinical/environmental samples)
     • RNA sequencing (RNA-Seq)
       - Qualitative and quantitative analysis of the transcriptome
       - Identification and characterization of miRNAs and other ncRNAs
       - RNA editing
       - Metatranscriptomics (functional analysis of environmental samples)
  7. DNA sequencing: variant analysis
     Production of DNA-Seq reads for the identification of SNPs or indels in normal and/or pathological conditions, and development of dedicated software.
     Workflow: QC (quality control) -> Align to reference genome -> Variant (SNP & INDEL) calling -> Variant annotation -> Functional validation
     Projects:
     - Epstein-Barr genotyping in multiple sclerosis (30 samples, 0.2 M reads/sample)
     - A "family trio" for variants linked to the Majewski syndrome-like disease (>260 M reads Illumina)
     - Somatic mutations in human tissues (3 donors / 6 tissues) (18 WXSs, >1080 M reads Illumina)
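The first step of the workflow above (QC) can be illustrated with a minimal sketch. It assumes FASTQ input with Phred+33 quality encoding and a mean-quality cutoff of 20; the function names and the cutoff are illustrative assumptions, not the software mentioned on the slide.

```python
# Minimal QC sketch: keep reads whose mean Phred quality meets a cutoff.
# Assumptions (not from the slides): Phred+33 encoding, cutoff of 20,
# four-line FASTQ records, hypothetical function names.

def mean_phred(quality, offset=33):
    """Mean Phred score of a FASTQ quality string."""
    return sum(ord(ch) - offset for ch in quality) / len(quality)

def parse_fastq(lines):
    """Yield (name, sequence, quality) from four-line FASTQ records."""
    lines = [ln.rstrip("\n") for ln in lines if ln.strip()]
    for i in range(0, len(lines) - 3, 4):
        yield lines[i][1:], lines[i + 1], lines[i + 3]

def quality_filter(records, min_mean_q=20):
    """Return the records whose mean quality is at least min_mean_q."""
    return [rec for rec in records if rec[2] and mean_phred(rec[2]) >= min_mean_q]

# Toy input: read1 has high quality ('I' = Q40), read2 has low quality ('!' = Q0).
FASTQ_EXAMPLE = [
    "@read1", "ACGT", "+", "IIII",
    "@read2", "ACGT", "+", "!!!!",
]
kept = quality_filter(parse_fastq(FASTQ_EXAMPLE))
print([name for name, _, _ in kept])
```

Production pipelines usually also trim adapters and low-quality read ends before alignment; this sketch only shows the whole-read filter.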
  8. DNA sequencing: de novo sequencing
     Using the Illumina MiSeq platform we have sequenced >20 full prokaryotic genomes, generating on average 2 M reads per sample.
     Workflow: Illumina MiSeq reads -> Read assembling -> Contigs
     In collaboration with:
     - Dr. Parisi (Istituto Zooprofilattico)
     - Prof. Palmieri (UNIBA-IBBE)
     - Dr. Ceci (IBBE)
     - Prof. Golyshin (Bangor Univ. UK)
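To put the figure of 2 M reads per sample in context, the expected sequencing depth can be estimated as coverage = (number of reads × read length) / genome size. The sketch below assumes 300-base reads (the MiSeq paired 2x300 mode listed earlier), counts each mate as one read, and uses a hypothetical 5 Mb prokaryotic genome; the genome size is an assumption for illustration, not a value from the slides.

```python
# Rough depth-of-coverage estimate for a de novo sequencing run.
# Assumptions (not from the slides): 5 Mb genome, every read is full length,
# and "2 M reads" counts individual reads rather than read pairs.

def expected_coverage(n_reads, read_length, genome_size):
    """Average number of times each genome position is sequenced."""
    if genome_size <= 0:
        raise ValueError("genome_size must be positive")
    return n_reads * read_length / genome_size

depth = expected_coverage(n_reads=2_000_000, read_length=300, genome_size=5_000_000)
print(f"{depth:.0f}x")
```

Real coverage is lower after quality trimming and removal of duplicates, so the result is best read as an upper bound.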
  9. DNA sequencing: sequencing of mtDNA
     Assembly of full mtDNAs in human by WXS and WGS experiments.
     In collaboration with: Prof. Attimonelli
     Picardi and Pesole 2012 Nature Methods
  10. DNA sequencing: metagenomics
     Deep sequencing of environmental samples to functionally characterize the biodiversity (no need for laboratory cultivation and/or isolation of individual specimens).
     BioMaS (Bioinformatic analysis of Metagenomic ampliconS)
     http://webgateway.ba.infn.it:9999/
     Workflow: Raw Data -> Merged and Denoised Sequences -> Dereplicated Sequences -> DB matching -> TANGO -> Specimen assessment
     In collaboration with: Prof. Valiente (Catalogna Univ.)
     Projects:
     - Correlation between invasive species and the microbial composition (> 15 M reads)
     - Microbiota variation in colorectal cancer (> 125 M reads)
     - Development of bioinformatic tools
     MICROMAP
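The dereplication step in the workflow above collapses identical amplicon sequences into unique sequences with abundance counts, so that database matching runs once per distinct sequence. The following is a minimal sketch of that idea, assuming exact full-length matching after upper-casing; it is not the BioMaS implementation, and the function name is hypothetical.

```python
from collections import Counter

# Dereplication sketch: collapse identical sequences and record abundance.
# Assumptions (not from the slides): exact matching, case-insensitive,
# output sorted by decreasing abundance and then alphabetically.

def dereplicate(sequences):
    """Return (sequence, count) pairs for the unique sequences."""
    counts = Counter(seq.strip().upper() for seq in sequences if seq.strip())
    return sorted(counts.items(), key=lambda item: (-item[1], item[0]))

unique = dereplicate(["ACGT", "TTGA", "acgt", "ACGT"])
print(unique)
```

Keeping the counts matters because downstream taxonomic assignment typically weights each unique sequence by its abundance.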
  11. RNA sequencing: RNA-Seq analysis
     Development of dedicated computational tools to automate the analysis of RNA-Seq samples.
     - Transcriptome variations in 6 ALS samples (>120 M reads Illumina/sample)
     - Transcriptome variations in 6 human tissues (>150 M reads Illumina/sample)
     - Transcriptome variations in Alzheimer disease (15 samples) (>150 M reads Illumina/sample)
  12. RNA sequencing: miRNA-Seq analysis
     Production of miRNA-Seq data and development of ad hoc tools to analyse reads in multiple experimental conditions.
     - miRNA expression in 6 ALS samples (>5 M reads Illumina/sample)
     - miRNA expression in 6 human tissues (>5 M reads Illumina/sample)
     - miRNA expression in aging and Alzheimer disease (15 samples) (>3 M reads/sample)
     Dr David Horner
  13. RNA sequencing: RNA editing
     Development of the first toolkit to detect RNA editing by massive sequencing data (RNA-Seq, DNA-Seq and miRNA-Seq).
     Picardi and Pesole 2013 Bioinformatics
     Figure (two blocks of reads r1-r5 aligned to a gDNA reference):
       First block (consensus A at the highlighted site):
         r1  GGGTGCCTTTATGCAGCAAGGATGCGATATT
         r2  GGGTGTCTTTATGCAGCAAGGATGCGATACTTCGC
         r3  GGGTGCCTTTATGCAGCAAGGATGCGATATTTCG
         r4  GGGTGCCTTTATGCAGCAAGGATGCGATATTTCG
         r5  GGGTGCCTTTATGCAGCAAGGATGCGATATTTCG
       gDNA  TGGGTGCCTTTATGCAGCAAGGATGCGATATTTCGCC
       Second block (consensus G at the highlighted site):
         r1  GGGTGCCTTTATGCGGCAAGGATGCGATATT
         r2  GGGTGTCTTTATGCAGCAAGGATGCGATACTTCGC
         r3  GGGTGCCTTTATGCGGCAAGGATGCGATATTTCG
         r4  GGGTGCCTTTATGCGGCAAGGATGCGATATTTCG
         r5  GGGTGCCTTTATGCGGCAAGGATGCGATATTTCG
     Projects:
     - RNA editing in ALS (RNA-Seq: >120 M reads/sample; WXS: > 30 M reads/sample)
     - RNA editing in human tissues (RNA-Seq: >120 M reads/sample; WGS: > 300 M reads/sample)
     - RNA editing in Alzheimer disease, Fluidigm tech. and Illumina Seq. (>2000x per editing candidate)
     Italy – Israel Actions
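The read alignment on this slide suggests the basic logic of editing detection: a site is a candidate when the genomic reads uniformly support the reference base while the RNA reads carry a different base. The sketch below applies that logic to the sequences shown on the slide. It assumes the first block of reads is genomic and the second is RNA-Seq, that all reads align ungapped starting at the second reference base, and it uses illustrative thresholds; it is a simplified illustration, not the toolkit cited on the slide.

```python
from collections import Counter

# RNA editing candidate sketch using the reads shown on the slide.
# Assumptions (not from the slides): first block = DNA reads, second block =
# RNA reads, ungapped alignment at reference offset 1, min coverage 3,
# min RNA variant frequency 0.1.

REFERENCE = "TGGGTGCCTTTATGCAGCAAGGATGCGATATTTCGCC"
OFFSET = 1  # reads start at the second base of the reference

DNA_READS = [
    "GGGTGCCTTTATGCAGCAAGGATGCGATATT",
    "GGGTGTCTTTATGCAGCAAGGATGCGATACTTCGC",
    "GGGTGCCTTTATGCAGCAAGGATGCGATATTTCG",
    "GGGTGCCTTTATGCAGCAAGGATGCGATATTTCG",
    "GGGTGCCTTTATGCAGCAAGGATGCGATATTTCG",
]
RNA_READS = [
    "GGGTGCCTTTATGCGGCAAGGATGCGATATT",
    "GGGTGTCTTTATGCAGCAAGGATGCGATACTTCGC",
    "GGGTGCCTTTATGCGGCAAGGATGCGATATTTCG",
    "GGGTGCCTTTATGCGGCAAGGATGCGATATTTCG",
    "GGGTGCCTTTATGCGGCAAGGATGCGATATTTCG",
]

def pileup(reads, offset, ref_len):
    """Count the bases observed at each reference position."""
    columns = [Counter() for _ in range(ref_len)]
    for read in reads:
        for i, base in enumerate(read):
            if offset + i < ref_len:
                columns[offset + i][base] += 1
    return columns

def editing_candidates(reference, dna_reads, rna_reads, offset,
                       min_cov=3, min_freq=0.1):
    """Return (1-based position, DNA base, RNA base, RNA frequency) tuples."""
    dna = pileup(dna_reads, offset, len(reference))
    rna = pileup(rna_reads, offset, len(reference))
    hits = []
    for pos, ref_base in enumerate(reference):
        dna_cov = sum(dna[pos].values())
        rna_cov = sum(rna[pos].values())
        if dna_cov < min_cov or rna_cov < min_cov:
            continue  # not enough evidence on one side
        if dna[pos][ref_base] != dna_cov:
            continue  # genomic reads disagree: likely a SNP, not editing
        for base, count in sorted(rna[pos].items()):
            if base != ref_base and count / rna_cov >= min_freq:
                hits.append((pos + 1, ref_base, base, count / rna_cov))
    return hits

candidates = editing_candidates(REFERENCE, DNA_READS, RNA_READS, OFFSET)
print(candidates)
```

Read r2 differs from the reference at two other positions in both blocks; because the genomic reads are not unanimous there, those positions are treated as genomic variation and excluded, which is why matched DNA data is useful alongside RNA-Seq.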
  14. Other NGS-related experiences
     - Gene expression, in collaboration with Prof. Giorgino (UNIBA)
     - Gene/exon expression in renal carcinoma
     OMNIA LH75: automated qPCR for validation of molecular biomarkers
  15. Acknowledgments
     - PRIN2009
     - PRIN2010
     - PRIN2013
     - MICROMAP
     - CNR-Aging program
     - http://www.arisla.org/
     - http://www.epigen.it/
     - Italy – Israel Actions