HTS data analysis



A talk on high-throughput RNAi and compound screening data analysis given at the Finnish Institute for Molecular Medicine (FIMM), March 13, 2008, 12.30-17.30.



  1. VTT MEDICAL BIOTECHNOLOGY: Analysis of HTS data
     • FIMM & Biomedicum Medical Bioinformatics Day, March 13, 2008, 12.30-17.30
     • Pekka Kohonen, VTT Medical Biotechnology, FIMM
  2. Presentation overview
     1. The high-throughput screening workflow
     2. Design considerations in the screens
        • Which genes to assay: biological question at hand
        • Sources of error in the screens:
          – Biological/technical variance (negative controls)
          – Transfectability of the cells (positive controls)
          – Off-target effects (redundancy and replication)
     3. RNAi screening data normalization
     4. Hit picking and prioritization
     5. New technologies: Cell Arrays and Lysate Arrays
     6. Integration of data from other sources
     7. High Throughput Screening database (HTSdb)
        • Combines multiple assays and platforms
        • Plate based, lysate arrays, cell arrays, supporting data (GE, aCGH)
        • Based on R/MySQL
        • "First Light" recently
  3. Screening work-flow
     • Biological question
     • Biological assay
     • Reagents: libraries of siRNAs, miRNAs, compounds
     • Primary screens
     • Replicating hits
     • Data integration with gene expression, aCGH, other screens (cancer/normal)
     • Investigation of pathways targeted, literature mining
     • Secondary screening
     • Prioritized hits for further validation
  4. Flow-through of a high-throughput screen in 384 wells (figure label: 384 well plates)
     1) Pipet diluted siRNAs
     2) Add transfection agent
     3) 35 µl of trypsinized cell suspension
     4) Incubate 72 hrs
     5) Add cell phenotype stains & incubate
     6) Fluorescence measurement & data analysis
  5. Design considerations: Off-target effects
     • Non-sequence-specific off-target effects:
       – Interferon response
       – siRNA causing miRNA machinery saturation
       – Lipid toxicity
     • Specific:
       – Effects on related mRNAs
       – miRNA mechanism based off-target effects
     • Off-target effects are usually cell line and siRNA specific
     • The best way to mitigate them is to have 2-4 siRNAs per gene
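
The slide recommends 2-4 siRNAs per gene but does not say how the redundancy is scored. One plausible reading, sketched here as an assumption, is to call a gene a hit only when several independent siRNAs against it agree. The function name, the Z-score threshold of -2 and the minimum of two concordant siRNAs are all hypothetical choices, and the gene and siRNA names are placeholders.

```python
from collections import defaultdict


def genes_with_concordant_hits(sirna_scores, threshold=-2.0, min_sirnas=2):
    """Return genes for which at least `min_sirnas` distinct siRNAs score
    at or below `threshold` (more negative = stronger growth inhibition).

    sirna_scores: iterable of (gene, sirna_id, z_score) tuples.
    The threshold and min_sirnas defaults are illustrative assumptions.
    """
    scoring = defaultdict(set)
    for gene, sirna_id, z in sirna_scores:
        if z <= threshold:
            scoring[gene].add(sirna_id)
    return sorted(g for g, ids in scoring.items() if len(ids) >= min_sirnas)


# Placeholder data: GENE_A is supported by two siRNAs, GENE_B by only one,
# so GENE_B is treated as a possible off-target effect.
scores = [
    ("GENE_A", "siA1", -3.1),
    ("GENE_A", "siA2", -2.4),
    ("GENE_A", "siA3", -0.2),
    ("GENE_B", "siB1", -4.0),
    ("GENE_B", "siB2", 0.3),
]
print(genes_with_concordant_hits(scores))
```

A single strongly scoring siRNA, as for GENE_B here, is filtered out because an siRNA-specific off-target effect would look the same in the primary data.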
  6. RNAi screening data normalization: Edge-effects and B-score normalization
     • [Figure panels: raw data showing an edge effect; B-score normalized data after removal of the edge effect]
     • Edge effect is seen especially with the Cell Titer Blue (CTB) reagent
     • Edge effect causes a lowered signal intensity at the edges
     • In the B-score normalization, estimates of row/column effects are obtained using a two-way median polish (Brideau et al., J Biomol Screen. 2003)
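
A minimal sketch of the B-score idea, assuming a single plate held as a list of rows: a two-way median polish removes row and column effects, and the residuals are divided by the plate's median absolute deviation (MAD). This is not the code used for the talk. The function names, the fixed ten polish iterations and the toy 4 x 6 plate are assumptions, and the 1.4826 consistency factor that some implementations apply to the MAD is left out.

```python
import random
from statistics import median


def median_polish_residuals(plate, n_iter=10):
    """Two-way median polish: alternately subtract row medians and column
    medians, returning what is left after row/column effects are removed."""
    resid = [list(row) for row in plate]
    n_cols = len(resid[0])
    for _ in range(n_iter):
        for row in resid:
            row_med = median(row)
            for j in range(n_cols):
                row[j] -= row_med
        for j in range(n_cols):
            col_med = median(row[j] for row in resid)
            for row in resid:
                row[j] -= col_med
    return resid


def b_scores(plate):
    """Residuals from the median polish scaled by the plate MAD."""
    resid = median_polish_residuals(plate)
    flat = [v for row in resid for v in row]
    centre = median(flat)
    mad = median(abs(v - centre) for v in flat)
    if mad == 0:
        raise ValueError("MAD is zero; residuals have no spread")
    return [[v / mad for v in row] for row in resid]


# Toy plate (assumed values): lowered signal on the edge rows and columns,
# small random noise, and one strongly growth-inhibited well at row 1, col 2.
rng = random.Random(0)
n_rows, n_cols = 4, 6
plate = []
for i in range(n_rows):
    row = []
    for j in range(n_cols):
        edge = (-20 if i in (0, n_rows - 1) else 0) + (-10 if j in (0, n_cols - 1) else 0)
        row.append(100 + edge + rng.gauss(0, 2))
    plate.append(row)
plate[1][2] -= 50

scores = b_scores(plate)
strongest = min(
    ((i, j) for i in range(n_rows) for j in range(n_cols)),
    key=lambda ij: scores[ij[0]][ij[1]],
)
print(strongest, round(scores[1][2], 1))
```

Because medians are used throughout, the one strongly inhibited well barely shifts the estimated row and column effects, so it still stands out after the edge effect has been removed.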
  7. Functional screens are used to define the effects of the siRNAs on cell proliferation
     • [Figure: raw CTB data, normalised data, and cell proliferation hits from the screens]
  8. In red: siRNAs that cause growth inhibition
     • [Figure: Z score (growth inhibition) per siRNA, one panel each for Cell Line 1 and Cell Line 2]
     • [Figure: Cell Line 1 against Cell Line 2 (Z score: growth inhibition), marking the common anti-proliferative hits]
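
The Z-score comparison in these plots can be sketched as follows. This is a simplified illustration, not the analysis behind the figures: it uses the plain mean and sample standard deviation of each screen, the cutoff of -1.5 is a hypothetical value (the slide gives none), and the siRNA names and signal values are made up.

```python
from statistics import mean, stdev


def z_scores(values):
    """Standard Z scores: (value - mean) / sample standard deviation."""
    mu, sd = mean(values), stdev(values)
    return [(v - mu) / sd for v in values]


def common_hits(ids, line1, line2, threshold=-1.5):
    """siRNAs whose Z score is at or below `threshold` in both cell lines.
    The threshold is an illustrative assumption."""
    z1, z2 = z_scores(line1), z_scores(line2)
    return [i for i, a, b in zip(ids, z1, z2) if a <= threshold and b <= threshold]


# Made-up normalized signals for ten siRNAs in two cell lines.
ids = [f"si{n:02d}" for n in range(1, 11)]
cell_line_1 = [100, 102, 98, 101, 99, 100, 103, 97, 40, 100]
cell_line_2 = [100, 99, 101, 35, 100, 102, 98, 100, 45, 101]

print(common_hits(ids, cell_line_1, cell_line_2))
```

In this toy data si04 inhibits growth only in the second cell line, so it is not reported as a common anti-proliferative hit.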
  9. Cell Titer Blue (CTB) growth inhibition screens (blue means growth inhibited)
     • [Figure by Pasi Halonen, columns: Parental, Variant_1, Variant_2]
     • Labelled groups: siRNAs hitting preferentially the parent cell line; siRNAs hitting the variant_1 cell line; siRNAs hitting the parental cell line; pan-hitting siRNAs
  10. I TECHNOLOGY INTRODUCTION - TRANSFECTION CELL ARRAYS (by Juha Rantala)
      • Up to 46 000 spots with different individual siRNA transfections in a single assay plate
      • Arrays with cells growing only on arrayed spots
      • System allows low cost uHTS with minimal infrastructure requirements
      • Has five measurement channels for visualization of different antibodies and stains
  11. Image analysis will be a bioinformatics challenge for the cell array technology
      1. Imaging
      2. Automated image analysis
         • Image based cytometry: 10,000s of images from each experiment, requiring terabytes for storage
         • Analysis of antibody staining/organelle stains (DNA, ACTIN, Antibody 1, Antibody 2, + Antibody 3?)
      3. Result classification by morphology, intensity, localisation, number etc.
  12. II Cell lysate microarrays for multiple end-point analysis (by Rami Mäkelä)
      • Figure labels: Pre-miR transfections; siRNA transfections; lysates from cultured cell lines; protein lysates; multiple protein microarray slides
      • Phenotype markers:
        – Proliferation: Ki-67, Cyclin E, Histone H3
        – Apoptosis: Caspase-3, PARP, Histone H2AX
        – Cell cycle: Cyclins D, E, A, B1, p-HistoneH3(Ser10)
        – EMT: E-cadherin, Vimentin, Beta-catenin
        – Targets & pathways: p53, c-Myc, Met
      • Signal quantification and analysis of functional effects
  13. Integration of data from other sources (by Henrik Edgren)
      • Two cell lines: GE+siRNA
        – [Figure axes: siRNA growth inhibition difference; expression ratio to parental]
        – Increased gene expression and greater siRNA growth inhibition
      • One cell line: GE+siRNA+aCGH
        – Gene amplification, siRNA growth inhibition and gene expression increase
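
As a rough illustration of the two-cell-line comparison, the sketch below joins gene expression ratios and siRNA growth inhibition differences on a shared gene identifier and keeps genes that are both more highly expressed and more strongly inhibited in the variant line. The function name, both cutoffs, the sign convention (a negative difference means stronger inhibition in the variant) and the gene names are assumptions made for the example, not values from the talk.

```python
def overexpressed_and_dependent(expr_ratio, inhibition_diff,
                                min_ratio=2.0, max_diff=-1.0):
    """Genes present in both data sets with an expression ratio to parental
    of at least `min_ratio` and a growth inhibition difference of at most
    `max_diff`. Both cutoffs are illustrative assumptions."""
    shared = expr_ratio.keys() & inhibition_diff.keys()
    return sorted(
        g for g in shared
        if expr_ratio[g] >= min_ratio and inhibition_diff[g] <= max_diff
    )


# Placeholder data keyed by gene identifier.
expr_ratio = {"GENE_A": 4.2, "GENE_B": 1.1, "GENE_C": 3.0, "GENE_D": 5.0}
inhibition_diff = {"GENE_A": -2.3, "GENE_B": -2.0, "GENE_C": 0.1, "GENE_E": -3.0}

print(overexpressed_and_dependent(expr_ratio, inhibition_diff))
```

Genes measured in only one of the two data sets (GENE_D and GENE_E here) drop out of the join, which is why a common identifier scheme matters when combining sources.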
  14. High Throughput Screening Database (HTSdb): Multiple Assays of the same Model System
      • Plate based: CTB, CellTiter-Glo™, ApoOne™, luciferase assays
      • Lysate arrays: up to 3 channels, multiple endpoints, use of ratios
      • Cell Arrays: up to 5 channels, uHTS (10000's), improved repeatability, use ratios for normalization
      • Supporting Data: gene expression, aCGH, miRNA expression
  15. HTSdb Design Principles
      • Pragmatic: focused on analysis needs
      • Extensible to new data sources, normalizations and sample annotation terms
      • Different assays done on the same biological samples can be combined (e.g. CTB, ApoOne, Lysate Arrays)
      • Other data sources (gene expression, miRNA expression) can be combined with screening data
      • MySQL open source database
      • R statistical programming language is used to access the database and to analyze the data
      • Bioconductor R-libraries are used when applicable
      • Ensembl: all identifiers are linked to Ensembl genes
      • "First Light" recently: data input, normalization and retrieval
  16. Database Structure
      • Annotations of reagents: siRNA, miRNA, compounds
      • Data: raw and normalized
      • Screen annotations
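
The three-part structure named on this slide could look roughly like the sketch below. It is not the HTSdb schema: every table and column name is hypothetical, and Python's built-in sqlite3 module stands in for the R/MySQL combination the talk describes so that the example runs without a database server.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Reagent annotations (siRNA, miRNA, compounds), linked to a gene identifier
CREATE TABLE reagent (
    reagent_id   INTEGER PRIMARY KEY,
    kind         TEXT,
    name         TEXT,
    ensembl_gene TEXT
);
-- Screen annotations: which cell line and assay a screen used
CREATE TABLE screen (
    screen_id INTEGER PRIMARY KEY,
    cell_line TEXT,
    assay     TEXT
);
-- Data: raw and normalized values per reagent per screen
CREATE TABLE measurement (
    screen_id        INTEGER REFERENCES screen(screen_id),
    reagent_id       INTEGER REFERENCES reagent(reagent_id),
    raw_value        REAL,
    normalized_value REAL
);
""")

# Placeholder rows: one siRNA measured with two assays on the same cell line.
conn.execute("INSERT INTO reagent VALUES (1, 'siRNA', 'siA1', 'GENE_A')")
conn.executemany(
    "INSERT INTO screen VALUES (?, ?, ?)",
    [(1, "Parental", "CTB"), (2, "Parental", "ApoOne")],
)
conn.executemany(
    "INSERT INTO measurement VALUES (?, ?, ?, ?)",
    [(1, 1, 812.0, -2.5), (2, 1, 1540.0, 1.8)],
)

# Combine different assays of the same reagent through the shared gene identifier.
rows = conn.execute(
    """
    SELECT s.assay, r.name, m.normalized_value
    FROM measurement AS m
    JOIN reagent AS r ON r.reagent_id = m.reagent_id
    JOIN screen  AS s ON s.screen_id  = m.screen_id
    WHERE r.ensembl_gene = ?
    ORDER BY s.assay
    """,
    ("GENE_A",),
).fetchall()
print(rows)
```

Keeping annotations separate from measurements is what lets different assays on the same biological sample be pulled together with a single join.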
  17. VTT Medical Biotechnology, Turku, Finland (CONFIDENTIAL)
      • Groups: Canceromics, Biochips, High-throughput screening
      • People: Matthias Nees, Elmar Bucher, Henrik Edgren, Kalle Ojala, Sami Kilpinen, John-Patrick Mpindi, Petri Saviranta, Tommi Pisto, Rami Mäkelä, Merja Perälä, Pekka Tiikkainen, Juha Rantala, Pekka Kohonen, Arttu Heinonen, Henri Sara, Niko Sahlberg, Maija Wolf, Pasi Halonen, Suvi-Katri Leivonen, Harri Siitari, Saija Haapa-Paananen, Vidal Fey, Olli Kallioniemi