BITS - Genevestigator to easily access transcriptomics data
These are the presentation slides of the BITS training session about 'Genevestigator'.

Many thanks to Nebion for contributing these slides.

Usage Rights: CC Attribution-ShareAlike License

BITS - Genevestigator to easily access transcriptomics data: Presentation Transcript

  • GENEVESTIGATOR TUTORIAL VIB - Gent 12.04.2011
  • Goals
    – Understand what Genevestigator is and why it has been developed
    – Understand the function of the tools provided by the software
    – Learn how to use Genevestigator to find genes of interest
  • Content
    – Microarray technology
    – Concept of Genevestigator
    – Data curation
    – Tools: Meta-profile analysis, Biomarker search, RefGenes, Clustering analysis
  • Microarray technology
    Advantages:
    – Genome wide
    – Relatively cheap
    – Standardized streamlined handling
    – Use of an optimized system based on oligonucleotide sequences
    – Possibility to store data in publicly available repositories
    Disadvantages:
    – Sequence must be known in advance
    – Hybridization reaction
  • Workflow of a microarray experiment
    – Conditions selection and experiments
    – RNA extraction, amplification and labelling
    – Hybridization on chips
    – Scanned raw image (DAT file)
    – Raw data, probe level (CEL file)
    – Quality control and normalization
    – Normalized data (TXT file)
    – Analysis, submission to repository, validation (Q-PCR)
    Each pixel intensity is determined by the expression level of a gene in the specific sample hybridized on the array.
  • Concept of Genevestigator
    – Thousands of microarray experiments exist world-wide (tissue type 1, tissue type 2, … tissue type 200)
    – => Summarize information from thousands of public experiments into easily interpretable results
    – Figure: model of a summarized output
  • Concept of Genevestigator
    Build a systematic database of gene expression information
    – Data repositories -> meta-analysis?
    – Data repositories -> curation (data quality control; expert annotation with systematic ontologies: anatomy, development, condition, genotype) -> Genevestigator -> meta-analysis!
  • 1. Data Curation - Overview
    – Quality control all sample data
    – Collect raw data files and normalize data
    – Read and understand the experiment
    – Manually annotate experiments using structured vocabularies (ontologies)
    Diagram: quality control + normalization; expert annotation with systematic ontologies (anatomy, development, condition, genotype)
    Final goal of curation: translate experimental information into a computer-readable and "statistically usable" form
  • Curation: Quality control
    – Unprocessed probe intensity
    – RNA degradation plots
    – Probe-level analysis (RLE, NUSE)
    – Border element analysis
    – Array-array correlation plots
  • Curation: normalization models
    Multi-array models
    – e.g. dChip, RMA, gcRMA
    – all arrays from an experiment are normalized simultaneously
    – cannot easily be used to create large databases
    – RMA and gcRMA use perfect-match information only (background estimation by statistical approaches)
    Single-array models
    – e.g. MAS5
    – normalize each array independently
    – do not correct for biases between experiments
    – MAS5 uses both perfect-match and mismatch probe information (mismatch is used to model background: a biochemical approach)
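The contrast between the two model families can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions, not the actual MAS5 or RMA algorithm: `scale_single_array` stands in for a single-array model (each array is rescaled on its own to a hypothetical target mean), and `quantile_normalize` stands in for the quantile step used by multi-array models such as RMA (all arrays are processed together). Both function names and the target value are made up for this sketch.

```python
from statistics import mean


def scale_single_array(intensities, target=100.0):
    """Single-array style: rescale one array so that its mean equals `target`.

    Only the array itself is needed, so arrays can be added one at a time.
    """
    factor = target / mean(intensities)
    return [x * factor for x in intensities]


def quantile_normalize(arrays):
    """Multi-array style: give every array the same intensity distribution.

    All arrays (equal length, one value per probe) must be available at once.
    """
    n = len(arrays[0])
    sorted_arrays = [sorted(a) for a in arrays]
    # Reference distribution: mean of the values sharing the same rank.
    rank_means = [mean(s[i] for s in sorted_arrays) for i in range(n)]
    result = []
    for a in arrays:
        order = sorted(range(n), key=lambda i: a[i])
        normalized = [0.0] * n
        for rank, idx in enumerate(order):
            normalized[idx] = rank_means[rank]
        result.append(normalized)
    return result
```

In this sketch, adding one more array to `quantile_normalize` changes the reference distribution and therefore every previously normalized value, which hints at why multi-array models are harder to use for a continuously growing database.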
  • Curation: Ontologies
    Ontologies built for
    – Anatomical parts
    – Stages of development
    – Perturbations (diseases, chemicals, etc.)
    Ontologies (version 2008)
    – Were compiled from various public ontology sources and own developments
    – Are built using tree structures
    Figures: Anatomy ontology (Arabidopsis, Rice, Barley); Development ontology (Mouse)
  • Curation: Meta-profiles
    sample meta-data and expression data -> summarized results: [space], [time], [response], [response]
  • Curation: Data content
    – Total: 1’742 / 54’786
    – As of December 2010: > 54’000 Affymetrix arrays
    – World’s largest standardized, quality controlled, and manually annotated gene expression compendium for plants, animals, and microorganisms!
  • Genevestigator application
    – Database and analysis engine
    – Website with user support
    – Analysis tool for the user
    Requirements
    – Browser: Genevestigator works in Internet Explorer, Firefox, Safari, Opera, and Chrome
    – Java (Sun Microsystems): Java 1.4.2 or higher
    – Computer: 500 MB RAM or more
  • Toolsets
  • Analytical approach 1: genes -> which conditions?
    – Anatomy [space]
    – Development [time]
    – Condition / Genotype [response]
  • Meta-Profile Analysis
    1. Choose an organism
    2. Enter the genes you wish to work with
  • Meta-Profile Analysis tools
    View and interpret the results across:
    – Anatomical categories (Anatomy tab)
    – Developmental stages (Development tab)
    – Chemicals, diseases, tumors, etc. (Conditions tab)
    – Genetic modifications (Genotype tab)
    – Tumors (Neoplasm tab, only for Human)
  • Note: Select by experiment or annotation
  • Meta-Profile Analysis: Anatomy tool
    – Looks at how genes are expressed in different tissues
    – Mean and standard deviation
    – Anatomy categories as a tree (ontology); expand / collapse
    – Number of arrays per category is indicated
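The per-category summary shown by the Anatomy tool (number of arrays, mean, standard deviation) can be illustrated with a small Python sketch. It assumes a flat list of `(category, expression value)` pairs for one gene; the function name and the sample data are hypothetical, and the tree structure of the ontology is left out.

```python
from collections import defaultdict
from statistics import mean, stdev


def summarize_by_category(samples):
    """Return {category: (number of arrays, mean, standard deviation)}.

    `samples` is an iterable of (category, expression_value) pairs for one gene.
    """
    grouped = defaultdict(list)
    for category, value in samples:
        grouped[category].append(value)
    summary = {}
    for category, values in grouped.items():
        # A standard deviation needs at least two arrays; report 0.0 otherwise.
        sd = stdev(values) if len(values) > 1 else 0.0
        summary[category] = (len(values), mean(values), sd)
    return summary


if __name__ == "__main__":
    demo = [("root", 10.0), ("root", 12.0), ("leaf", 5.0)]
    for category, (n, avg, sd) in summarize_by_category(demo).items():
        print(f"{category}: n={n}, mean={avg:.2f}, sd={sd:.2f}")
```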
  • Meta-Profile Analysis: Neoplasm tool
    – Looks at how genes are expressed in different tumors
    – Clinical parameters of the tumors are available
    – Mean and standard deviation
    – Anatomy categories as a tree (ontology); expand / collapse
    – Number of arrays per category is indicated
    Figure: expression profile of NPY across different tumor types
  • Meta-Profile Analysis: Development tool
    – Looks at how genes are expressed during the life cycle of an organism
    – Example for barley
    – Example for mouse / rat
  • Meta-Profile Analysis: Conditions and Genotype tools
    – List (or tree) of various conditions
    – Spots indicate the responses of selected gene(s) to the list of conditions
    – Figure labels: most upregulating conditions; most downregulating conditions
  • Meta-Profile Analysis: Scanner tool
    – All arrays are represented on a single screen
    – Easily find and select experiments in which expression is particularly high (screen for peaks)
    – The magnifying glass and tooltips let you look into the details of signals, arrays, and experiments
  • Meta-Profile Analysis: Samples tool
    – All arrays are represented in a single plot; scroll down to see them all
    – Look at expression level and “absent / present” calls
    – Tooltips let you look into the details of arrays and experiments
  • Analytical approach 2: conditions -> which genes?
    – Anatomy [space]
    – Development [time]
    – Conditions / Genotypes [response]
  • Biomarker search
    1. Choose an organism
    2. Choose conditions and run analysis
    3. Save target genes for further analysis
  • Biomarker Search
    Identify genes that exhibit specific expression characteristics
    – Anatomy
    – Development
    – Conditions / Genotype
  • Classical biomarker search
    – Most biomarker search approaches look for the genes which respond the most to a given condition
    – This condition may include multiple similar studies
    – How these genes respond to other conditions is unknown, because they were not included in the analysis
    Figure: matrix of gene 1 to gene 17 against condition 1 to condition 17
  • Biomarker validation in Genevestigator
    – Genevestigator makes it possible to find out how specific these genes are (Meta-Profile Analysis -> Stimulus/Mutation tools)
    – Only a few are responsive only to condition 9 (black arrows). All others are sensitive to one (grey arrows) or more other conditions.
    Figure: matrix of gene 1 to gene 17 against condition 1 to condition 17
  • Biomarker Search in Genevestigator
    – The Genevestigator Biomarker Search tools identify genes that are specifically responsive to the chosen condition (they respond minimally to other conditions)
    – These genes are not necessarily the ones with the strongest response to the chosen condition
    – The Genevestigator Biomarker Search tools usually find other target candidates than classical tools, which analyze only a subset of experiments
    Figure: genes reordered by specificity (gene 3, gene 5, gene 7, gene 13, gene 17, gene 10, gene 2, gene 15, gene 9, gene 12, gene 4, gene 11, gene 16, gene 1, gene 6, gene 8, gene 14) against condition 1 to condition 17
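The difference between "strongest response" and "most specific response" can be made concrete with a short Python sketch. The slides do not describe the scoring used by Genevestigator, so the score below (absolute response to the target condition minus the largest absolute response to any other condition) is purely an assumption for illustration, as are the function names and the data layout.

```python
def specificity_score(responses, target):
    """Hypothetical score: target response minus the strongest off-target response.

    `responses` maps condition name -> response (e.g. a log ratio) for one gene.
    """
    off_target = [abs(v) for cond, v in responses.items() if cond != target]
    return abs(responses[target]) - max(off_target, default=0.0)


def rank_by_specificity(gene_responses, target):
    """Order genes from most to least specific for the target condition."""
    return sorted(
        gene_responses,
        key=lambda gene: specificity_score(gene_responses[gene], target),
        reverse=True,
    )


def rank_by_strength(gene_responses, target):
    """Classical ranking: strongest response to the target condition first."""
    return sorted(
        gene_responses,
        key=lambda gene: abs(gene_responses[gene][target]),
        reverse=True,
    )
```

With two made-up genes, one responding strongly to the target and to another condition, the other responding moderately to the target only, the two rankings disagree: `rank_by_strength` puts the first gene on top, `rank_by_specificity` the second.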
  • Biomarker Search in Genevestigator
    – Imagine extending this to a much wider set of conditions…
    – You may find other conditions to which the set of genes respond
    Figure: gene 1 to gene 17 against condition 1 to condition 75, with the target condition and the other conditions to which the genes are responding marked
  • Biomarker Search: example
    Search for genes that are associated with a set of conditions, e.g. how do abiotic stresses relate to hormonal responses?
    Figure labels (hormonal responses and abiotic stresses): BL / H3BO3 (+), ABA (+), ---, ABA (+), MeJA (+), ethylene (+), anoxia (-), salt (+), salt (-), salt (+), salt (+), hypoxia (-), hypoxia (-), osmotic (+), osmotic (-), osmotic (+), drought (+), cold (+)
  • Biomarker Search in Genevestigator
    Example: human genes responsive to Actinomycin-D
    – Target condition(s): Actinomycin-D
    – Co-inducing conditions: vMyb, oncolytic herpes simplex virus, Propiconazole, Sapphyrin, Echinomycin, cell cycle inhibition, chemical: ARC
  • RefGenes
    – Goal: identify reference genes for use in qPCR.
    – Solution: search the Genevestigator database for genes that show constant expression in a certain category of arrays.
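The idea of searching for genes with constant expression in a chosen category of arrays can be sketched as follows. The slides do not say which stability measure RefGenes uses; the sample standard deviation used here is an assumption, and the function name and data layout are hypothetical.

```python
from statistics import stdev


def most_stable_genes(expression, top_n=3):
    """Return the `top_n` genes with the smallest spread across the given arrays.

    `expression` maps gene name -> list of expression values, one per array
    from the category of interest (at least two arrays per gene).
    """
    spread = {gene: stdev(values) for gene, values in expression.items()}
    return sorted(spread, key=spread.get)[:top_n]


if __name__ == "__main__":
    liver_arrays = {
        "g1": [10.0, 10.1, 9.9],
        "g2": [5.0, 9.0, 13.0],
        "g3": [8.0, 8.5, 7.5],
    }
    print(most_stable_genes(liver_arrays, top_n=2))
```

Restricting `expression` to arrays from the biological context under study is the point of the approach: a gene that is stable in one category of arrays may vary in another.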
  • RefGenes: validation experiment with mouse liver
    – geNorm selection of the most stable reference genes within this experiment
    – Dataset: 197 arrays from mouse liver
  • Clustering Analysis
    Goal: to identify groups of genes that have similar expression characteristics
    Tools:
    – Hierarchical clustering (with leaf ordering)
    – Biclustering (BiMax algorithm)
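To show what grouping genes by similar expression characteristics means in practice, here is a minimal agglomerative (hierarchical) clustering sketch in plain Python. The distance measure (1 minus Pearson correlation) and the average linkage are assumptions chosen for illustration, since the slides do not specify them; leaf ordering and biclustering are not covered. Profiles are assumed to be non-constant, otherwise the correlation is undefined.

```python
from itertools import combinations
from math import sqrt


def pearson_distance(x, y):
    """1 - Pearson correlation: near 0 for similar profiles, near 2 for opposite ones."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return 1.0 - cov / (sx * sy)


def average_linkage(profiles, n_clusters):
    """Merge the two closest clusters until `n_clusters` remain.

    `profiles` maps gene name -> list of expression values across conditions.
    """
    clusters = [[gene] for gene in profiles]

    def distance(c1, c2):
        # Average pairwise distance between members of the two clusters.
        pairs = [(a, b) for a in c1 for b in c2]
        total = sum(pearson_distance(profiles[a], profiles[b]) for a, b in pairs)
        return total / len(pairs)

    while len(clusters) > n_clusters:
        i, j = min(
            combinations(range(len(clusters)), 2),
            key=lambda ij: distance(clusters[ij[0]], clusters[ij[1]]),
        )
        merged = clusters[i] + clusters[j]
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters.append(merged)
    return clusters
```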
  • Biclustering
    – Search for biclusters in a list of 64 genes responsive to myocardial infarction
    – One of many possible biclusters
    – Development profile of these 7 genes
  • Advantages of using Genevestigator
    – Benefit from the normalized data from 54’000 arrays on 12 organisms
    – Extended and precise gene search according to: Anatomy, Development, Stimulus / Mutation
    – Find genes which might be interesting for further study
    – Gain further information about specific gene sets
    – Find appropriate reference genes for the conditions you study
    – Rapidly compare, validate and extend data
  • QUESTIONS?
  • Supplementary Slides
  • Select Genes
  • Problems with classical reference genes
    – Most groups use common housekeeping genes such as β-Actin or GAPDH to normalize qPCR data
    – Depending on the condition studied, these genes are themselves regulated to some degree and are therefore unsuitable
    – Hypothesis: for each biological context, there is a subset of genes that are most suitable to normalize expression data from this context.
  • Summary
  • Affymetrix GeneChip® Scan
  • Affymetrix GeneChip® scanned image
    – Scanned raw image (DAT file)
    – Raw data, probe level (CEL file)
    – Quality control and normalization
    – Normalized data (TXT file)
    – Into repository
    Each pixel intensity is determined by the expression level of a gene in the specific sample hybridized on the array.