Basic Aspects of Microarray Technology and Data Analysis (UEB-UAT Bioinformatics Course - Session 3.2 - VHIR, Barcelona)
Course: Bioinformatics for Biomedical Research (2014).
Session: 3.2- Basic Aspects of Microarray Technology and Data Analysis.
Statistics and Bioinformatics Unit (UEB) & High Technology Unit (UAT) from Vall d'Hebron Research Institute (www.vhir.org), Barcelona.

Published in: Science, Technology

  1. Bioinformàtica per a la Recerca Biomèdica. Ricardo Gonzalo Sanz (ricardo.gonzalo@vhir.org), 20/05/14. Hospital Universitari Vall d’Hebron, Institut de Recerca (VHIR); Institut d’Investigació Sanitària de l’Instituto de Salud Carlos III (ISCIII). Basic aspects of microarray technology
  2. Outline: Introduction; Different types of arrays (manufacturing; DNA/RNA/protein); Different types of Affymetrix arrays; Affymetrix microarray manufacture; Microarray experiment workflow; Quality controls
  3. 1 Introduction
     • reproducibility
     • only show you what you’re looking for
     • what about ‘indels’, inversions, translocations...
     • accuracy
     • sensitivity
  4. 1 Introduction
  5. 1 Introduction
     • RNA-Seq was superior in detecting low-abundance transcripts
     • It was also better at differentiating biologically critical isoforms
     • RNA-Seq demonstrated a broader dynamic range than microarrays
  6. 1 Introduction
     • Molecular biology offers many techniques to measure gene expression (e.g. Northern blot).
     • The main novelty of microarrays (Schena et al. (1995) Science 270:467-70) was not what could be measured, but the number of measurements that could be made simultaneously.
     • Before microarrays: genes were studied one by one.
     • After microarrays: all genes are studied together.
  7. 1 Introduction
     • But what is a microarray, in a few words?
       - DNA fixed to a solid surface (nylon, silica, glass, ...)
       - The RNA sample under study is labeled and must bind specifically to the DNA fixed on the solid surface.
       - The fixed DNA is usually called the “probe”.
       - The labeled RNA is usually called the “target”.
  8. 1 Introduction. Important to know in advance...
     • Microarrays are usually hypothesis-generating: they highlight specific genes or features that are particularly interesting for follow-up experiments. Biomarker discovery studies would be an exception.
     • This does not reduce the importance of experimental design.
  9. 2 Different types of arrays. Manufacturing. DNA/RNA. Two-color microarrays (cDNA)
     • Probes are usually long (typically hundreds of nucleotides for spotted cDNA)
     • The probe is fixed to a glass slide
     • Labeling uses two fluorochromes (Cy3/Cy5)
     • The two samples are compared directly because they are hybridized on the same array
     • Each gene appears only a few times on the array
     • Long probes facilitate cross-hybridization
     • Reproducibility is not very good
  10. 2 Different types of arrays. Manufacturing. DNA/RNA. One-color microarrays
     • Short probes (20-25 nt)
     • The target is labeled with a single fluorochrome
     • Only one sample is hybridized on each array
     • Each gene is represented by many probes on the array
  11. 2 Different types of arrays. Manufacturing. DNA/RNA
     • DNA: DNA polymorphism (GWAS), transcription factors, resequencing, cytogenetics
     • RNA: expression, alternative splicing, microRNA
  12. 2 Different types of Affymetrix arrays. 3’ IVT arrays
     • Biased measurement of gene expression
     • The most widely used array type in the literature; available for many species
     • Only transcripts with a poly(A) tail and a good 3’ end are amplified and have the chance to hybridize correctly
  13. 2 Different types of Affymetrix arrays. Gene/Exon arrays (figure panels: Gene arrays, Exon arrays)
     • Gene arrays are the most used (good quality/price ratio)
     • Gene Arrays 2.0 have a more up-to-date library and also include lncRNAs
  14. 2 Different types of expression arrays. miRNA
     • 153 organisms on the array (human, mouse, rat, canine, ...)
     • 100% coverage of miRBase v17
     • 2,216 human snoRNAs and scaRNAs (small nucleolar and small Cajal body-specific RNAs)
     • Low input amounts (130 ng total RNA)
     • 2,999 probe sets unique to pre-miRNA hairpins
     • Able to differentiate precursor and mature miRNAs
     • Useful for FFPE samples
  15. 2 Different types of expression arrays. HTA array
  16. 3 Affymetrix microarrays manufacture. Photolithography
  17. 3 Affymetrix microarrays manufacture
  18. 5 Microarray experiment workflow
  19. 5 Microarray experiment workflow
  20. 5 Microarray experiment workflow
  21. 6 Quality Controls
  22. 6 Quality Controls
  23. 6 Quality Controls. Length of amplified cRNA
  24. 6 Quality Controls. Length of fragmented cRNA
  25. Bioinformàtica per a la Recerca Biomèdica. Ricardo Gonzalo Sanz (ricardo.gonzalo@vhir.org), 20/05/14. Hospital Universitari Vall d’Hebron, Institut de Recerca (VHIR); Institut d’Investigació Sanitària de l’Instituto de Salud Carlos III (ISCIII). Basic aspects of microarray data analysis
  26. Outline: 1 Introduction. Experimental design; 2 Quality control; 3 Normalization; 4 Filtering; 5 Statistical inference of differential expression; 6 Clustering; 7 Annotation; 8 Biological interpretation
  27. 1 Introduction. Experimental design
  28. 1 Introduction. Experimental design
  29. 1 Introduction. Experimental design
  30. 1 Introduction. Experimental design
  31. 1 Introduction. Experimental design
  32. 1 Introduction. Experimental design. Microarray analysis workflow
  33. 2 Quality Control
  34. 2 Quality Control. Was the experiment a success?
     • Microarray experiments generate huge quantities of data
     • The standard statistical approach is to use plots to check quality. Plots:
       - show all data together
       - highlight structures
       - may help to detect problems (“unusual patterns”)
     • It is hard to decide whether things “seem to be all right” just by looking at the numbers.
  35. 2 Quality Control. Diagnostic plots for microarrays
     • Microarray data are usually considered at two levels:
       1. Low level: data coming directly from the scanner.
       2. High level: data processed from the low-level data, i.e. expression values, normalized or not.
     • Some plots are specific to certain array types or to one level.
  36. 2 Quality Control. Diagnostic plots for microarrays
     1. Low level:
       - Layout image
       - Degradation plots (only in 3’ IVT)
       - Histogram/density plots
       - PCA, boxplot
     2. High level:
       - MA plots
       - Model-based plots (NUSE, RLE)
       - PCA, boxplot
  37. 2 Quality Control. Diagnostic plots for microarrays. Low level. Layout image
  38. 2 Quality Control. Diagnostic plots for microarrays. Low level. RNA degradation plot (3’ IVT arrays)
  39. 2 Quality Control. Diagnostic plots for microarrays. Low level. Histogram/density plot
  40. 2 Quality Control. Diagnostic plots for microarrays. Low level. Boxplot
  41. 2 Quality Control
  42. 2 Quality Control. Diagnostic plots for microarrays. Low level. PCA
  43. 2 Quality Control. Diagnostic plots for microarrays. Low level. PCA
  44. 2 Quality Control
  45. 2 Quality Control. Diagnostic plots for microarrays. High level. RLE
  46. 2 Quality Control
  47. 2 Quality Control. Diagnostic plots for microarrays. High level. NUSE
  48. 2 Quality Control. Diagnostic plots for microarrays. High level. MA plots
     • MA plots allow pairwise comparison of the log-intensity of each array against a reference array, and the identification of intensity-dependent biases.
     • The Y axis shows “M”, the log-ratio of the intensity of one array to that of the reference (median) array. The X axis shows “A”, the average log-intensity of the two arrays.
     • Probe levels are not likely to differ much, so we expect the MA plot to be centered on the Y=0 line from low to high intensities.
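The M and A values described in this slide can be sketched in a few lines of Python. This is a minimal illustration, assuming positive raw intensities and a reference built as the probe-wise median of the log2 intensities across arrays; the function name `ma_values` and the toy data are hypothetical and not part of the course material.

```python
import math
from statistics import median


def ma_values(arrays, index):
    """Return (A, M) lists comparing arrays[index] with a median reference array.

    `arrays` is a list of equally long lists of positive intensities,
    one inner list per array (assumed layout, for illustration only).
    """
    logs = [[math.log2(v) for v in arr] for arr in arrays]
    # Reference array: median log-intensity of each probe across all arrays.
    reference = [median(col) for col in zip(*logs)]
    sample = logs[index]
    m = [s - r for s, r in zip(sample, reference)]        # M: log-ratio
    a = [(s + r) / 2 for s, r in zip(sample, reference)]  # A: average log-intensity
    return a, m


# Toy data: three arrays, four probes; the third array is twice as bright,
# so its M values sit near 1 instead of near 0.
toy = [
    [100.0, 200.0, 400.0, 800.0],
    [100.0, 200.0, 400.0, 800.0],
    [200.0, 400.0, 800.0, 1600.0],
]
a_vals, m_vals = ma_values(toy, 2)
```

Plotting `m_vals` against `a_vals` gives the MA plot for that array; a cloud that drifts away from M = 0 as A changes would suggest an intensity-dependent bias.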
  49. 2 Quality Control. Diagnostic plots for microarrays. High level. MA plots
  50. 2 Quality Control
  51. 3 Normalization. The goal of normalization is to adjust for effects that are due to variations in the technology rather than in the biology.
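The slides do not say which normalization method is used. As one common example of removing technical variation between arrays, the sketch below shows quantile normalization, which forces every array to share the same intensity distribution. The choice of method, the function name `quantile_normalize` and the toy values are assumptions made for illustration.

```python
def quantile_normalize(arrays):
    """Quantile-normalize a list of equally long intensity lists (one per array)."""
    n_probes = len(arrays[0])
    sorted_arrays = [sorted(arr) for arr in arrays]
    # Target distribution: mean of the k-th smallest value across arrays.
    target = [
        sum(arr[k] for arr in sorted_arrays) / len(arrays)
        for k in range(n_probes)
    ]
    normalized = []
    for arr in arrays:
        # Probe indices ordered by intensity within this array
        # (ties are broken by probe position in this simple sketch).
        order = sorted(range(n_probes), key=lambda i: arr[i])
        out = [0.0] * n_probes
        for rank, probe in enumerate(order):
            out[probe] = target[rank]
        normalized.append(out)
    return normalized


raw = [
    [5.0, 2.0, 3.0, 4.0],
    [4.0, 1.0, 4.5, 2.0],
    [3.0, 4.0, 6.0, 8.0],
]
norm = quantile_normalize(raw)
```

After normalization every array contains exactly the same set of values, while each probe keeps its rank within its own array.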
  52. 3 Normalization
  53. 3 Normalization
  54. 3 Normalization
  55. 4 Filtering
     • In a microarray experiment, only a few hundred or a few thousand genes change their expression between conditions.
     • The researcher wants to keep the number of tests (genes) as low as possible while retaining the interesting genes in the selected subset.
     • If the truly differentially expressed genes are over-represented among those selected in the filtering step, the FDR associated with a given threshold of the test statistic is lowered by the filtering.
     • Genes that do not change introduce noise, so it is better to exclude them before the statistical analysis.
  56. 4 Filtering. Different types of filtering exist:
     • Annotation features (specific):
       - Specific gene features (e.g. GO term, presence of transcriptional regulatory elements in promoters, etc.)
       - Data derived from IPA
     • Signal features (non-specific):
       - Percentage of intensities greater than a user-defined value
       - Interquartile range (IQR) greater than a defined value
  57. 4 Filtering. Signal filtering: the premise of this technique is to remove genes that are deemed not expressed or unchanged according to a specific criterion under the user's control.
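The two signal features listed above (fraction of intensities above a cutoff, and IQR above a cutoff) can be sketched as follows. This is only an illustration: requiring both criteria at once, the thresholds, the function names and the log2-scale toy values are all assumptions, not the course's actual settings.

```python
from statistics import quantiles


def iqr(values):
    """Interquartile range (Q3 - Q1) of a list of expression values."""
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    return q3 - q1


def signal_filter(expr, min_intensity, min_fraction, min_iqr):
    """Keep genes that are expressed in enough samples AND vary enough.

    `expr` maps a gene name to its expression values across samples.
    """
    kept = {}
    for gene, values in expr.items():
        fraction_above = sum(v > min_intensity for v in values) / len(values)
        if fraction_above >= min_fraction and iqr(values) > min_iqr:
            kept[gene] = values
    return kept


# Hypothetical log2 expression values for three genes across four samples.
expr = {
    "geneA": [7.0, 7.1, 9.5, 9.8],  # expressed and variable
    "geneB": [8.0, 8.1, 8.0, 8.1],  # expressed but essentially unchanged
    "geneC": [2.0, 2.1, 4.9, 5.2],  # variable but mostly below the cutoff
}
filtered = signal_filter(expr, min_intensity=5.0, min_fraction=0.5, min_iqr=0.5)
```

With these toy thresholds only `geneA` survives: `geneB` fails the IQR criterion and `geneC` fails the intensity criterion.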
  58. 5 Statistical inference of differential expression
     • Indirect comparisons: 2 groups, unpaired
     • Direct comparisons: 2 groups, paired
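To make the paired/unpaired distinction concrete, the sketch below computes ordinary t statistics for a single gene under both designs. It is a hedged illustration only: it uses the classical Welch and paired t statistics rather than the moderated statistics of the limma package, and the function names and expression values are hypothetical.

```python
import math
from statistics import mean, stdev


def unpaired_t(x, y):
    """Welch t statistic for two independent groups of samples."""
    se = math.sqrt(stdev(x) ** 2 / len(x) + stdev(y) ** 2 / len(y))
    return (mean(x) - mean(y)) / se


def paired_t(x, y):
    """Paired t statistic: x[i] and y[i] come from the same individual."""
    diffs = [a - b for a, b in zip(x, y)]
    return mean(diffs) / (stdev(diffs) / math.sqrt(len(diffs)))


# Hypothetical log2 expression of one gene in four individuals,
# measured in treated and control conditions.
treated = [8.1, 9.0, 7.6, 8.7]
control = [7.6, 8.4, 7.1, 8.3]

t_unpaired = unpaired_t(treated, control)
t_paired = paired_t(treated, control)
```

In this toy example the paired statistic is much larger than the unpaired one, because pairing removes the variation between individuals from the comparison.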
  59. 5 Statistical inference of differential expression. limma package (Gordon Smyth)
  60. 5 Statistical inference of differential expression
  61. 5 Statistical inference of differential expression
  62. 5 Statistical inference of differential expression
  63. 5 Statistical inference of differential expression
  64. 6 Clustering. Types:
     • Supervised clustering tries to find the best partition for data that belong to a known set of classes.
     • Unsupervised clustering tries to define the number and size of the classes into which the transcription profiles can be fitted.
  65. 6 Clustering
  66. 6 Clustering. Hierarchical Clustering (HCL)
     • HCL is an agglomerative or divisive clustering method.
     • The iterative process continues until all groups are connected in a hierarchical tree.
     • Samples that are more similar to each other are placed closer together.
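The agglomerative variant of the iterative process can be sketched as below: start with one cluster per sample and repeatedly merge the two closest clusters until a single tree remains. Euclidean distance, average linkage, the function names and the sample profiles are assumptions chosen for illustration; the slides do not specify them.

```python
import math
from itertools import combinations


def euclidean(a, b):
    """Euclidean distance between two expression profiles."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def average_linkage(c1, c2, profiles):
    """Mean pairwise distance between the members of two clusters."""
    dists = [euclidean(profiles[i], profiles[j]) for i in c1 for j in c2]
    return sum(dists) / len(dists)


def agglomerate(profiles):
    """Return the merge history as (cluster_a, cluster_b, distance) tuples."""
    clusters = [(name,) for name in profiles]  # one cluster per sample
    merges = []
    while len(clusters) > 1:
        # Find the pair of clusters with the smallest linkage distance.
        a, b = min(
            combinations(clusters, 2),
            key=lambda pair: average_linkage(pair[0], pair[1], profiles),
        )
        merges.append((a, b, average_linkage(a, b, profiles)))
        clusters = [c for c in clusters if c not in (a, b)] + [a + b]
    return merges


# Hypothetical expression profiles (three genes) for four samples.
samples = {
    "ctrl1": [1.0, 1.0, 5.0],
    "ctrl2": [1.2, 0.9, 5.1],
    "trt1": [4.0, 3.8, 1.0],
    "trt2": [4.1, 4.0, 0.8],
}
history = agglomerate(samples)
```

Here the two controls merge first, then the two treated samples, and the final merge joins both groups at a much larger distance, which is what a dendrogram of these samples would display.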
  67. 6 Clustering
  68. 7 Annotation
  69. 8 Biological interpretation. Gene Ontology
  70. 8 Biological interpretation
