- 1. Analysis of Gene Expression Data. Jhoirene B. Clemente, Algorithms and Complexity Lab, University of the Philippines Diliman
- 2. Overview
  - Definitions
  - Clustering of Gene Expression Data
  - Visualizations of Gene Expression Data
- 3. Definitions
  - Gene: the basic unit of heredity in a living organism. It is normally a stretch of DNA that codes for a type of protein or for an RNA chain that has a function in the organism.
  - Gene Expression Data: the expression levels of genes in an individual, measured with a microarray.
- 4. Definitions
- 5. Definitions
- 6. Definitions: Gene Expression Data. For a single sample, each gene (a, b, c, ..., n) has one expression value. Measuring n genes across m samples gives an (n x m) data matrix, with one row per gene and one column per sample (Sample 1, ..., Sample m).
- 10. Clustering: Clustering is the unsupervised classification of patterns (observations, data sets, and feature vectors) into groups called clusters, such that objects in the same cluster are similar to each other while objects in different clusters are as dissimilar as possible.
- 12. Cluster Analysis
  - Preprocessing
    - Filtering
    - Normalization
  - Clustering
  - Analysis
- 13. Clustering
  - Partitional
    - K-means Algorithm
    - X-means Algorithm
  - Hierarchical
- 14. Clustering: given the (n x m) data matrix, we can
  - cluster the set of genes
  - cluster the set of samples
  - cluster the set of genes and samples simultaneously
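
A minimal sketch of the first two options, assuming the matrix is stored as a list of rows with one row per gene. The toy values and the names `expression`, `gene_vectors`, and `sample_vectors` are hypothetical. Clustering genes treats each row as an object; clustering samples treats each column as an object, which amounts to transposing the matrix first.

```python
# Hypothetical toy matrix: rows are genes, columns are samples.
expression = [
    [0.9, 0.8, -0.7],   # gene a
    [0.1, 0.0, 0.2],    # gene b
    [-0.5, -0.6, 0.4],  # gene c
    [0.8, 0.9, -0.6],   # gene d
]


def transpose(matrix):
    """Swap rows and columns so that samples become the objects."""
    return [list(column) for column in zip(*matrix)]


gene_vectors = expression               # n = 4 objects, each with m = 3 values
sample_vectors = transpose(expression)  # m = 3 objects, each with n = 4 values
```

Either list of vectors can then be handed to the same clustering routine.
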
- 15. Data Set: The data set is time-series gene expression data from a synchronized population of yeast.
- 17. Preprocessing
  - Filtering
    - Removed genes not involved in cell cycle regulation
    - Removed genes belonging to more than one group
  - Normalization
    - All gene expression values range from -1.0 to 1.0.
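
The slides do not say how the values were brought into the range -1.0 to 1.0. One common way, shown here purely as an assumption, is to rescale each gene's profile linearly so that its minimum maps to -1.0 and its maximum to 1.0. The function name `scale_to_unit_range` and the sample values are hypothetical.

```python
def scale_to_unit_range(profile):
    """Linearly rescale one gene's expression profile to [-1.0, 1.0]."""
    lo, hi = min(profile), max(profile)
    if hi == lo:
        # A flat profile has no shape to preserve; map it to zeros.
        return [0.0 for _ in profile]
    return [2.0 * (x - lo) / (hi - lo) - 1.0 for x in profile]


raw_profile = [2.0, 4.0, 6.0, 10.0]
scaled = scale_to_unit_range(raw_profile)
```
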
- 18. Data Set: Data matrix (384 genes and 17 samples) with 5 classifications. Groupings are based on cell cycle phase activation.
- 19. Data Set, Group 1: Resting Phase
- 20. Data Set, Group 2: First Growth Phase
- 21. Data Set, Group 3: Synthesis Phase
- 22. Data Set, Group 4: Second Growth Phase
- 23. Data Set, Group 5: Cell Division
- 24. Clustering of genes: K-means Algorithm. Given n data points in R^d:
  1. Assign k initial centers of the k clusters.
  2. Assign all the data points to the nearest cluster (Euclidean distance, Manhattan distance, etc.).
  3. Adjust the k centers.
  4. Repeat steps 2 and 3 until convergence.

  k = 5, since we want to approximate the 5 classifications.
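
The four steps can be sketched in plain Python as follows. This is an illustrative sketch, not the implementation used in the presentation. It assumes Euclidean distance, assumes that "adjust the k centers" means moving each center to the mean of its assigned points, and takes convergence to mean that the assignments stop changing. The names `k_means`, `euclidean`, and `mean_point` and the toy points are hypothetical.

```python
import math


def euclidean(p, q):
    """Euclidean distance between two points of equal dimension."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


def mean_point(points):
    """Coordinate-wise mean of a non-empty list of points."""
    return [sum(coords) / len(points) for coords in zip(*points)]


def k_means(points, centers, max_iter=100):
    """k-means sketch; `centers` holds the k initial centers (step 1)."""
    centers = [list(c) for c in centers]
    labels = None
    for _ in range(max_iter):
        # Step 2: assign every point to its nearest center.
        new_labels = [
            min(range(len(centers)), key=lambda j: euclidean(p, centers[j]))
            for p in points
        ]
        if new_labels == labels:
            break  # Step 4: assignments no longer change.
        labels = new_labels
        # Step 3: move each center to the mean of its assigned points.
        for j in range(len(centers)):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # An empty cluster keeps its previous center.
                centers[j] = mean_point(members)
    return labels, centers


points = [[0.0, 0.0], [0.0, 1.0], [4.0, 4.0], [4.0, 5.0]]
labels, centers = k_means(points, centers=points[:2])
```

For the yeast data described earlier, `points` would hold the 384 gene profiles and `centers` would hold 5 initial centers.
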
- 26. Clustering of genes: Initialization
  1. Choose the first k centers that will maximize the distance between the clusters.
  2. Sort the distances between all the data points, then choose the k initial points at constant intervals from the sorted list.
  3. Use the first k points in the data set as the first k centers.
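
Two of these strategies can be sketched briefly. `first_k` follows strategy 3 directly. `farthest_first` is only one possible reading of strategy 1 (an assumption): it starts from the first point and greedily adds the point farthest from the centers chosen so far. Strategy 2 is left out because the slide does not specify which distances are sorted. All function names and the toy points are hypothetical.

```python
import math


def euclidean(p, q):
    """Euclidean distance between two points of equal dimension."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


def first_k(points, k):
    """Strategy 3: use the first k points of the data set as centers."""
    return [list(p) for p in points[:k]]


def farthest_first(points, k):
    """Assumed reading of strategy 1: greedy farthest-first selection.

    Starts from the first point, then repeatedly adds the point whose
    distance to its nearest already-chosen center is largest.
    """
    centers = [list(points[0])]
    while len(centers) < k:
        best = max(points, key=lambda p: min(euclidean(p, c) for c in centers))
        centers.append(list(best))
    return centers


points = [[0.0, 0.0], [0.0, 1.0], [4.0, 4.0], [4.0, 5.0]]
simple_centers = first_k(points, k=2)
spread_centers = farthest_first(points, k=2)
```

On this toy data, `first_k` picks two neighbouring points, while `farthest_first` picks one point from each visible group, which illustrates why the choice of initial centers can matter.
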
- 27. Clustering of genes: using k-means clustering, with k = 5
- 28. Clustering of genes
  - Clustering may suggest possible roles for genes with unknown functions.
  - Clustering the samples or experiments may shed light on new subtypes of diseases.
  - It can help identify which type of treatment is suited to a specific type of cancer.
  - It can support building genetic networks.
- 29. Visualization
  - Vector Fusion
  - Non-metric Multidimensional Scaling (nMDS)
  - Principal Components Analysis (PCA)
- 30. Vector fusion: a visualization technique that uses the single point broken line parallel algorithm
- 31. nMDS visualization. Input: a dissimilarity matrix (actual distances $d_{ij}$).
  - In nMDS, only the rank order of the entries is assumed to contain the significant information.
  - Thus, the purpose of the non-metric MDS algorithm is to find a configuration of points whose distances reflect as closely as possible the rank order of the data.
  - The transformation uses a non-parametric function f (monotone regression): $\hat{d}_{ij} = f(d_{ij})$, where $\hat{d}_{ij}$ is the pseudo-distance.
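
The slide names monotone regression but not the algorithm used to compute it. A standard choice, assumed here, is the pool-adjacent-violators procedure, which finds the non-decreasing sequence closest (in least squares) to a given sequence. In the sketch below the input is a list of distances already arranged in the rank order of the dissimilarities, and the output plays the role of the pseudo-distances. The function name `monotone_regression` and the sample values are hypothetical.

```python
def monotone_regression(values):
    """Pool-adjacent-violators: least-squares non-decreasing fit."""
    blocks = []  # Each block is [sum of values, number of values].
    for v in values:
        blocks.append([v, 1])
        # Merge while the previous block's mean exceeds the last block's mean.
        while len(blocks) > 1 and (
            blocks[-2][0] / blocks[-2][1] > blocks[-1][0] / blocks[-1][1]
        ):
            total, count = blocks.pop()
            blocks[-1][0] += total
            blocks[-1][1] += count
    fitted = []
    for total, count in blocks:
        fitted.extend([total / count] * count)
    return fitted


# Hypothetical distances, listed in increasing order of the corresponding
# input dissimilarities (only that rank order is used).
distances_in_rank_order = [1.0, 3.0, 2.0, 4.0]
pseudo_distances = monotone_regression(distances_in_rank_order)
```

The second and third distances violate the rank order, so they are pooled into their average; the others are left unchanged.
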
- 32. PCA
- 33. Vector fusion visualization
- 34. nMDS visualization
- 41. References
  - Clemente J., Salido J.A. (2010). "Non-Metric Multidimensional Scaling and Vector Fusion Visualization of Cell Cycle Independent Gene Expressions for Gene Function Analysis". Published in the conference proceedings of the National Conference on Information Technology for Education (NCITE) 2010 and the Philippine IT Journal, Feb 2011 issue.
  - Clemente J. (2010). "Cluster Analysis for Identifying Genes Highly Correlated with a Phenotype". Undergraduate thesis, Department of Computer Science, University of the Philippines Diliman.
- 42. Thank you for Listening
