Azhar Ali Shah @ Interdisciplinary Optimization and Decision Making Journal Club (IODMJC), March 20, 2009
Overview
Azhar A Shah Efficient algorithms for accurate hierarchical clustering of huge datasets: tackling the entire protein space /31
Introduction: authors
Introduction: Hierarchical Clustering
Introduction: Hierarchical Clustering
Introduction: about the topic
There is no guideline for selecting the best linkage method. In practice, people almost always use average linkage.
UPGMA (Unweighted Pair Group Method using arithmetic Averages)
Scalable to large datasets as it requires only O(1) edges in memory, BUT highly susceptible to outliers!
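To make the choice of linkage method concrete, the following minimal Python sketch computes the three most common linkage distances between two clusters. The function name `linkage_distances` and the toy 1-D points are illustrative assumptions and do not come from the slides.

```python
from itertools import product


def linkage_distances(cluster_a, cluster_b, dist):
    """Return single, complete and average linkage between two clusters."""
    pairwise = [dist(a, b) for a, b in product(cluster_a, cluster_b)]
    return {
        "single": min(pairwise),                   # closest pair
        "complete": max(pairwise),                 # farthest pair
        "average": sum(pairwise) / len(pairwise),  # mean over all pairs (UPGMA)
    }


# Hypothetical 1-D points; the last point of B lies far from the rest.
A = [0.0, 1.0]
B = [2.0, 3.0, 10.0]
d = linkage_distances(A, B, lambda x, y: abs(x - y))
print(d)
```

Average linkage uses every pair, which is why a full UPGMA implementation needs all pairwise dissimilarities to be available.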
Introduction: UPGMA
Introduction: UPGMA - Sparse input
N = 11 input singletons (vertices): {1,2,3,4,11,12,13,14,21,22,23}, with 14 edges in the sparse input.
The input is considered sparse because not all pairs are given, e.g. there is no edge between 1 and 22.
Clusters 1,2,3,4 form a clique A.
Clusters 11,12,13,14 are missing the edge <11,14> needed to form clique B.
Clusters 21,22,23 are loosely connected to each other and to the cluster of clique A.
In total there are two connected components in the input graph: {1,2,3,4,21,22,23} (producing 6 merges for 7 vertices) and {11,12,13,14} (producing 3 merges for 4 vertices). The result is therefore a forest of two disjoint trees, rather than the full tree of N-1 = 10 merges.

UPGMA-input (one edge per row: weight, vertex, vertex)
90       23  1
70       23  22
50       22  21
30       14  13
20       14  12
12       13  12
11       13  11
1e+01    12  11
4e-10    4   3
1e-50    4   2
1e-80    3   2
2e-40    4   1
1e-40    3   1
1e-100   2   1

UPGMA-tree (one merge per row: new cluster, merge height, child, child)
32   99.167     31  26
31   85         29  23
30   50         28  14
29   50         22  21
28   11.5       27  13
27   10         12  11
26   1.33e-10   25  4
25   5e-41      24  3
24   1e-100     2   1
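The listing above can be reproduced with a short Python sketch of average linkage on a sparse edge list. This is a naive illustration that keeps every edge in memory, not the memory-constrained algorithm of the paper. The value `PSI = 100` for missing edges is an assumption inferred from the tree: merging {11,12,13} with 14 gives (30 + 20 + PSI) / 3 = 50 only if PSI = 100. The function name `sparse_upgma` is likewise hypothetical.

```python
# Edges from the UPGMA-input listing: (weight, vertex, vertex).
EDGES = [
    (90, 23, 1), (70, 23, 22), (50, 22, 21), (30, 14, 13), (20, 14, 12),
    (12, 13, 12), (11, 13, 11), (10, 12, 11), (4e-10, 4, 3), (1e-50, 4, 2),
    (1e-80, 3, 2), (2e-40, 4, 1), (1e-40, 3, 1), (1e-100, 2, 1),
]
PSI = 100.0  # ASSUMPTION: weight assigned to missing edges, inferred from the tree


def sparse_upgma(edges, psi):
    """Naive average linkage on a sparse edge list.

    Returns merges as (new_cluster, height, child, child), oldest first.
    """
    size = {}
    link = {}  # frozenset({x, y}) -> [sum of known weights, number of known edges]
    for w, a, b in edges:
        size[a] = size[b] = 1
        link[frozenset((a, b))] = [float(w), 1]
    next_id = max(size) + 1
    merges = []

    def average(pair):
        x, y = tuple(pair)
        total, known = link[pair]
        n_pairs = size[x] * size[y]
        # Pairs without a known edge contribute psi each.
        return (total + (n_pairs - known) * psi) / n_pairs

    while link:
        best = min(link, key=average)
        height = average(best)
        if height >= psi:  # nothing closer than a missing edge remains
            break
        x, y = sorted(best)
        del link[best]
        # Fold every link that touches x or y into the new cluster.
        for pair in [p for p in link if x in p or y in p]:
            total, known = link.pop(pair)
            (other,) = pair - {x, y}
            merged = link.setdefault(frozenset((next_id, other)), [0.0, 0])
            merged[0] += total
            merged[1] += known
        size[next_id] = size.pop(x) + size.pop(y)
        merges.append((next_id, height, x, y))
        next_id += 1
    return merges


merges = sparse_upgma(EDGES, PSI)
for new_id, height, left, right in reversed(merges):
    print(new_id, f"{height:.4g}", left, right)
```

Under these assumptions the sketch yields 9 merges with the same heights as the UPGMA-tree listing. The two merges at height 50 are tied, so their cluster ids (29 and 30) may be assigned in the opposite order.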
Research Problem: UPGMA
This data renders UPGMA impractical
Methodology: 1) Sparse-UPGMA
Can't cope with huge datasets, where an O(E) memory requirement is intolerable (e.g. Table 1).
UPGMA (mean):
New eq:
Time and memory improvement:
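The equations on this slide did not survive extraction. As a hedged reconstruction, the first line below is the standard average-linkage (UPGMA) distance between clusters; the second is a sparse variant that is consistent with the UPGMA-tree example earlier in the deck, where every missing pair appears to be assigned a default value. The symbols $E_{ij}$ (the set of known edges between $C_i$ and $C_j$) and $\psi$ (the default for missing edges) are assumed notation, not necessarily the slide's.

```latex
% Standard average linkage over all pairs
d(C_i, C_j) = \frac{1}{|C_i|\,|C_j|} \sum_{x \in C_i} \sum_{y \in C_j} d(x, y)

% Sparse variant: known edges plus a default \psi for each missing pair
d(C_i, C_j) = \frac{\sum_{(x,y) \in E_{ij}} d(x, y) + \bigl(|C_i|\,|C_j| - |E_{ij}|\bigr)\,\psi}{|C_i|\,|C_j|}

% Check against the example: C_i = \{11, 12, 13\}, C_j = \{14\}, edge <11,14> missing
d = \frac{30 + 20 + (3 - 2)\,\psi}{3} = 50 \quad \text{for } \psi = 100
```

With this form only the running sum and the count of known edges need to be stored per cluster pair, so memory grows with the number of edges E rather than with all pairs.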
Methodology: 2) Multi-Round MC-UPGMA
Illustration of non-metric constraints imposed by BLAST sequence similarities (edges). False transitivity is possible due to CSKP_HUMAN.
Methodology: 2) Multi-Round MC-UPGMA
Methodology: 2) Multi-Round MC-UPGMA
Methodology: 2) Single-Round MC-UPGMA
Requires O(n) memory to hold the forming tree!
Methodology: 2)  Single-Round MC-UPGMA
Methods
Methods
Jaccard Score
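The slide names the Jaccard score without its formula. The standard definition is the size of the intersection of two sets divided by the size of their union. The sketch below assumes it is used to compare a cluster with a reference protein family; that usage, the function name `jaccard` and the protein identifiers are illustrative assumptions.

```python
def jaccard(cluster, family):
    """Jaccard score |A ∩ B| / |A ∪ B| of two collections of identifiers."""
    cluster, family = set(cluster), set(family)
    union = cluster | family
    # Two empty sets have no union; return 0.0 rather than divide by zero.
    return len(cluster & family) / len(union) if union else 0.0


# Hypothetical protein identifiers.
score = jaccard({"P1", "P2", "P3"}, {"P2", "P3", "P4"})
print(score)
```

A score of 1 means the cluster and the reference set coincide exactly; a score of 0 means they share no members.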
Results
Results Smith–Waterman BLAST Sparse UPGMA With reduced dataset 220K 1.80M
Results
200 clustering rounds on a single 4-CPU workstation with 4 GB of memory took about 1-2 days.
Results
Observations
Cluster Card Page
View Proteins of Cluster
Keywords Appearances
Cluster Similarity Distribution
similarity matrix for the proteins in this cluster
 
 
 
 

Presentation 2009 Journal Club Azhar Ali Shah

  • 1. Azhar Ali Shah @ Interdisciplinary Optimization and Decision Making Journal Club (IODMJC), March 20, 2009
  • 3. Introduction: authors Azhar A Shah Efficient algorithms for accurate hierarchical clustering of huge datasets: tackling the entire protein space /31
  • 4. Introduction: Hierarchical Clustering
  • 6. Introduction: about the topic. There is no guideline for selecting the best linkage method. In practice, people almost always use average linkage. UPGMA (Unweighted Pair Group Method using arithmetic Averages). Scalable to large datasets as it requires only O(1) edges in memory, BUT highly susceptible to outliers!
  • 8. Introduction: UPGMA - Sparse input. N=11 input singletons (vertices): {1,2,3,4,11,12,13,14,21,22,23} and 14 edges in the sparse input. The input is considered sparse since not all pairs are given, e.g. there is no edge between 1 and 22. Clusters 1,2,3,4 form a clique A. Clusters 11,12,13,14 are missing edge <11,14> to form clique B. Clusters 21,22,23 are loosely connected to each other and to the cluster of clique A. In total there are two connected components in the input graph: {1,2,3,4,21,22,23} (producing 6 merges for 7 vertices) and {11,12,13,14} (producing 3 merges for 4 vertices), which therefore form a forest of two disjoint trees with 9 merges in total, rather than the full tree of N-1=10 merges.
    UPGMA-input (one edge per triplet): 90 23 1; 70 23 22; 50 22 21; 30 14 13; 20 14 12; 12 13 12; 11 13 11; 1e+01 12 11; 4e-10 4 3; 1e-50 4 2; 1e-80 3 2; 2e-40 4 1; 1e-40 3 1; 1e-100 2 1
    UPGMA-tree (one merge per quadruplet): 32 99.167 31 26; 31 85 29 23; 30 50 28 14; 29 50 22 21; 28 11.5 27 13; 27 10 12 11; 26 1.33e-10 25 4; 25 5e-41 24 3; 24 1e-100 2 1
  • 10. Methodology: 1) Sparse-UPGMA. Can't cope with huge datasets, where an O(E) memory requirement is intolerable (e.g. Table 1). UPGMA (mean): New eq: Time and memory improvement:
  • 14. Methodology: 2) Single-Round MC-UPGMA. Requires O(n) memory to hold the forming tree!
  • 15. Methodology: 2) Single-Round MC-UPGMA
  • 19. Results: Sparse UPGMA with reduced dataset (Smith–Waterman: 220K; BLAST: 1.80M)
  • 20. Results: 200 clustering rounds on a single 4-CPU workstation with 4 GB of memory took about 1-2 days.
  • 25. View Proteins of Cluster
  • 28. similarity matrix for the proteins in this cluster
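The sparse-input example on slide 8 can be replayed with a small average-linkage sketch in Python. This is an illustration only, not the presented implementation: the function name `sparse_upgma`, the tie-breaking rule, and the constant `PSI = 100` (a fixed dissimilarity assumed for every pair missing from the input) are assumptions that do not appear on the slides. The value 100 was chosen because it reproduces the merge heights listed in the UPGMA-tree; cluster ids at the tied height 50 may be numbered differently from the slide.

```python
# Edges from slide 8, read as (dissimilarity, id, id) triplets.
EDGES = [
    (90, 23, 1), (70, 23, 22), (50, 22, 21), (30, 14, 13), (20, 14, 12),
    (12, 13, 12), (11, 13, 11), (1e+01, 12, 11), (4e-10, 4, 3), (1e-50, 4, 2),
    (1e-80, 3, 2), (2e-40, 4, 1), (1e-40, 3, 1), (1e-100, 2, 1),
]

PSI = 100.0  # ASSUMPTION: dissimilarity assigned to pairs absent from the input


def sparse_upgma(edges, psi=PSI):
    """Naive average linkage on a sparse edge list (hypothetical helper).

    Returns merges as (new_id, height, child_a, child_b). Cluster pairs that
    share no input edge are never merged, so the result may be a forest.
    """
    size = {}  # cluster id -> number of leaves
    link = {}  # frozenset({a, b}) -> (sum of given dissimilarities, edge count)
    for d, u, v in edges:
        size[u] = size[v] = 1
        link[frozenset((u, v))] = (float(d), 1)

    def average(pair):
        a, b = pair
        total, given = link[pair]
        n_pairs = size[a] * size[b]
        # Missing leaf pairs contribute psi each to the arithmetic mean.
        return (total + psi * (n_pairs - given)) / n_pairs

    merges = []
    next_id = max(size) + 1
    while link:
        # Linear scan for the closest pair; ties broken by sorted ids (assumption).
        pair = min(link, key=lambda p: (average(p), sorted(p)))
        height = average(pair)
        a, b = sorted(pair)
        del link[pair]
        new, next_id = next_id, next_id + 1
        size[new] = size[a] + size[b]
        # Fold the links of both children into the new cluster.
        for old in (a, b):
            for p in [p for p in link if old in p]:
                (other,) = p - {old}
                total, given = link.pop(p)
                key = frozenset((new, other))
                t0, g0 = link.get(key, (0.0, 0))
                link[key] = (t0 + total, g0 + given)
        merges.append((new, height, a, b))
    return merges


merges = sparse_upgma(EDGES)
children = {c for _, _, a, b in merges for c in (a, b)}
roots = [m[0] for m in merges if m[0] not in children]

for new, height, a, b in merges:
    print(f"{new} {height:.6g} {a} {b}")
print("merges:", len(merges), "trees:", len(roots))
```

Under these assumptions the run yields 9 merges and two tree roots, consistent with the forest of two disjoint trees described on the slide. The linear scan over all links at every step keeps the sketch short; it is not meant to reflect the time or memory behaviour discussed in the methodology slides.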