PhDc exam presentation
These are the slides I used for my PhD candidature exam.

Presentation Transcript

  • Functional Characterisation of Metabolic Networks. Carlos Manuel Estévez-Bretón, MSc. Doctorate in Systems Engineering and Computer Sciences. Advisors: Luis Fernando Niño, PhD; Liliana Lopez Kleine, PhD. Intelligent Systems Research Laboratory - LISI, Bioinformatics and Computational Biology research line “BioLisi”. Examining Committee: Dr. Jason Papin, U. of Virginia, Bioengineering; Dr. Andres Gonzalez, U. de los Andes, Chemical Engineering; Dr. Fabio Gonzalez, U. Nacional, Systems Engineering.
  • Agenda: What... Why... Research Question. How... Goals. Evaluation. Deliverables. Progress ...
  • What? http://www.impactcommunicationsinc.com/wp-content/uploads/2011/10/11-11_speak_up.jpg
  • Metabolism is the complete set of metabolic networks and physical processes that determine the physiological and biochemical properties of a cell. With the sequencing of complete genomes, it is now possible to reconstruct the network of biochemical reactions in many organisms, from bacteria to humans...
  • Introduction: Systems Biology. A multilevel field, studied in an interdisciplinary way. Figure: Ecological Scale, from Lucas B. Edelman, James A. Eddy, and Nathan D. Price, Wiley Interdiscip Rev Syst Biol Med. 2010 Jul-Aug; 2(4): 438–459. doi: 10.1002/wsbm.75 (PMC 2011 August 17).
  • Introduction: Multilevel information. Better and cheaper processing power.
  • Introduction: Main data sources: Regulatory Networks, Protein-Protein Interaction Networks, Metabolic Networks, Ecological Networks.
  • “Techniques such as high-throughput (HT) sequencing and gene/protein profiling have transformed biological research” (Khatri et al., 2012). “In this way, the advent of HT profiling technologies presents a new challenge, that of extracting meaning from a long list of differentially expressed genes and proteins” (Khatri et al., 2012). These techniques change the way we study biological science. An interdisciplinary effort is needed to extract meaning, analyze, and obtain information with high levels of confidence and quality.
  • Intelligent Systems. “Hot techniques”: ANN, Markov Models, and “new ones”: SVM and Random Forests (Jensen & Bateman, 2011). Latent Topic Analysis is not in the list of methods.
    Excerpt shown on the slide (Jensen & Bateman, Bioinformatics, pp. 3331–3332, © The Author(s) 2011, published by Oxford University Press, open access): “[...] commonly used in bioinformatics and their common synonyms, plural forms and abbreviations. We then searched this list against the PubMed titles and abstracts to identify the number of papers published per year for each machine learning technique. To match as many papers as possible, searches were case insensitive and allowed for variation in hyphenation. [...] perhaps going out of fashion. The results show that none of the major league methods has gone out of fashion, but we do see moderate decreases in the use of both ANNs and Markov models in the literature. We were also curious to find out if certain machine learning techniques were used in combination with each other. To investigate this, we looked at what machine learning methods are co-mentioned in articles (See Fig. 2). For all pairs of methods from the Supervised [...]”
    Fig. 1. The growth of supervised machine learning methods in PubMed.
    Fig. 2. Heatmap showing the co-occurrence of machine learning techniques within articles.
  • “In particular, supervised machine learning has been used to great effect in numerous bioinformatics prediction methods” (Jensen & Bateman, 2011). Machine learning is of immense importance in bioinformatics and more generally for biomedical sciences (Larrañaga et al., 2006; Tarca et al., 2007). Because it is not common in metabolic systems analysis, I think it is important to emphasise that:
  • Intelligent Systems: There are no references in the literature for the analysis of metabolic pathways from a functional approach, or using the proposed machine learning methods.
  • Machine Learning (Larrañaga et al.). Methods shown on the slide:
    Bayesian classifiers, feature subset selection
    SVM, ANN, classification trees
    Evolutionary algorithms, tabu search
    Nearest neighbour, SVM, Bayesian classifier, fuzzy k-NN
    Bayesian generalization of the SVM, ANN, linear discriminant analysis, classification trees
    ANN
    Probabilistic graphical models, classification trees, boosting with classification trees
    SVM and HMM
    Linear discriminant analysis, quadratic discriminant analysis, k-NN classifier, bagging and boosting classification trees
    SVM and random forest
  • Why? http://www.perftrends.com/images/why.jpg
  • ... or methods are not applied to Metabolic Pathways... ...or are based on topological (graph-based) network representations.
  • Statement: It should be possible to make some advances in understanding the underlying functional conformation of metabolic pathways. http://www.scriptmag.com/wp-content/uploads/BrainStorm-NewColor-12-22_32-1280x980at86.jpg
  • Statement: Supervised Clustering - useful to test the given representation - by classifying the biochemical reactions. http://www.scriptmag.com/wp-content/uploads/BrainStorm-NewColor-12-22_32-1280x980at86.jpg http://www.ee.ryerson.ca/~courses/ele888/ele_888_pat_class.gif
  • Statement (http://diversity-mining-lab.wikispaces.com/):
    • Algebraic Information Retrieval models, such as vector-space models, should “reveal” the topics that occur in document collections.
    • Is it possible to generate new, “really new”, pathways?
    • ...I’m talking about synthetic biology.
  • Research Question: Is it possible to classify metabolic networks using only functional features?
  • How? http://www.wired.com/images_blogs/threatlevel/2012/10/harris002.jpg
  • Goals:
    • To classify metabolic pathways functionally (without considering the topological structure), based on machine learning methods.
    • To build or adapt a system of functional representation for metabolic networks.
    • To classify metabolic networks using machine learning methods.
    • To apply (in new ways) machine learning methods in the study of systems biology.
  • Methodology (pipeline diagram, Carlos Manuel Estévez-Bretón R., 2012):
    Data Source: MetaCyc, KEGG
    Representation: General Metabolic Reaction Model - GMRM: S1 + S2 + … Sn → P1 + P2 + … Pn (Enzyme, CoFactor, CoEnzyme). Vectorization of GMRM: S1 S2 S3 Enzyme CoF CoE P1 P2 P3
    Classification: Method 1, Method 2
    Evaluation: ROC, confusion matrix, entropy, purity, adjusted Rand index, accuracy
    Pipeline
    Outputs: paper, paper, paper
  • Data Sources: MetaCyc, KEGG
  • Data Representation: General Metabolic Reaction Model - GMRM: S1 + S2 + … Sn → P1 + P2 + … Pn (Enzyme, CoFactor, CoEnzyme). Vectorization of GMRM: S1 S2 S3 Enzyme CoF CoE P1 P2 P3
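A minimal sketch of how the GMRM vectorization on this slide might look in code. The slot layout (S1 S2 S3 Enzyme CoF CoE P1 P2 P3) comes from the slide; the `Reaction` class, the `vectorize` function, the empty-string padding, and the choice to truncate anything beyond three substrates or products are assumptions made only for illustration. The example reaction and its enzyme label are copied from the linguistic analogy slide later in the deck.

```python
from dataclasses import dataclass, field
from typing import List

# Slot names taken from the slide; everything else here is hypothetical.
SLOTS = ["S1", "S2", "S3", "Enzyme", "CoF", "CoE", "P1", "P2", "P3"]


@dataclass
class Reaction:
    substrates: List[str]
    products: List[str]
    enzyme: str = ""
    cofactor: str = ""
    coenzyme: str = ""


def _fit(items: List[str], n_slots: int, pad: str) -> List[str]:
    """Truncate or pad a list of names to exactly n_slots entries."""
    kept = list(items)[:n_slots]
    return kept + [pad] * (n_slots - len(kept))


def vectorize(reaction: Reaction, n_slots: int = 3, pad: str = "") -> List[str]:
    """Flatten a reaction into the fixed-length slot vector of the slide."""
    return (
        _fit(reaction.substrates, n_slots, pad)
        + [reaction.enzyme, reaction.cofactor, reaction.coenzyme]
        + _fit(reaction.products, n_slots, pad)
    )


# Example reaction from the linguistic analogy slide (enzyme label as shown there).
example = Reaction(
    substrates=["Glucose", "ATP"],
    products=["Glucose 6P", "ADP"],
    enzyme="Hydrolase",
)
vector = vectorize(example)
for slot, value in zip(SLOTS, vector):
    print(f"{slot:>6}: {value!r}")
```

A fixed-length vector keeps every reaction comparable slot by slot, at the cost of losing information for reactions with more than three substrates or products; whether the author handles that case this way is not stated in the slides.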
  • Classification, Method 1: Supervised Classification
  • Method 2:
    • Let’s think about clustering without any prior knowledge...
    • Applying Information Retrieval methods to Metabolic Pathways data.
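As a rough illustration of what applying Information Retrieval methods to pathway data could mean, the sketch below treats each pathway as a "document" whose "words" are molecule names, builds bag-of-words count vectors, and compares pathways with cosine similarity. This is a generic vector-space model, not necessarily the method used in the thesis; the function names and the three toy pathways are hypothetical.

```python
import math
from collections import Counter
from typing import List


def bag_of_words(pathway: List[List[str]]) -> Counter:
    """Count molecule names over all reactions of a pathway (order is ignored)."""
    return Counter(name for reaction in pathway for name in reaction)


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


# Toy "documents": each pathway is a list of reactions, each reaction a list
# of molecule names. These are illustrative only, not curated pathway data.
pathway_a = [["Glucose", "ATP", "Glucose 6P", "ADP"], ["Glucose 6P", "Fructose 6P"]]
pathway_b = [["Fructose 6P", "ATP", "Fructose 1,6-bisphosphate", "ADP"]]
pathway_c = [["Acetyl-CoA", "Malonyl-CoA"]]

vec_a, vec_b, vec_c = map(bag_of_words, (pathway_a, pathway_b, pathway_c))
print("a vs b:", round(cosine(vec_a, vec_b), 3))
print("a vs c:", round(cosine(vec_a, vec_c), 3))
```

Pairwise similarities of this kind are the usual input to an unsupervised clustering step, which matches the "no prior knowledge" framing of Method 2.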
  • Evaluation: ROC, confusion matrix, entropy, purity, adjusted Rand index, accuracy.
    Confusion matrix (http://www.intechopen.com/source/html/38584/media/image56.jpeg):
                           Classified as: Positive   Classified as: Negative
    Really is: Positive    True Positive             False Negative
    Really is: Negative    False Positive            True Negative
    Derived measures: Error Rate, Recall/Sensitivity, Specificity/True Negative Rate, Precision, 1-Specificity/False Alarm Rate.
    ROC curve: http://wwww.cbgstat.com/v2/method_ROC_curve_MedCalc/images/ROC_curve_MedCalc_Snap17.gif
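The measures named on the evaluation slides follow directly from the four confusion matrix cells. The sketch below uses the standard textbook definitions; the function names and the toy label lists are hypothetical and are not results from the thesis.

```python
from typing import Dict, Sequence, Tuple


def confusion_counts(actual: Sequence[int], predicted: Sequence[int],
                     positive: int = 1) -> Tuple[int, int, int, int]:
    """Return (TP, FP, TN, FN) for a binary labelling."""
    tp = fp = tn = fn = 0
    for a, p in zip(actual, predicted):
        if p == positive:
            if a == positive:
                tp += 1
            else:
                fp += 1
        else:
            if a == positive:
                fn += 1
            else:
                tn += 1
    return tp, fp, tn, fn


def _ratio(num: int, den: int) -> float:
    """Safe division: an empty denominator yields 0.0 instead of an error."""
    return num / den if den else 0.0


def metrics(tp: int, fp: int, tn: int, fn: int) -> Dict[str, float]:
    total = tp + fp + tn + fn
    return {
        "accuracy": _ratio(tp + tn, total),
        "error_rate": _ratio(fp + fn, total),
        "recall": _ratio(tp, tp + fn),            # sensitivity
        "specificity": _ratio(tn, tn + fp),       # true negative rate
        "precision": _ratio(tp, tp + fp),
        "false_alarm_rate": _ratio(fp, fp + tn),  # 1 - specificity
    }


# Toy labels, for illustration only.
actual = [1, 1, 1, 0, 0, 0, 0, 1]
predicted = [1, 1, 0, 0, 0, 1, 0, 1]
counts = confusion_counts(actual, predicted)
scores = metrics(*counts)
for name, value in scores.items():
    print(f"{name}: {value:.2f}")
```

A ROC curve plots recall against the false alarm rate while a decision threshold is varied, so each point on the curve is one such confusion matrix.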
  • Deliverables:
    • A computational metabolic representation proposal
    • A computational metabolic classification method
    • A generative metabolic pathways model
    • A pipeline for metabolic pathways analysis
  • Progress ... http://desktop.freewallpaper4.me/view/original/3714/the-lonely-man.jpg
  • Preliminary Results (pipeline diagram, Carlos Manuel Estévez-Bretón R., 2012):
    Data Source: MetaCyc, KEGG
    Representation: General Metabolic Reaction Model - GMRM: S1 + S2 + … Sn → P1 + P2 + … Pn (Enzyme, CoFactor, CoEnzyme). Vectorization of GMRM: S1 S2 S3 Enzyme CoF CoE P1 P2 P3
    Classification: Method 1, Method 2
    Evaluation: ROC, confusion matrix, entropy, purity, adjusted Rand index, accuracy
    Pipeline
    Outputs: paper, paper, paper
  • Linguistic Analogy (levels of complexity):
    Metabolism: Metabolites/ome, Reaction, Metabolic Pathway, Metabolic Switch
    Language: Vocabulary, Phrase, Paragraph, Document
    Molecules: Glucose, Glucose 6P, ATP, ADP, Hydrolase, Pyrophosphate
    Words: the, Murder, for, a, jar, of, red, rum, frog, soap
    Phrase: “Murder for a jar of red rum”; as unordered words: rum, Murder, for, jar, a, of, red
    Reaction: Glucose + ATP → Glucose 6P + ADP (Hydrolase)
    General Metabolic Reaction Model - GMRM: S1 + S2 + … Sn → P1 + P2 + … Pn (Enzyme, CoFactor, CoEnzyme). Vectorization of GMRM: S1 S2 S3 Enzyme CoF CoE P1 P2 P3
  • Representation: General Metabolic Reaction Model - GMRM: S1 + S2 + … Sn → P1 + P2 + … Pn (Enzyme, CoFactor, CoEnzyme). Vectorization of GMRM: S1 S2 S3 Enzyme CoF CoE P1 P2 P3
  • Classification, Method 1 (Supervised): 4 pathways (2 from carbohydrate metabolism, 1 from lipid metabolism, 1 from nucleotide metabolism); 24 organisms. Classifiers: Support Vector Machines, Classification Tree, K Nearest Neighbour, CN2, Naive Bayes.
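To make the supervised step concrete, here is a small k-nearest-neighbour sketch over GMRM-style slot vectors. The slides list K Nearest Neighbour among the classifiers but do not say which distance or tool was used, so the slot-mismatch distance, the function names, and the abbreviated toy vectors and labels below are all assumptions for illustration, not the author's data or setup.

```python
from collections import Counter
from typing import List, Sequence, Tuple

Vector = Sequence[str]


def mismatch_distance(u: Vector, v: Vector) -> int:
    """Number of slots whose categorical values differ (Hamming-style)."""
    return sum(1 for a, b in zip(u, v) if a != b)


def knn_predict(train: List[Tuple[Vector, str]], query: Vector, k: int = 3) -> str:
    """Majority vote among the k training vectors closest to the query."""
    neighbours = sorted(train, key=lambda item: mismatch_distance(item[0], query))[:k]
    votes = Counter(label for _, label in neighbours)
    return votes.most_common(1)[0][0]


# Hypothetical training set; slots are S1 S2 S3 Enzyme CoF CoE P1 P2 P3.
train = [
    (["glc", "atp", "", "e1", "", "", "g6p", "adp", ""], "carbohydrate"),
    (["g6p", "", "", "e2", "", "", "f6p", "", ""], "carbohydrate"),
    (["f6p", "atp", "", "e3", "", "", "fbp", "adp", ""], "carbohydrate"),
    (["accoa", "co2", "", "e4", "", "", "malcoa", "", ""], "lipid"),
    (["malcoa", "acp", "", "e5", "", "", "malacp", "coa", ""], "lipid"),
    (["accoa", "acp", "", "e6", "", "", "acacp", "coa", ""], "lipid"),
]

query_1 = ["glc", "atp", "", "e9", "", "", "g6p", "adp", ""]
query_2 = ["accoa", "acp", "", "e7", "", "", "malacp", "coa", ""]
print(knn_predict(train, query_1))
print(knn_predict(train, query_2))
```

Because the slots hold names rather than numbers, a mismatch count is one simple choice of distance; an encoding such as one-hot vectors would be needed before using the numeric classifiers on the list.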
  • Pipeline
  • Review:
    - Proposing a vector representation of biochemical reactions, based on a linguistic analogy.
    - I’m going to classify metabolic networks using only functional features... to find patterns that suggest constitution rules for metabolic pathways.
    - Searching for patterns by clustering.
  • Thanks. @karelman