INBIOMEDvision Workshop at MIE 2011. Victoria López
  • This talk outlines recent developments in sequencing technologies and genome analysis methods for application in personalized medicine. New methods are needed in four areas to realize the potential of personalized medicine: (i) processing large-scale robust genomic data; (ii) interpreting the functional effect and the impact of genomic variation; (iii) integrating systems data to relate complex genetic interactions with phenotypes; and (iv) translating these discoveries into medical practice.

Transcript

  • 1. Victoria López Alonso, PhD. Medical Bioinformatics Area, Instituto de Salud Carlos III, Spain. Bioinformatics challenges in a personalized medicine pipeline. Workshop INBIOMEDvision, MIE 2011
  • 2. Bridging gaps between Bioinformatics and Medical Informatics (MI). Biomedical informatics (BMI) deals with the integrative management and synergistic exploitation of the wide and inter-related scope of information that is generated and needed in healthcare settings, biomedical research institutions and health-related industry.
  • 3. Overview
    • Personalized medicine in current practice
    • Bioinformatics challenges for personalized medicine:
        • 1- Processing large-scale genomic data
        • 2- Interpretation of functional effect of genomic variation
        • 3- Integration of systems data
        • 4- Translation into medical practice
  • 4. Personalized medicine in current practice. Translational bioinformatics uses computational tools to analyze large biological databases and to understand disease mechanisms more fully, not only through genetics and proteomics but also by associating them with clinical data.
  • 5. Advances of molecular science
    • Human Genome Project in 2003
        • Finishing the euchromatic sequence of the human genome. Nature 2004; 431(7011): 931-945.
    • Phase I HapMap project in 2005, followed by Phase II and Phase III
        • A haplotype map of the human genome. Nature 2005; 437(7063): 1299-1320.
    • Encyclopedia of DNA Elements (ENCODE) project in 2007
        • Identification and analysis of functional elements in 1% of the human genome. Nature 2007; 447(7146): 799-816.
    • 1000 Genomes Project in 2008
        • DNA sequences. A plan to capture human diversity in 1000 genomes. Science 2008; 319(5863): 395.
    $1000 Genome in …2013 ??
  • 6. Personalized medicine in current practice
    • Chemotherapy medications trastuzumab and imatinib (Gambacorti-Passerini, 2008; Hudis, 2007).
    • A targeted pharmacogenetic dosing algorithm is used for warfarin (International Warfarin Pharmacogenetics Consortium et al., 2009).
    • Incidence of adverse events for the drugs abacavir, carbamazepine and clozapine (Dettling et al., 2007; Ferrell and McLeod, 2008, 2002).
    • The inclusion of genetics in EHRs will provide risk assessment.
    • Clinical assessment incorporating a personal genome. Ashley et al. Lancet (2010)
  • 7. Personalized medicine in current practice
    • Today patients' genetics are consulted only for a few diagnoses and treatments, and only in certain medical centers (cystic fibrosis, breast cancer).
    • With easy access to a well-annotated human genome, an individual could acquire a genetic health profile, including risk and resistance factors, that could be used to guide medical decisions.
    Bentley D. "Genomes for Medicine". (2004). Nature Insight 429, p440-446
  • 8. Bioinformatics challenges for personalized medicine
    • 1- Processing large-scale genomic data
    • 2- Interpretation of functional effect of genomic variation
    • 3- Integration of systems data
    • 4- Translation into medical practice
    • Different informatics challenges must be addressed to create the tools that tailor medical care to each individual genome and to realize the potential of personalized medicine.
  • 9. 1- Processing large-scale genomic data
    • SNPs (Single Nucleotide Polymorphisms) are key enablers in realizing the concept of personalized medicine.
    • SNP: a variant whose frequency in the human population is higher than 1%
    • Sequencing technologies are becoming accessible
    • Whole genome in < 2 weeks
    • 1 error per 100 kb gives about 30,000 erroneous variant calls in a whole genome
    • The error rate of these technologies is a source of significant challenges in applications, including discovering novel variants
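The figure of 30,000 erroneous calls follows from simple arithmetic. The sketch below reproduces it, assuming a genome size of roughly 3 Gb (a round number that is not stated on the slide):

```python
# Expected number of erroneous variant calls from a per-base error rate.
GENOME_SIZE_BP = 3_000_000_000  # assumption: ~3 Gb human genome (not from the slide)
ERRORS_PER_BP = 1 / 100_000     # from the slide: 1 error per 100 kb


def expected_erroneous_calls(genome_size_bp: int, errors_per_bp: float) -> float:
    """Expected count of wrong calls if errors are spread uniformly over the genome."""
    return genome_size_bp * errors_per_bp


calls = expected_erroneous_calls(GENOME_SIZE_BP, ERRORS_PER_BP)
print(round(calls))  # 30000
```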
  • 10. 1- Processing large-scale genomic data
    • Between 100,000 and 300,000 previously undiscovered SNPs
    • Variant discovery: a "needle in a haystack"
    • Novel variants must be verified because of the false positive rate
    • In addition there are other important classes of variation for clinical applications:
        • short insertion–deletion variants (indels)
        • copy number variants (CNVs)
        • structural variants (SVs)
    • New algorithms are needed to detect these variations from sequencing data
  • 11. 1- Processing large-scale genomic data
    • High quality sequence reads must be placed into their genomic context to identify variants.
    • The challenge is to develop new algorithms that make de novo assembly computationally feasible.
    • De novo assembly is slow and complicated by repetitive elements.
    • Instead, sequences are mapped to a genomic reference sequence:
    • BLAST has traditionally been used, but its execution speed depends on the genome size.
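One common way to avoid scanning the whole reference for every read is to index it once and look reads up by short seeds. The following is a minimal, illustrative seed-and-extend sketch using a plain k-mer hash index. It is not BLAST, and the function names, the toy reference and the parameters are assumptions made for the example.

```python
from collections import defaultdict


def build_kmer_index(reference: str, k: int) -> dict:
    """Map every k-mer in the reference to the positions where it starts."""
    index = defaultdict(list)
    for pos in range(len(reference) - k + 1):
        index[reference[pos:pos + k]].append(pos)
    return index


def map_read(read: str, reference: str, index: dict, k: int, max_mismatches: int = 1):
    """Return sorted (position, mismatches) pairs for ungapped placements of the read."""
    hits = {}
    # Each k-mer of the read acts as a seed; a seed hit proposes a start position.
    for offset in range(len(read) - k + 1):
        for seed_pos in index.get(read[offset:offset + k], []):
            start = seed_pos - offset
            if start < 0 or start + len(read) > len(reference) or start in hits:
                continue
            window = reference[start:start + len(read)]
            mismatches = sum(a != b for a, b in zip(read, window))
            if mismatches <= max_mismatches:
                hits[start] = mismatches
    return sorted(hits.items())


# Toy data (hypothetical): the read differs from the reference at one base.
reference = "ACGTACGTTAGCCGATAGGCTTAACG"
index = build_kmer_index(reference, k=4)
print(map_read("TAGCCGTTAGG", reference, index, k=4, max_mismatches=1))
```

The index is built once, so the cost of placing each read depends mainly on the number of seed hits rather than on the length of the reference. Indels are not handled in this sketch.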
  • 12. 1- Processing large-scale genomic data
    • New Mapping and alignment algorithms
      • BLAT uses an indexed version of the genome (Kent, 2002).
      • Burrows-Wheeler Aligner (BWA) (Li and Homer, 2010).
    • Ideally performed in a cluster or by using cloud computing
    • Programs must allow for mismatches without producing false alignments
    • Improved quality control metrics: ratios of base transitions to transversions, Mendelian inheritance errors (MIE), relative quality scores…
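Two of the quality control metrics named above are easy to illustrate. The sketch below computes a transition/transversion ratio over a list of single-nucleotide variants and checks a child genotype for a Mendelian inheritance error. The data layout (tuples of bases and alleles) is an assumption chosen for the example, not a standard file format.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def is_transition(ref: str, alt: str) -> bool:
    """A transition swaps purine for purine or pyrimidine for pyrimidine."""
    return {ref, alt} <= PURINES or {ref, alt} <= PYRIMIDINES


def ti_tv_ratio(snvs: list) -> float:
    """snvs: list of (ref_base, alt_base) pairs with ref_base != alt_base."""
    ti = sum(1 for ref, alt in snvs if is_transition(ref, alt))
    tv = len(snvs) - ti
    return ti / tv if tv else float("inf")


def is_mendelian_error(child: tuple, mother: tuple, father: tuple) -> bool:
    """Genotypes are 2-tuples of alleles; a child needs one allele from each parent."""
    a, b = child
    consistent = (a in mother and b in father) or (b in mother and a in father)
    return not consistent


print(ti_tv_ratio([("A", "G"), ("C", "T"), ("A", "C"), ("G", "A")]))
print(is_mendelian_error(("G", "G"), ("A", "A"), ("A", "G")))
```

A call set whose ratio or error count deviates strongly from what is expected for the sequenced region is a signal that the variant calls deserve a closer look.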
  • 13. 2- Interpretation of functional effect
    • After genomic data has been processed, the functional effect and the impact of the genetic variation must be analyzed.
    • Genome-wide association studies (GWASs) have been used to assess the statistical associations of SNPs with many important common diseases.
    • GWASs provide new insights, but only a limited number of variants have been characterized, and understanding the functional relationship between variants and phenotypes remains difficult.
    https://www.wtccc.org.uk
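The statistical association of a single SNP with a disease can be illustrated with an allelic chi-square test on a 2x2 table of allele counts in cases and controls. This is a minimal sketch with made-up counts. Real GWAS pipelines add genotype-level models, covariates and multiple-testing correction across all tested SNPs.

```python
import math


def allelic_chi_square(case_alt: int, case_ref: int, control_alt: int, control_ref: int):
    """Pearson chi-square (1 degree of freedom) for a 2x2 allele-count table."""
    a, b, c, d = case_alt, case_ref, control_alt, control_ref
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    if denom == 0:
        raise ValueError("every row and column of the table needs a non-zero total")
    chi2 = n * (a * d - b * c) ** 2 / denom
    # Survival function of the chi-square distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# Hypothetical counts: the alternative allele is more frequent in cases.
chi2, p = allelic_chi_square(case_alt=60, case_ref=40, control_alt=40, control_ref=60)
print(chi2, p)
```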
  • 14. 2- Interpretation of functional effect
    • Important issues for predicting the effect of SNPs are data management, retrieval and quality control.
    • SNP databases:
    • The dbSNP database (20 million validated SNPs)
    • The Human Gene Mutation Database (HGMD) (SNPs associated with diseases)
    • SwissVar
    • Online Mendelian Inheritance in Man (OMIM) database
    • PharmGKB database
    • Catalogue of Somatic Mutations in Cancer (COSMIC)
    Figure: number of known SNPs (Fernald et al. 2011)
  • 15. 2-Interpretation of functional effect
    • Computational methods to predict mSNPs:
      • Empirical rules (Ng and Henikoff, 2003; Ramensky et al., 2002),
      • Hidden Markov Models (HMMs) (Thomas and Kejariwal, 2004),
      • Neural Networks (Bromberg et al., 2008; Ferrer-Costa et al., 2005),
      • Decision Trees (Dobson et al., 2006; Krishnan and Westhead, 2003),
      • Random Forests (Li et al., 2009; Wainreb et al., 2010)
      • Support Vector Machines (Calabrese et al., 2009 ).
    • The prediction algorithms input features include:
        • amino acid sequence
        • protein structure
        • evolutionary information
  • 16. 2- Interpretation of functional effect
    • New algorithms that combine evolutionary information with knowledge-based information are being developed to predict the effect of SNPs:
    • PANTHER uses a library of protein family HMMs. http://www.pantherdb.org/
    • PolyPhen uses different sequence-based features. http://genetics.bwh.harvard.edu/pph
    • MutPred evaluates the probabilities of gain or loss of structure and function upon mutation using random forests. http://mutdb.org/mutpred
    • SIFT uses a multiple sequence alignment of homologous proteins. http://sift.jcvi.org
    • SNAP: sequence-based. http://rostlab.org/services/snap
    • SNPEffect http://snpeffect.vib.be
    • SNPs3D: structure-based SVM predictor. http://www.snps3d.org
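The idea of using evolutionary information from an alignment of homologous proteins can be shown with a toy conservation score. The sketch below is not the algorithm of SIFT or any other tool listed above. It only asks how often the substituted residue is seen at that column of a hypothetical alignment, and the 0.05 cut-off is an arbitrary assumption.

```python
from collections import Counter


def column_frequencies(alignment: list, position: int) -> dict:
    """Relative frequency of each residue at one alignment column, ignoring gaps."""
    column = [seq[position] for seq in alignment if seq[position] != "-"]
    counts = Counter(column)
    total = len(column)
    return {residue: n / total for residue, n in counts.items()}


def predict_substitution(alignment: list, position: int, new_residue: str,
                         threshold: float = 0.05):
    """Label a substitution by how often the new residue occurs among homologs."""
    freq = column_frequencies(alignment, position).get(new_residue, 0.0)
    label = "tolerated" if freq >= threshold else "possibly damaging"
    return label, freq


# Hypothetical alignment of four homologous sequences.
alignment = ["MKVLA", "MKILA", "MRVLA", "MKVLG"]
print(predict_substitution(alignment, 1, "R"))
print(predict_substitution(alignment, 1, "W"))
```

Residues that already occur in homologs at the same position are treated as tolerated, while residues never observed there are flagged for closer inspection.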
  • 17. 2- Interpretation of functional effect
    • Experimental tests are required to validate genetic predictions.
    • There is a need for fast and accurate methods for gene prioritization.
    • Currently the most effective strategy uses the concept of genes that are linked to the biological process of interest.
    • The input data for gene prioritization are functional annotation, protein–protein interactions, biological pathways and literature.
    Eleftherohorinou et al., 2010
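A simple way to picture prioritization by links to a biological process is guilt by association on a protein–protein interaction network: candidates are ranked by the fraction of their interaction partners that are already known to take part in the process. The sketch below is a minimal illustration with hypothetical gene names. It does not reproduce any specific published tool.

```python
def prioritize(candidates: list, seed_genes: list, interactions: list) -> list:
    """Rank candidates by the share of their interaction partners that are seed genes."""
    neighbors = {}
    for a, b in interactions:
        neighbors.setdefault(a, set()).add(b)
        neighbors.setdefault(b, set()).add(a)
    seeds = set(seed_genes)
    scores = {}
    for gene in candidates:
        partners = neighbors.get(gene, set())
        scores[gene] = len(partners & seeds) / len(partners) if partners else 0.0
    # Highest score first; ties are broken alphabetically for a stable order.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


# Hypothetical network: SEED1 and SEED2 are known to act in the process of interest.
interactions = [("CAND1", "SEED1"), ("CAND1", "SEED2"),
                ("CAND2", "SEED1"), ("CAND2", "OTHER"),
                ("CAND3", "OTHER")]
print(prioritize(["CAND1", "CAND2", "CAND3"], ["SEED1", "SEED2"], interactions))
```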
  • 18. 2- Interpretation of functional effect
    • SUSPECT: sequence features, gene expression data, functional terms…
    • ToppGene : mouse phenotype data with human gene annotations and literature
    • MedSim: human disease genes with mouse genes
    • ENDEAVOUR: genes involved in a known biological process
    • G2D and PolySearch: data mining on biological databases
    • MimMiner: text mining comparing the human phenome and disease phenotypes
    • PhenoPred : uses protein sequence and function
    • GeneMANIA : uses functional assays
    • The Gene Prioritization Portal provides comprehensive descriptions of available predictors:
    http://homes.esat.kuleuven.be/~bioiuser/gpp/index.php
  • 19. 2- Interpretation of functional effect
    • Last year, the first edition of the Critical Assessment of Genome Interpretation (CAGI) was organized to assess the available methods for predicting phenotypic impact of genomic variation and to stimulate future research.
    http://genomeinterpretation.org/
  • 20. 3-Integration of systems data
    • There is concern that pharmacogenomics GWAS themselves are susceptible to many limitations:
    • insufficient sample size, selection biases for genetic variants and environmental interactions may affect the outcome measures
    • Multiple gene–gene interactions may underlie unexplained.
    HapMap Project, 2004
  • 21. 3- Integration of systems data
    • Model Selection Methods have been successful with disease and trait GWAS studies using selection techniques to choose multifactorial models that balance the false positive rate, statistical power and computational requirements of the search
    • Dimensionality reduction methods
    • Principal Components Analysis
    • Information Gain and
    • Multifactor Dimensionality Reduction
    • (e.g. hypertension and familial amyloid polyneuropathy type I)
    Ritchie and Motsinger, 2010
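Of the methods listed, information gain is the simplest to show in a few lines: it measures how much knowing a SNP genotype reduces uncertainty about the phenotype. The sketch below uses made-up genotypes (coded as 0, 1 or 2 copies of the minor allele) and case/control labels.

```python
import math
from collections import Counter


def entropy(labels: list) -> float:
    """Shannon entropy, in bits, of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total) for n in Counter(labels).values())


def information_gain(genotypes: list, phenotypes: list) -> float:
    """Reduction in phenotype entropy obtained by splitting samples on genotype."""
    total = len(phenotypes)
    conditional = 0.0
    for genotype in set(genotypes):
        subset = [p for g, p in zip(genotypes, phenotypes) if g == genotype]
        conditional += len(subset) / total * entropy(subset)
    return entropy(phenotypes) - conditional


# Hypothetical data: the first SNP separates cases from controls, the second does not.
phenotypes = ["case", "case", "control", "control"]
print(information_gain([2, 2, 0, 0], phenotypes))
print(information_gain([2, 0, 2, 0], phenotypes))
```

Ranking SNPs by this score and keeping the top ones is one way to shrink the search space before fitting multifactorial models.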
  • 22. 3- Integration of systems data (Naylor and Chen, 2010)
    • No external knowledge source provides information about the biology behind the interactions.
    • Systems biology and network approaches address the problem of complexity by integrating molecular data at multiple levels of biology, including genomes, transcriptomes, metabolomes, proteomes and functional and regulatory networks.
  • 23. 3- Integration of systems data
    • The simple “one SNP, one phenotype” approach is insufficient.
    • Most medically relevant phenotypes are thought to be the result of gene–gene and gene–environment interactions
    Adeyemo et al., 2010
  • 24. 3- Integration of systems data (Limdi and Veenstra, 2008)
    • Drug response often depends on multiple pharmacokinetic and pharmacodynamic interactions.
    • Some success: studies of warfarin have linked the majority of variation in response to two genes, CYP2C9 and VKORC1. Improved dosing algorithm.
  • 25. Goh et al., 2007 3- Integration of systems data
    • Disease–Gene Networks
    • Chemical structures, Diseases and Protein sequences
    • Epigenetic data and Drug Phenotypes
    • Pathways and Gene sets
      • Gene Set Enrichment Analysis (GSEA)
      • SNP Ratio Test
      • The Prioritizing Risk Pathways method
    • Assumptions must also be examined carefully!
    • Combining disparate data sources can result in novel associations
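Pathway and gene-set analysis can be illustrated with a hypergeometric over-representation test, which asks whether a pathway contains more associated genes than expected by chance. This is a simpler technique than GSEA, which works on a ranked gene list. The function name and the numbers below are assumptions made for the example.

```python
from math import comb


def enrichment_p_value(universe_size: int, pathway_size: int,
                       hits_size: int, overlap: int) -> float:
    """P(X >= overlap) when hits_size genes are drawn at random without replacement."""
    upper = min(pathway_size, hits_size)
    total = comb(universe_size, hits_size)
    tail = sum(
        comb(pathway_size, k) * comb(universe_size - pathway_size, hits_size - k)
        for k in range(overlap, upper + 1)
    )
    return tail / total


# Hypothetical numbers: 10 genes in total, a pathway of 4, and 3 associated genes
# that all fall inside the pathway.
print(enrichment_p_value(universe_size=10, pathway_size=4, hits_size=3, overlap=3))
```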
  • 26. 4- Translation into medical practice
    • Much of this research has yet to be translated to the clinic for improved patient care.
    • One of the areas where bioinformatics can have the greatest clinical impact is pharmacogenomics improving drug prescription and dosing.
    • Pharmacogenomic prescription and dosing algorithms need to be accessible to physicians.
    • Warfarin dosing could save up to 60% of the cost and reduce possible adverse events.
    Martin-Sanchez et al. 2006
  • 27. 4- Translation into medical practice
    • Medical practice needs to be updated to include routine pharmacogenetic testing, education and training of physicians in personalized medicine, and further clinical trials to prove the efficacy of predictions
    • Bioinformatics also translates discoveries to the clinic by disseminating them through curated, searchable databases:
        • The Pharmacogenomics Knowledge Base (PharmGKB): http://www.pharmgkb.org/
        • Pharmacogenetics-Cell line database (PACdb): http://pacdb.org/
        • The database of Genotypes and Phenotypes (dbGaP): www.ncbi.nlm.nih.gov/gap
        • The Adverse Event Reporting System (AERS): www.fda.gov/Drugs/
  • 28. 4- Translation into medical practice
    • Biologically and medically focused text mining algorithms can speed the collection of this structured data, such as methods that use sentence syntax and natural language processing to derive drug–gene and gene–gene interactions from scientific literature.
    • Opportunities for bioinformatics to integrate with the electronic medical record (EMR):
        • BioBank system at Vanderbilt: www.mc.vanderbilt.edu/
        • PhenX (RTI International with NHGRI): www.phenx.org/
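As a rough illustration of literature mining for drug–gene interactions, the sketch below counts sentence-level co-occurrences of names from two small lexicons. This is a much simpler baseline than the syntax-based and natural language processing methods mentioned above, and the lexicons and sample text are assumptions made for the example.

```python
import re
from collections import Counter
from itertools import product


def cooccurrences(text: str, drugs: list, genes: list) -> Counter:
    """Count sentences in which a drug name and a gene name appear together."""
    pairs = Counter()
    # Naive sentence split on ., ! or ? followed by whitespace.
    for sentence in re.split(r"(?<=[.!?])\s+", text):
        tokens = set(re.findall(r"[A-Za-z0-9]+", sentence.lower()))
        found_drugs = [d for d in drugs if d.lower() in tokens]
        found_genes = [g for g in genes if g.lower() in tokens]
        pairs.update(product(found_drugs, found_genes))
    return pairs


# Toy text written for this example.
text = ("Warfarin response is linked to CYP2C9 and VKORC1. "
        "Imatinib is a targeted therapy. "
        "VKORC1 variants alter warfarin dose.")
counts = cooccurrences(text, drugs=["warfarin", "imatinib"], genes=["CYP2C9", "VKORC1"])
print(counts.most_common())
```

Co-occurrence alone cannot tell what kind of relationship a sentence describes, which is why the syntactic methods named on the slide are needed for structured extraction.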
  • 29. Thanks!
    • Instituto de Salud Carlos III, Medical Bioinformatics Area
    • http://biotic.isciii.es/
    • [email_address]
    • http://www.inbiomedvision.eu/