Identification of pathological mutations from the single-gene case to exome projects: lessons from the Fabry disease (Xavier de la Cruz)


*Watch the video at the end of the presentation
Seminar led by Dr. Xavier de la Cruz, ICREA Research Professor. Head of the Translational Bioinformatics in Neuroscience group of VHIR, at VHIR (22nd November 2012).

Content: The need to identify the pathological character of mutations may arise in different contexts in biomedical research. However, the methods available to address this problem essentially depend on the number of cases under analysis. When we work with only a few mutations we can use an artisan-like approach, where all information available on protein sequence, structure and function is manually retrieved and studied. However, when we need to characterize many variants, as can be the case in exome projects, faster methods are required to assess their pathogenicity. In my talk I will illustrate the principles underlying these two approaches with examples from the study of Fabry disease mutations, resulting from our collaborative work at the VHIR.

Usage Rights

© All Rights Reserved


Presentation Transcript

  • Identification of pathological mutations from the single-gene case to exome projects: lessons from the Fabry disease. Xavier de la Cruz
  • Identification of pathological…
    Interpretation context of mutation data
    Identifying pathological mutations:
    ◦ Present tools
    ◦ Problems?
    A VHIR-based tool for mutation scoring
    ◦ Development
    ◦ Performance
    Future directions
    ◦ Implementing a standardized mutation report
  • From base pairs to bedside (Green & Guyer, Nature, 2011)
    ◦ Understanding genome structure
    ◦ Understanding genome biology
    ◦ Understanding disease biology
    ◦ Advancing medical science
    ◦ Improving healthcare effectiveness
    Timeline: 1990-2003 (Genome Project), 2004-2010, 2011-2020, Beyond 2020
  • So, is there a problem? [Figure labels: 2017; $517000; $285000; <$100; INTERPRETATION COST]
  • From base pairs to … Sample; Exome sequencing; Variant identification and quality control; INTERPRETATION
  • The interpretation problem
    "…enormous amounts of raw data, but still very little understanding of what it means"
    Exome sequencing context:
    ◦ Identify disease-causative variants
    ◦ Prioritize variants
    ◦ Speed: can we do this for 100s-1000s of variants in "reasonable" time?
    ◦ Reliability: can we provide good error models for counseling/diagnosis/prognosis?
  • Exome-ready mutation annotation tools
    PolyPhen (Adzhubei et al., 2010), SIFT (Kumar et al., 2009):
    ◦ Mutation mining
      pathological: databases + literature + private datasets
      neutral: experimental sets (LacI, lysozyme, etc), evolutionary model, databases (dbSNP)
    ◦ Building model: machine learning
  • PERFORMANCE
  • CONDEL (González-Pérez & López-Bigas, 2011); CAROL (Lopes et al., 2012)
    ROC area:
    ◦ CONDEL: 0.849
    ◦ CAROL: 0.852
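The ROC area quoted for CONDEL and CAROL can be read as the probability that a randomly chosen pathological mutation scores higher than a randomly chosen neutral one. The following is a minimal sketch of that calculation, not the procedure used by either tool; the function name `roc_auc` and the scores and labels are hypothetical and serve only as illustration.

```python
def roc_auc(scores, labels):
    """Area under the ROC curve via pairwise comparison (Mann-Whitney form).

    labels: 1 = pathological, 0 = neutral (an assumed encoding).
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0      # pathological ranked above neutral
            elif p == n:
                wins += 0.5      # ties count as half
    return wins / (len(pos) * len(neg))


# Hypothetical predictor scores, not data from the talk
scores = [0.9, 0.8, 0.7, 0.4, 0.3, 0.1]
labels = [1, 1, 0, 1, 0, 0]
auc = roc_auc(scores, labels)
print(f"ROC area: {auc:.3f}")
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect separation of the two classes.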
  • Limits of present annotation tools
    Consensus tools:
    ◦ Understanding of molecular damage is harder
    ◦ They depend on the existence of primary tools (PolyPhen, SIFT)
    Primary (PolyPhen, SIFT):
    ◦ Average over many mutations and genes
  • Type III Hereditary Hemochromatosis – TFR2
    ◦ TFR2, a dimeric type II transmembrane protein expressed mostly in the liver and CD71+ early erythroids.
    ◦ At least 50 families and 69 patients have been described with mutations in the TFR2 gene.
  • Fabry disease
    Systemic disorder characterized by: progressive renal failure, cardiovascular or cerebrovascular disease, etc.
    Caused by mutations in the lysosomal enzyme α-galactosidase A
  • CYS52
  • PRESENT PREDICTORS: GENE 1 PATHOL. MUT. 1, GENE 2 PATHOL. MUT. 2, GENE 3 PATHOL. MUT. 3, … all feed one MUTATION PREDICTOR
  • GENE-SPECIFIC PREDICTORS
    ◦ GENE 1 PATHOL. MUT. 1: MUTATION PREDICTOR 1
    ◦ GENE 2 PATHOL. MUT. 2: MUTATION PREDICTOR 2
    ◦ GENE 3 PATHOL. MUT. 3: MUTATION PREDICTOR 3
    ◦ …: MUTATION PREDICTOR …
  • Improving mutation annotation tools
    Train in single genes (Ferrer-Costa et al., 2004): increase 5%-10% success rate
  • METHODS
  • MUTATION: property 1, property 2, property i, property j, property N; PATHOLOGICAL / NEUTRAL
  • DATAMINE pathological mutations; CHARACTERIZE PROTEIN DAMAGE; BUILD COMPUTATIONAL MODEL
    ◦ score
    ◦ prioritize
    Application:
    ◦ Experimental study
    ◦ Counseling
    ◦ Etc
  • Datamine Pathological Mutations
    General databases: UniProt, OMIM
    Specific databases:
    ◦ Fabry database (http://fabry-database.org/)
    ◦ p53 database (http://p53.free.fr/Database p53_database.html)
    Literature
    Institution mutation collections
  • Protein damage
    ◦ Protein stability
    ◦ Functional interactions
    ◦ Unspecific cellular interactions
    [Diagram labels: …KKRHCSGWL…; Y]
  • Conceptual context: impact of mutations on protein structure/function
    Empirical rules from site-directed mutagenesis ('80s, '90s):
    ◦ break disulphide bridges, burial of charged residues, hydrogen bond loss, disturb protein-protein interface, etc
    ◦ protein structure destabilization is associated with function loss
    Evolutionary conservation is linked to biological function
  • Mutation properties Sequence-based: V, , Blosum62 elements, etc Structure-based: relate to mutation location: accessibility, contact number and type, etc Evolutionary-based properties: ◦ wild-type (wt) conservation degree ◦ mutant rarity ◦ sequence variability at the mutation locus (entropy)
  • Multiple alignments Low similarity, only two sequences: AVTTGLNMWTTAKRPGMDDFYTILLPGLMNCI GLFTAIDMHFFGRKPACEEYFTLVVDGLCNCI Low similarity, multiple sequences: GIFTDIDMHFYVKKPGLDEFFTLVLRTLCMAA ALTTGIDMWTTAKRPDMDDYYTIIIPGLMNCI AVTTGLNMWTTAKRPGMDDFYTILLPGLMNCI GVTTGLNMYFTARRPGLDEFYTLVLRTLCMCL GIFTDIDMHFYVKKPGLDEFFTLVLRTLCMAA AVTTGLNMWTTAKRPGMDDFYTILLPGLMNCI GLFTALNMHFFGRKPACEEYFTLVVDGLCNCI
  • MSA: the technical side
    For very divergent proteins good MSA are very hard to obtain
    Protocol to build alignments:
    ◦ Recover family members with PsiBlast (E-value: 0.001; seq.id. >40%) UniRef100
    ◦ Align with MUSCLE
    Conservation may be misleading:
    ◦ protein function is highly relevant for living beings. E.g. histones. OK!
    ◦ database bias. E.g. only hominidae sequences are available. PROBLEM?!
  • Ourpredictor 7 properties: sequence-based ( V, , Blosum62), structure (relativeaccessibility), MSA- based (entropy, pssm(wt), pssm(mt)) Neural networks (Wekapackage) ◦ Multilayerpercetron (1 hiddenlayer-4 units) ◦ No hiddenlayer Training: 2-fold cross- validationscheme (25 replicas to
  • Performance measure: Matthews correlation coefficient
    MCC = (tp·tn − fp·fn) / [(tp+fp)(tp+fn)(tn+fp)(tn+fn)]^(1/2)
    −1 ≤ MCC ≤ 1
    ◦ 0: predictive power similar to random
    ◦ 1: perfect prediction power
    ◦ −1: bad prediction, small samples, the problem cannot be solved?
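The MCC formula on the slide translates directly into code. The sketch below is a plain transcription of that formula; returning 0.0 when the denominator vanishes is a common convention but an assumption here, and the counts in the examples are hypothetical.

```python
import math


def matthews_corrcoef(tp, tn, fp, fn):
    """Matthews correlation coefficient from confusion-matrix counts."""
    numerator = tp * tn - fp * fn
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # Assumed convention: undefined cases (empty row or column) map to 0.0
    return numerator / denominator if denominator else 0.0


# Hypothetical counts illustrating the three reference values on the slide
perfect = matthews_corrcoef(tp=50, tn=50, fp=0, fn=0)     # 1.0
chance = matthews_corrcoef(tp=25, tn=25, fp=25, fn=25)    # 0.0
inverted = matthews_corrcoef(tp=0, tn=0, fp=50, fn=50)    # -1.0
print(perfect, chance, inverted)
```

Unlike the raw success rate, the MCC stays informative when the two classes are unbalanced, which matters for datasets with many more pathological than neutral mutations.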
  • RESULTS
  • Pathogenicity prediction in Fabry disease
    Mutation dataset: 313 pathological and 59 neutral mutations
    Discriminant power of parameters
    Performance
  • Amino acid volume
  • Residue conservation
  • Performance
  • ROC curves
  • Performance (Success rate)
  • Performance (MCC)
  • GENE-SPECIFIC PREDICTORS: α-galactosidase PREDICTOR; MYH7 PREDICTOR; … PREDICTOR
  • MYH7 (Beta-cardiac myosin heavy chain)
    Large structural protein (1390 aa)
    Mutations cause familial hypertrophic cardiomyopathy 1
    Mutation dataset obtained from UniProt, OMIM and CardioGenomics
    ◦ 74 disease-causing mutations
    ◦ 45 neutral mutations (MSA)
  • Performance (Success rate)
  • Performance (MCC)
  • Gene-specific performances
    Qtot = (tp+tn)/(tp+tn+fp+fn)
    Sensitivity = tp/(tp+fn) (pathological)
    Specificity = tn/(tn+fp) (neutral)
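The three measures on this slide can be computed from the same four counts. This is a minimal sketch, assuming pathological mutations are the positive class (so sensitivity refers to pathological and specificity to neutral mutations); the function name `performance` and the counts are hypothetical.

```python
def performance(tp, tn, fp, fn):
    """Overall success rate, sensitivity and specificity from raw counts."""
    total = tp + tn + fp + fn
    return {
        "q_tot": (tp + tn) / total,        # fraction of all mutations predicted correctly
        "sensitivity": tp / (tp + fn),     # pathological mutations recovered
        "specificity": tn / (tn + fp),     # neutral mutations recovered
    }


# Hypothetical counts for an unbalanced dataset
result = performance(tp=80, tn=15, fp=5, fn=20)
for name, value in result.items():
    print(f"{name}: {value:.3f}")
```

Reporting sensitivity and specificity next to Qtot shows whether a high overall success rate is driven mainly by the larger class.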
  • Future directions: pathogenicity prediction/analysis
    Extend to more genes:
    ◦ Enough mutation data
    ◦ Not enough mutation data
    Can we predict other disease phenotypes?
    ◦ First tests suggest a similar approach could work for severity
  • Summary
    There is room for improvement in mutation annotation tools
    We are developing a new, gene-based tool that improves on present methods
    Our method will work for large-scale scoring projects (exome) and for single-mutation analyses
  • WORKING TOGETHER
  • Towards a unique mutation damage report
    Standardize the description/reporting of mutation impact:
    ◦ Sequence-level
    ◦ Structure-level
    ◦ MSA
    ◦ Miscellaneous information
    Community effort
  • TRANSLATIONAL BIOINFORMATICS IN NEUROSCIENCES GROUP
    ◦ Neurovascular Disease, Neurosciences: Joan Montaner, Israel Fernández-Cadenas
    ◦ Nanomed.lysos.storage diseas., CIBBIM, Nanomedicine: M.Carmen Domínguez
    ◦ Immunology, Respiratory & Systemic Diseases: Ricardo Pujol, Mónica Martínez, Roger Colobran
    ◦ Neuromusc.& Mitoch.Pathol, Neurosciences: Elena García, Tomás Pinós
    ◦ Cancer and Iron group, IMPPC: Mayka Sánchez, Ricky Joshi
    ◦ Biomedicine & Translat. & Pediatrics Oncol., Oncology: Jaume Reventos, Eva Colás, Andreas Doll, Marina Rigau, Marta García