A short introduction to Centrosomal Variants
SVM-based prioritization of cancer-causing mutations in the centromere protein family
Centromere
• The centromere is the part of a chromosome that links sister chromatids.
• During anaphase of mitosis, the paired centromeres of each chromosome
begin to move apart as the daughter chromosomes migrate, centromere first,
toward opposite ends of the cell.
• It is the most condensed and constricted region of a chromosome.
• It serves as the point of attachment for spindle fibers.
• Deregulation of centromere activity leads to several checkpoint
disorders and pathologies.
Mutation-induced centromere dysfunction is
linked with several human diseases
• Bardet-Biedl syndrome
• Polycystic kidney disease
• Lissencephaly
• Primordial dwarfism
• Autosomal recessive primary microcephaly
• Cancer
A few important centromere protein
families
• CEP family proteins
• CENP family proteins
• MAD family proteins
• hSAS family proteins
• CEPTIN family proteins
CENP-E recruitment and activity are
mediated by several other protein
complexes
Proteins selected for evaluation
CENPA, CENPB, CENPC, CENPE, CENPF, CENPH, CENPI, CENPJ, CENPK, CENPL,
CENPM, CENPN, CENPO, CENPP, CENPQ, CENPR, CENPS, CENPT, CENPU, CENPV,
CENPW, CENPX, CENPY, CENPZ
A total of 823 structural variants from the CENP protein family were collected
for this study.
Machine Learning: What is it all about
1. Computers have great computational ability.
2. A learning algorithm learns from whatever data it is given, whether correct or not.
3. Training data must therefore not contain erroneous values.
4. To prevent the use of spurious data, the entire dataset must be validated and scaled
before training begins.
5. There are three broad methodologies in machine learning:
a. Supervised learning methods
b. Unsupervised learning methods
c. Reinforcement learning methods
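The validation and scaling step in point 4 can be sketched as follows. This is a minimal illustration only: the function names `validate` and `min_max_scale`, the target range [-1, 1] and the toy feature values are assumptions made for the example, not details taken from the study.

```python
import math


def validate(rows):
    """Raise ValueError if the table is empty, ragged, or holds a bad value."""
    if not rows:
        raise ValueError("dataset is empty")
    width = len(rows[0])
    for r, row in enumerate(rows):
        if len(row) != width:
            raise ValueError(f"row {r} has {len(row)} values, expected {width}")
        for value in row:
            # None, NaN and +/-inf are all treated as invalid entries.
            if value is None or not math.isfinite(value):
                raise ValueError(f"row {r} contains an invalid value: {value!r}")


def min_max_scale(rows, low=-1.0, high=1.0):
    """Linearly rescale every column to [low, high]; constant columns map to low."""
    columns = list(zip(*rows))
    mins = [min(col) for col in columns]
    maxs = [max(col) for col in columns]
    scaled = []
    for row in rows:
        new_row = []
        for value, col_min, col_max in zip(row, mins, maxs):
            span = col_max - col_min
            if span == 0:
                new_row.append(low)
            else:
                new_row.append(low + (high - low) * (value - col_min) / span)
        scaled.append(new_row)
    return scaled


# Hypothetical two-feature table (values invented for illustration).
features = [[0.2, 10.0], [0.8, 30.0], [0.5, 20.0]]
validate(features)
scaled = min_max_scale(features)
print(scaled)
```

Scaling every feature to the same range keeps a feature with large raw values from dominating the distance computation inside a kernel.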
• Supervised learning is the machine learning task of inferring a function from
labeled training data.
• A supervised learning algorithm analyzes the training data and produces an inferred
function.
• The parallel task in human and animal psychology is often referred to as concept learning.
• A few widely used supervised learning algorithms are:
1. Support vector machines
2. Bayesian statistics
3. Artificial neural networks
4. Random forests
5. Regression analysis
Support Vector Machines
• A support vector machine (SVM) is a set of related supervised learning methods that
analyze data and recognize patterns, used for classification and regression analysis.
• Given a set of training examples, each marked as belonging to one of two categories, an
SVM training algorithm builds a model that assigns new examples to one category or
the other.
• More formally, a support vector machine constructs a hyperplane or set of hyperplanes in a
high- or infinite-dimensional space, which can be used for classification.
• Consider training data $\mathcal{D}$ of $n$ labeled points:
$\mathcal{D} = \{(x_i, y_i) \mid x_i \in \mathbb{R}^p,\ y_i \in \{1, -1\}\}_{i=1}^{n}$
• For training we used the radial basis function (RBF) kernel for greater accuracy:
$K(x_i, x_j) = \exp(-\gamma \lVert x_i - x_j \rVert^2)$, $\gamma > 0$.
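The RBF kernel formula above can be written out directly. The sketch below is illustrative only: the helper names `rbf_kernel` and `gram_matrix`, the value of gamma and the sample points are assumptions, not values from the study.

```python
import math


def rbf_kernel(x_i, x_j, gamma):
    """K(x_i, x_j) = exp(-gamma * ||x_i - x_j||^2), defined for gamma > 0."""
    if gamma <= 0:
        raise ValueError("gamma must be positive")
    if len(x_i) != len(x_j):
        raise ValueError("vectors must have the same length")
    squared_distance = sum((a - b) ** 2 for a, b in zip(x_i, x_j))
    return math.exp(-gamma * squared_distance)


def gram_matrix(points, gamma):
    """Pairwise kernel values for a list of feature vectors."""
    return [[rbf_kernel(p, q, gamma) for q in points] for p in points]


# Hypothetical feature vectors in R^2.
points = [[0.0, 0.0], [1.0, 1.0], [3.0, 4.0]]
K = gram_matrix(points, gamma=0.5)
for row in K:
    print([round(value, 4) for value in row])
```

The kernel equals 1 for identical points and decays toward 0 as points move apart. A larger gamma makes that decay faster, so each training example influences a smaller neighbourhood.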
Objective
To identify cancer-associated nsSNPs in the CENP
protein family using a support vector machine
approach
Methodology
1. Examination of protocol
2. Application of protocol to collect datasets for training the machine
3. Designing a Support Vector Machine classifier system using machine
learning algorithm
4. Application of designed classifier to identify the cancer associated
mutations in CENP family proteins.
5. Studying the dynamic behaviour of cancer associated structural variants
Examination of protocol was carried out on CENPE protein
➔Centromere-associated protein-E (CENPE), a protein with 2701 amino acids and a relative
molecular weight of 312 kDa, is highly expressed in mitosis and accumulates in the cell just
prior to mitosis.
➔It is required for efficient, stable microtubule capture at kinetochores.
➔It plays an essential role in integrating the mechanics of microtubule-chromosome
interactions with mitotic checkpoint signaling, and has emerged as a novel target for cancer
therapy.
➔It contains an ATP-sensitive motor-like domain at its N-terminus that is actively involved in
hydrolyzing ATP to produce directed mechanical force along microtubules.
➔Absence of CENPE reduces tension at bi-oriented chromosomes, resulting in
misaligned chromosomes at the metaphase plate and leading to metaphase arrest.
➔CENPE expression was also found to be reduced in human HCC tissue, and lower
expression of CENPE was found to induce aneuploidy in LO2 cells.
Prediction of oncogenic mutants in CENPE using SNP prediction tools
• We first collected 100 nsSNPs reported in the CENPE coding gene from the NCBI dbSNP database.
• SIFT, PolyPhen, PhD-SNP, PMut, CanPredict and Dr. Cancer were used to identify the
cancer-associated SNPs in the available dataset.
• Using these tools, we found Y63H to be highly deleterious and cancer associated.
• To analyse the structural consequences of this mutation, we carried out molecular
dynamics simulations of the native and mutant CENPE motor domain on a 5 ns timescale.
• In silico X-ray scattering was carried out throughout the simulation to observe the
change in ionic density in the native and mutant structures.
• Root mean square deviation was then plotted to analyze the relative fluctuation of the
structures.
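As a reminder of what the root mean square deviation measures, here is a minimal sketch. It assumes the two coordinate sets list the same atoms in the same order and have already been superimposed. The function name `rmsd` and the coordinates are hypothetical, and this is not the analysis tool used in the study.

```python
import math


def rmsd(reference, frame):
    """Root mean square deviation between two equally sized sets of 3D points.

    Assumes both sets are already aligned (no fitting is performed here).
    """
    if not reference or len(reference) != len(frame):
        raise ValueError("coordinate sets must be non-empty and of equal size")
    total = 0.0
    for (x1, y1, z1), (x2, y2, z2) in zip(reference, frame):
        total += (x1 - x2) ** 2 + (y1 - y2) ** 2 + (z1 - z2) ** 2
    return math.sqrt(total / len(reference))


# Hypothetical coordinates: every atom in the frame is shifted by 1 unit along x.
reference = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 2.0, 0.0)]
frame = [(1.0, 0.0, 0.0), (2.0, 0.0, 0.0), (1.0, 2.0, 0.0)]
print(rmsd(reference, frame))
```

Computing this value for every frame of a trajectory against the starting structure gives the kind of RMSD-versus-time curve described above.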
Molecular blueprint of structural variation in CENPE motor
domain: Inside body environment
[Figure: panels labelled Native and Mutant]
Root Mean Square Fluctuation
[Figure: panels labelled Native and Mutant; x-axis: Time (ps)]
Calculation of R208K CENPE-ATP association constant
According to Debye-Hückel theory, the reaction rate constant $k$ scales with the
electrostatic interaction energy $U$:
$k \propto -U$
$\frac{k_{\text{native}}}{k_{\text{mutant}}} = \frac{U_{\text{native}}}{U_{\text{mutant}}}$
$\frac{134.6}{k_{\text{mutant}}} = \frac{-13.42}{-3.06}$
$k_{\text{mutant}} = \frac{134.6 \times 3.06}{13.42} = 30.69$
CENPE(native) + ATP -> CENPE(native)-ATP complex; $k = 134.6$
CENPE(mutant) + ATP -> CENPE(mutant)-ATP complex; $k = 30.69$
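The arithmetic behind the mutant value can be checked with a few lines of Python. This sketch only restates the proportionality used on this slide. The function name `scaled_rate_constant` is hypothetical, and the numbers are the ones quoted above, taken as given with no units assumed.

```python
def scaled_rate_constant(k_native, u_native, u_mutant):
    """Solve k_native / k_mutant = U_native / U_mutant for k_mutant."""
    if u_native == 0:
        raise ValueError("native interaction energy must be non-zero")
    return k_native * u_mutant / u_native


# Values quoted on the slide.
k_mutant = scaled_rate_constant(k_native=134.6, u_native=-13.42, u_mutant=-3.06)
print(f"{k_mutant:.2f}")  # prints 30.69
```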
[Figure: panels labelled Native and Mutant showing CENPE-ATP and CENPE-ADP; x-axis: Time (seconds)]
Tools used to collect SNP training data
1. SIFT, PolyPhen, PhD-SNP, PMut, CanPredict and Dr. Cancer were used to collect the SNP
datasets.
2. Cancer variant data were obtained from the SwissVar database.
3. Neutral variants were randomly picked from the Swiss-Prot database.
4. Scaling, training and model generation were carried out using the support vector machine algorithm.
5. The RBF kernel was used to generate the classifier model.
6. Rescaling and cross-validation were repeated with different values of $C$ and $\gamma$ until the maximum
accuracy was obtained.
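Steps 4 to 6 can be sketched with scikit-learn. The slides do not say which SVM software was used, so the library choice is an assumption made for illustration. The data below are synthetic stand-ins generated with `make_classification`, not the study's variants, and the grid of $C$ and $\gamma$ values is likewise hypothetical.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV, StratifiedKFold
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import MinMaxScaler
from sklearn.svm import SVC

# Synthetic placeholder data: 200 samples with 6 numeric features.
X, y = make_classification(
    n_samples=200, n_features=6, n_informative=4, n_redundant=0, random_state=0
)
y = 2 * y - 1  # relabel {0, 1} as {-1, 1} to match the notation used earlier

# Scaling is part of the pipeline, so it is refitted on each training fold.
pipeline = Pipeline(
    [
        ("scale", MinMaxScaler(feature_range=(-1, 1))),
        ("svm", SVC(kernel="rbf")),
    ]
)

# Coarse exponential grid over C and gamma (illustrative values).
param_grid = {
    "svm__C": [2.0**e for e in range(-5, 16, 4)],
    "svm__gamma": [2.0**e for e in range(-15, 4, 4)],
}

search = GridSearchCV(
    pipeline,
    param_grid,
    scoring="accuracy",
    cv=StratifiedKFold(n_splits=5, shuffle=True, random_state=0),
)
search.fit(X, y)

print("best parameters:", search.best_params_)
print("cross-validated accuracy:", round(search.best_score_, 3))
```

The search keeps the $(C, \gamma)$ pair with the highest mean cross-validated accuracy, which mirrors the "change the values until the maximum accuracy is obtained" loop in step 6.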
Model designed for neutral variants
Model designed for 100 Neutral and Cancer variants
Acknowledgement
J. Febin Prabhudass
Asst. Prof. (Senior)
School of Biosciences and Technology
VIT University