July 17th 2014
Vijayaraj Nagarajan PhD
Computational Molecular Biology Specialist
BCBB/OCICB/NIAID/NIH
§  The function prediction problem
§  Methods and approaches
§  MSA case study
§  M46 : a different direction
§  How good are the predictions…?
Outline
http://www.clker.com/clipart-24887.html
“Unknown function”
3951259
6
7754904
“Unknown function”
Databases
Tool
Boxes
Which tool to use..?
Problem or opportunity….?
Tool
Boxes
Databases
ü The function prediction problem
§  Methods and approaches
§  MSA case study
§  M46 : a different direction
§  How good are the predictions…?
§  Based on…..
•  Physicochemical properties
•  MW, pI, Amino acid composition etc.
•  Sequence similarity
•  Primary sequence, Patterns, Domains, Motifs and Profiles
•  Structure similarity
•  CATH, SCOP, PDB, fold libraries
•  Functional properties
•  Catalytic activity, Post-translational modifications, Binding activity
Function Prediction
•  Gene expression data
•  GEO, ArrayExpress, Microarray Meta-Miner (MMM)
•  Biomolecular interaction information
•  Interaction partners
•  Epigenetics (methylation, histone modifications etc.)
•  Associations from genome scale data – NGS
•  Phylogeny
•  Sequence and structure similarity
•  Gene Ontology
•  Semantic distance
•  Text mining
•  Associations from literature
Function prediction workflow
ü The function prediction problem
ü Methods and approaches
§  MSA case study
§  M46 : a different direction
§  How good are the predictions…?
Environment…?
agr
cap
hlgA
etc.,
sarA
nuc
aur
ssp
etc.,
SA1233
Sambanthamoorthy et al., Microbiology (152) 2006
Nagarajan et al., BMC Bioinformatics. 8 (Suppl 7):S5 2007
msa
Msa – a case study
http://blast.ncbi.nlm.nih.gov/Blast.cgi
BLASTp
BLASTn
PSI-BLAST Iteration 1, PSI-BLAST Iteration 2, PSI-BLAST Iteration 3
PSI-BLAST Hits
List of Observations
First 40 – pattern..?
Transporter/efflux/membrane..?
PSI-BLAST Hits
PSI BLAST Iteration 3 with PAM70
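The three-iteration search with PAM70 could be reproduced offline along the following lines. This is a minimal sketch that only assembles a command line for the NCBI BLAST+ `psiblast` program and does not run it. The file names `msa.fasta` and `psiblast_hits.tsv`, the local database name `nr`, and the helper `psiblast_command` are assumptions made for illustration; the slides used the NCBI web interface.

```python
import shlex


def psiblast_command(query_fasta, database, iterations=3, matrix="BLOSUM62",
                     evalue=10.0, out_path="psiblast_hits.tsv"):
    """Build an argument list for BLAST+ `psiblast` (nothing is executed here)."""
    return [
        "psiblast",
        "-query", query_fasta,              # protein query in FASTA format
        "-db", database,                    # locally installed protein database
        "-num_iterations", str(iterations),
        "-matrix", matrix,                  # e.g. PAM70 for short or divergent hits
        "-evalue", str(evalue),             # permissive cutoff, to keep weak hits visible
        "-outfmt", "6",                     # tabular output
        "-out", out_path,
    ]


# Hypothetical file and database names.
cmd = psiblast_command("msa.fasta", "nr", iterations=3, matrix="PAM70")
print(shlex.join(cmd))
```

The list form can be passed to `subprocess.run(cmd, check=True)` on a machine where BLAST+ and the database are installed.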
BLAST against PDB Database
http://www.ncbi.nlm.nih.gov/Structure/cdd/wrpsb.cgi
NCBI Conserved Domain Database (CDD)
List of Observations
First 40 – pattern.. (DUF?)
Transporter/efflux/membrane
§  Physicochemical properties
§  Amino acid scale representation
•  GRAVY (Grand Average of Hydropathy)
–  Msa: 1.021 (Hydrophobic)
Lasergene
List of Observations
First 40 – pattern.. (DUF?)
Transporter/efflux/membrane..?
Hydrophobic..?
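The GRAVY score behind the "Hydrophobic" observation is the mean Kyte-Doolittle hydropathy over all residues; positive values indicate a hydrophobic protein. A minimal sketch follows. The slides used Lasergene for this step, so the function `gravy` and the toy peptide are illustrative assumptions, not the original workflow.

```python
# Kyte-Doolittle hydropathy index for the 20 standard amino acids.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def gravy(sequence):
    """Grand average of hydropathy: sum of residue hydropathies / length.

    Characters outside the 20 standard residues (gaps, X, whitespace)
    are skipped rather than scored.
    """
    residues = [aa for aa in sequence.upper() if aa in KYTE_DOOLITTLE]
    if not residues:
        raise ValueError("no standard amino acids in sequence")
    return sum(KYTE_DOOLITTLE[aa] for aa in residues) / len(residues)


# Toy peptide (hypothetical, not the Msa sequence).
score = gravy("MKLLIVAG")
print(f"GRAVY = {score:.3f} ({'hydrophobic' if score > 0 else 'hydrophilic'})")
```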
The next step…?
§  Solubility and localization
–  ProtFun – envelope, non-enzyme
–  SVMProt – transmembrane
–  CELLO – membrane
–  PSORT – membrane
–  ProtCompB – membrane
–  PSLPred – membrane
§  Msa – insoluble, non-enzyme, membrane
List of Observations
First 40 – pattern.. (DUF?)
Transporter/efflux/membrane..?
Hydrophobic/insoluble/non-enzyme
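The localization call rests on agreement between independent predictors. One simple way to summarize such results is a majority vote over the labels each tool reported. The sketch below uses the labels listed on the slide as-is; the helper `consensus` is hypothetical, and treating "envelope" and "transmembrane" as distinct from "membrane" is a deliberately conservative assumption.

```python
from collections import Counter

# Localization label reported by each tool, as listed on the slide.
predictions = {
    "ProtFun": "envelope",
    "SVMProt": "transmembrane",
    "CELLO": "membrane",
    "PSORT": "membrane",
    "ProtCompB": "membrane",
    "PSLPred": "membrane",
}


def consensus(calls):
    """Return (most frequent label, votes for it, number of tools)."""
    label, votes = Counter(calls.values()).most_common(1)[0]
    return label, votes, len(calls)


label, votes, total = consensus(predictions)
print(f"{label}: {votes}/{total} tools")
```

Counting raw labels understates the agreement here, since the two remaining labels also point to the cell envelope.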
Signals/Domains/Patterns/Motifs
SP – Signal Peptide
CP – Cleavage Position
If membrane, what next…?
More about membrane
TMHMM
DAS
Membrane topology
+1
0
+5
-1
Von Heijne, J Mol Biol. 225(2): 487-494. 1992
List of Observations
First 40 – pattern.. (DUF?)
Transporter/efflux/membrane..?
Hydrophobic/insoluble/non-enzyme
Signal peptide 1-20
TM1, TM2, TM3 – N-inside ✔
N
N
N
N
C
C
C
C
Positive-Inside rule and
Charge bias
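The positive-inside rule says that loops rich in Lys and Arg tend to stay on the cytoplasmic side of the membrane, so comparing the positive charge on alternating loops suggests an orientation. A minimal sketch, assuming the transmembrane segment boundaries are already known (for example from TMHMM or DAS). The function names, the toy sequence and the segment coordinates are hypothetical; only K and R are counted, and a tie is arbitrarily reported as N-inside.

```python
def loops(seq_len, tm_segments):
    """Return loop ranges (1-based, inclusive) flanking the TM segments."""
    result, start = [], 1
    for tm_start, tm_end in sorted(tm_segments):
        result.append((start, tm_start - 1))
        start = tm_end + 1
    result.append((start, seq_len))
    return result


def positive_count(sequence, start, end):
    """Number of Lys/Arg residues between 1-based positions start..end."""
    return sum(sequence[i] in "KR" for i in range(start - 1, end))


def predict_n_terminus(sequence, tm_segments):
    """Apply the positive-inside rule to alternating loops.

    Loops 0, 2, 4, ... lie on the same side as the N-terminus;
    loops 1, 3, ... lie on the opposite side.
    """
    counts = [positive_count(sequence, s, e)
              for s, e in loops(len(sequence), tm_segments)]
    same_side_as_n = sum(counts[0::2])
    opposite_side = sum(counts[1::2])
    orientation = "N-inside" if same_side_as_n >= opposite_side else "N-outside"
    return orientation, counts


# Toy protein: charged termini, two hydrophobic stretches, acidic middle loop.
toy = "MKKR" + "L" * 20 + "DE" + "A" * 20 + "RKRK"
print(predict_n_terminus(toy, [(5, 24), (27, 46)]))
```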
http://pbil.ibcp.fr/htm/index.php
List of Observations
First 40 – pattern.. (DUF?)
Transporter/efflux/membrane..?
Hydrophobic/insoluble/non-enzyme
Signal peptide 1-20
TM1, TM2, TM3 – N-inside ✔
Three Transmembrane HELIX
Network Protein Sequence
Analysis
http://smart.embl-heidelberg.de/
PreATP-grasp domain in Msa.
§  Structural Classification of Proteins (SCOP) entry: d1gsa 1
§  Usually precedes the ATP-grasp domain and could contain a substrate-binding function.
§  Located between the 85th and 116th residue.
§  Interestingly this location is predicted to be in a cytoplasmic loop region of Msa
List of Observations
First 40 – pattern.. (DUF?)
Transporter/efflux/membrane..?
Hydrophobic/insoluble/non-enzyme
Signal peptide 1-20
TM1, TM2, TM3 – N-inside ✔
Three Transmembrane HELIX
PreATP-grasp domain
Functional sites
http://www.ebi.ac.uk/intact/site/index.jsf
Any interactions…?
List of Observations
First 40 – pattern.. (DUF?)
Transporter/efflux/membrane..?
Hydrophobic/insoluble/non-enzyme
Signal peptide 1-20
TM1, TM2, TM3 – N-inside ✔
Three Transmembrane HELIX
PreATP-grasp domain
Responds to environment…?
http://www.bioinformatics.org/sammd/
List of Observations
First 40 – pattern.. (DUF?)
Transporter/efflux/membrane..?
Hydrophobic/insoluble/non-enzyme
Signal peptide 1-20
TM1, TM2, TM3 – N-inside ✔
Three Transmembrane HELIX
PreATP-grasp domain
Responds to environment ✔
Gene expression
http://www.ebi.ac.uk/interpro/
InterProScan
List of Observations
First 40 – pattern.. (DUF?)
Transporter/efflux/membrane..?
Hydrophobic/insoluble/non-enzyme
Signal peptide 1-20
TM1, TM2, TM3 – N-inside ✔
Three Transmembrane HELIX
PreATP-grasp domain
Responds to environment ✔
Phosphorylation sites 48, 49, 99
Patterns
http://www.ihop-net.org/UniPub/iHOP/
Literature mining
•  Homology Modeling
•  No homologous structures in PDB
•  FOLD recognition
•  Phyre
Structure based predictions
§  Swiss-Pdb Viewer
–  Energy minimization
–  PHI/PSI angle
–  Loop
§  Structure validation
–  Verify3D
Model of Msa
Ramachandran plot for the predicted tertiary structure of the
Msa protein pre (A) and post (B) refinement
Quality of the model
List of Observations
First 40 – pattern.. (DUF?)
Transporter/efflux/membrane..?
Hydrophobic/insoluble/non-enzyme
Signal peptide 1-20
TM1, TM2, TM3 – N-inside ✔
Three Transmembrane HELIX
PreATP-grasp domain
Responds to environment ✔
Phosphorylation sites 48, 49, 99
Predicted 3D structure of Msa
Binding site predictions for the Msa protein. (A) ProFunc
predicted binding site (red); (B) PINUP predicted binding
site (interface in green); (C) Q-SiteFinder predicted binding
site and binding residues (pink)
List of Observations
First 40 – pattern.. (DUF?)
Transporter/efflux/membrane..?
Hydrophobic/insoluble/non-enzyme
Signal peptide 1-20
TM1, TM2, TM3 – N-inside ✔
Three Transmembrane HELIX
PreATP-grasp domain
Responds to environment ✔
Phosphorylation sites 48, 49, 99
Binding sites - cytoplasmic region?
Binding sites
ProFunc
–  “nest” near the putative phosphorylation site (47-50)
–  47-50; predicted outside membrane
–  All residues conserved at the “nest”
–  “nest” shows features of anion-binding site
–  “nest” characteristic functional motifs in ATP or GTP binding proteins
List of Observations
First 40 – pattern.. (DUF?)
Transporter/efflux/membrane..?
Hydrophobic/insoluble/non-enzyme
Signal peptide 1-20
TM1, TM2, TM3 – N-inside ✔
Three Transmembrane HELIX
PreATP-grasp domain
Responds to environment ✔
Phosphorylation sites 48, 49, 99
Binding sites - cytoplasmic region?
Function site (“nest”) – outside..?
Predicted “nest”
§  Multiple sequence alignment (ClustalW) of Msa protein
sequence from 11 different strains
–  12 variations in strain RF122
§  1 replacement, 11 substitutions
•  1 substitution in pre-ATP grasp domain
–  7 variations in strain MRSA252
§  2 replacements, 5 substitutions
•  1 replacement in pre-ATP grasp domain
• Replacement
• Hydrophilic -> Hydrophobic
• Ser -> Gly
• Substitution
• Hydrophilic -> Hydrophilic
• Ser -> Glu
Function motifs: conserved…?
–  Variation at aa positions 111, 131, 133 common
–  None in Phosphorylation sites, signal peptide, “nest”
List of Observations
First 40 – pattern.. (DUF?)
Transporter/efflux/membrane..?
Hydrophobic/insoluble/non-enzyme
Signal peptide 1-20
TM1, TM2, TM3 – N-inside ✔
Three Transmembrane HELIX
PreATP-grasp domain
Responds to environment ✔
Phosphorylation sites 48, 49, 99
Binding sites - cytoplasmic region?
Function site (“nest”) – outside..?
Functional sites highly conserved..?
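The conservation check amounts to listing the positions where a strain differs from the reference in the alignment and asking whether any of them fall inside a predicted feature. A minimal sketch, using the feature coordinates reported on the slides (signal peptide 1-20, "nest" 47-50, phosphorylation sites 48, 49 and 99, pre-ATP-grasp domain 85-116). The helper names and the toy aligned pair are hypothetical; the real comparison used a ClustalW alignment of 11 strains.

```python
# Predicted features of Msa, with 1-based residue positions from the slides.
FEATURES = {
    "signal peptide": range(1, 21),
    "nest": range(47, 51),
    "phosphorylation sites": (48, 49, 99),
    "pre-ATP-grasp domain": range(85, 117),
}


def variant_positions(reference, other):
    """1-based reference positions where two aligned sequences differ."""
    if len(reference) != len(other):
        raise ValueError("aligned sequences must have equal length")
    positions, ref_pos = [], 0
    for ref_aa, other_aa in zip(reference, other):
        if ref_aa == "-":
            continue  # insertion relative to the reference: no reference coordinate
        ref_pos += 1
        if other_aa != ref_aa:
            positions.append(ref_pos)
    return positions


def features_hit(positions):
    """Map each feature name to the variant positions that fall inside it."""
    hits = {}
    for name, span in FEATURES.items():
        inside = [p for p in positions if p in span]
        if inside:
            hits[name] = inside
    return hits


# Toy aligned pair (hypothetical), then the common variant positions from the slide.
print(variant_positions("MKT-A", "MRTGA"))
print(features_hit([111, 131, 133]))
```

With the three commonly varying positions from the slide, only position 111 falls inside a listed feature (the pre-ATP-grasp domain); none touch the signal peptide, the phosphorylation sites or the "nest".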
List of Observations
First 40 – pattern.. (DUF?)
Transporter/efflux/membrane..?
Hydrophobic/insoluble/non-enzyme
Signal peptide 1-20
TM1, TM2, TM3 – N-inside ✔
Three Transmembrane HELIX
PreATP-grasp domain
Responds to environment ✔
Phosphorylation sites 48, 49, 99
Binding sites - cytoplasmic region?
Function site (“nest”) – outside..?
Functional sites highly conserved..?
Msa – a putative signal transducer
msa and sarA transcriptome
msa transcriptome - sammd
Msa – regulatory network
ü The function prediction problem
ü Methods and approaches
ü MSA case study
§  M46 : a different direction
§  How good are the predictions…?
Any questions…?
§  Mold specific
•  Histoplasma capsulatum
•  Only in mold not in yeast
M46 – a different direction
M46 – PSI-BLAST
§  Nucleotide binding
–  DNA/RNA
–  S-S bonds
§  Look for motifs
–  Predict motifs, build HMM, search for similar
§  Localization
–  Secreted, ER signal, ER modifications
M46 – the clues
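"Look for motifs" can start with something as simple as scanning the sequence for a known PROSITE-style pattern before building an HMM. A minimal sketch, using the generic N-glycosylation sequon N-{P}-[ST]-{P} purely as an example of an ER modification signal. It is an assumption chosen for illustration, not a motif reported for M46, and the toy sequence and helper names are hypothetical. The converter handles only simple patterns without anchors.

```python
import re


def prosite_to_regex(pattern):
    """Translate a simple PROSITE pattern (no < or > anchors) into a regex."""
    tokens = []
    for element in pattern.rstrip(".").split("-"):
        match = re.fullmatch(r"(.+?)(?:\((\d+)(?:,(\d+))?\))?", element)
        core, low, high = match.groups()
        if core == "x":
            token = "."                          # any residue
        elif core.startswith("{"):
            token = "[^" + core[1:-1] + "]"      # excluded residues
        else:
            token = core                         # literal residue or [ABC] class
        if low:
            token += "{" + low + ("," + high if high else "") + "}"
        tokens.append(token)
    return "".join(tokens)


def scan(sequence, pattern):
    """Return (1-based start, matched text) for every hit, overlaps included."""
    regex = re.compile("(?=(" + prosite_to_regex(pattern) + "))")
    return [(m.start() + 1, m.group(1)) for m in regex.finditer(sequence)]


# Toy sequence (hypothetical, not M46).
print(scan("MKNGSAANPTLNVTQ", "N-{P}-[ST]-{P}"))
```

A lookahead is used so that overlapping matches are all reported, which a plain `finditer` over the pattern would miss.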
ü The function prediction problem
ü Methods and approaches
ü MSA case study
ü M46 : a different direction
§  How good are the
predictions…?
•  Only good if it would make any biological sense
•  Only good if it could be supported by follow up experimental evidence
•  No hits
•  Reduce threshold - expect worst evalues-pvalues
•  Cut-off value paradox (if <0.05 is significant, what about 0.051…?)
•  It is OK to look at hits with poor evalue-pvalue
•  Cannot assign homology, can pick clues (your only hope)
How good are the predictions…?
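The cut-off paradox is easier to see once the E-value is written out. Under the standard Karlin-Altschul relation, E = m × n × 2^(−S′), where S′ is the bit score, m the query length and n the total database length, so the same alignment gets a worse E-value simply because the database is larger. A minimal sketch under that assumption, ignoring the edge-effect corrections BLAST applies to m and n. The bit score, query length and database sizes below are hypothetical.

```python
def expect_value(bit_score, query_length, database_length):
    """Expected number of chance hits scoring at least `bit_score`.

    Uses E = m * n * 2**(-S') with raw lengths, i.e. without the
    effective-length corrections applied by BLAST itself.
    """
    return query_length * database_length * 2.0 ** (-bit_score)


# Same hit (30 bits, 130-residue query) against databases of different size.
for residues_in_db in (10**6, 10**8, 10**9):
    e = expect_value(30, 130, residues_in_db)
    print(f"database of {residues_in_db:.0e} residues: E = {e:.3g}")
```

Because E scales linearly with database size, a hit just above or below a fixed threshold says little by itself; it is a clue to follow up, not evidence of homology.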
nagarajanv@mail.nih.gov
ScienceApps@niaid.nih.gov
http://bioinformatics.niaid.nih.gov
Questions…?

filosofia boliviana introducción jsjdjd.pptxfilosofia boliviana introducción jsjdjd.pptx
filosofia boliviana introducción jsjdjd.pptx
 
insect taxonomy importance systematics and classification
insect taxonomy importance systematics and classificationinsect taxonomy importance systematics and classification
insect taxonomy importance systematics and classification
 
Citrus Greening Disease and its Management
Citrus Greening Disease and its ManagementCitrus Greening Disease and its Management
Citrus Greening Disease and its Management
 
Orion Air Quality Monitoring Systems - CWS
Orion Air Quality Monitoring Systems - CWSOrion Air Quality Monitoring Systems - CWS
Orion Air Quality Monitoring Systems - CWS
 
platelets- lifespan -Clot retraction-disorders.pptx
platelets- lifespan -Clot retraction-disorders.pptxplatelets- lifespan -Clot retraction-disorders.pptx
platelets- lifespan -Clot retraction-disorders.pptx
 
Leaf Initiation, Growth and Differentiation.pdf
Leaf Initiation, Growth and Differentiation.pdfLeaf Initiation, Growth and Differentiation.pdf
Leaf Initiation, Growth and Differentiation.pdf
 
general properties of oerganologametal.ppt
general properties of oerganologametal.pptgeneral properties of oerganologametal.ppt
general properties of oerganologametal.ppt
 
SCHIZOPHRENIA Disorder/ Brain Disorder.pdf
SCHIZOPHRENIA Disorder/ Brain Disorder.pdfSCHIZOPHRENIA Disorder/ Brain Disorder.pdf
SCHIZOPHRENIA Disorder/ Brain Disorder.pdf
 
Unveiling the Energy Potential of Marshmallow Deposits.pdf
Unveiling the Energy Potential of Marshmallow Deposits.pdfUnveiling the Energy Potential of Marshmallow Deposits.pdf
Unveiling the Energy Potential of Marshmallow Deposits.pdf
 
(May 29th, 2024) Advancements in Intravital Microscopy- Insights for Preclini...
(May 29th, 2024) Advancements in Intravital Microscopy- Insights for Preclini...(May 29th, 2024) Advancements in Intravital Microscopy- Insights for Preclini...
(May 29th, 2024) Advancements in Intravital Microscopy- Insights for Preclini...
 
Hemostasis_importance& clinical significance.pptx
Hemostasis_importance& clinical significance.pptxHemostasis_importance& clinical significance.pptx
Hemostasis_importance& clinical significance.pptx
 
GBSN- Microbiology (Lab 3) Gram Staining
GBSN- Microbiology (Lab 3) Gram StainingGBSN- Microbiology (Lab 3) Gram Staining
GBSN- Microbiology (Lab 3) Gram Staining
 
In silico drugs analogue design: novobiocin analogues.pptx
In silico drugs analogue design: novobiocin analogues.pptxIn silico drugs analogue design: novobiocin analogues.pptx
In silico drugs analogue design: novobiocin analogues.pptx
 
Nucleic Acid-its structural and functional complexity.
Nucleic Acid-its structural and functional complexity.Nucleic Acid-its structural and functional complexity.
Nucleic Acid-its structural and functional complexity.
 
The ASGCT Annual Meeting was packed with exciting progress in the field advan...
The ASGCT Annual Meeting was packed with exciting progress in the field advan...The ASGCT Annual Meeting was packed with exciting progress in the field advan...
The ASGCT Annual Meeting was packed with exciting progress in the field advan...
 
Richard's entangled aventures in wonderland
Richard's entangled aventures in wonderlandRichard's entangled aventures in wonderland
Richard's entangled aventures in wonderland
 

Protein function prediction

  • 1. July 17th 2014 Vijayaraj Nagarajan PhD Computational Molecular Biology Specialist BCBB/OCICB/NIAID/NIH
  • 4. Outline: The function prediction problem; Methods and approaches; MSA case study; M46: a different direction; How good are the predictions…?
  • 9. ✔ The function prediction problem; Methods and approaches; MSA case study; M46: a different direction; How good are the predictions…?
  • 10. Function Prediction, based on: physicochemical properties (MW, pI, amino acid composition, etc.); sequence similarity (primary sequence, patterns, domains, motifs and profiles); structure similarity (CATH, SCOP, PDB, fold libraries); functional properties (catalytic activity, post-translational modifications, binding activity)
  • 11. Gene expression data (GEO, ArrayExpress, Microarray Meta-Miner (MMM)); biomolecular interaction information (interaction partners); epigenetics: methylation, histone modifications, etc. (associations from genome-scale data, NGS); phylogeny (sequence and structure similarity); Gene Ontology (semantic distance); text mining (associations from literature)
  • 13. ✔ The function prediction problem; ✔ Methods and approaches; MSA case study; M46: a different direction; How good are the predictions…?
  • 14. Msa – a case study. Figure labels: Environment…?; msa; SA1233; sarA; agr, cap, hlgA, etc.; nuc, aur, ssp, etc. References: Sambanthamoorthy et al., Microbiology 152, 2006; Nagarajan et al., BMC Bioinformatics 8 (Suppl 7):S5, 2007
  • 17. PSI-BLAST Hits: PSI-BLAST iteration 1; iteration 2; iteration 3
  • 18. PSI-BLAST Hits. List of Observations: First 40 – pattern..?; Transporter/efflux/membrane..?
  • 19. PSI-BLAST iteration 3 with PAM70; BLAST against PDB database
  • 21. List of Observations: First 40 – pattern.. (DUF?); Transporter/efflux/membrane
  • 22. Physicochemical properties; amino acid scale representation: GRAVY (Grand Average of Hydropathy) for Msa: 1.021 (hydrophobic). Tool: Lasergene. List of Observations: First 40 – pattern.. (DUF?); Transporter/efflux/membrane..?; Hydrophobic..? The next step…?
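The GRAVY score on this slide can be illustrated with a minimal sketch. GRAVY is the mean of the Kyte-Doolittle hydropathy values over all residues, and a positive score indicates a hydrophobic protein. The function name `gravy` and the toy sequence are assumptions for illustration; the slide does not show the Msa sequence or how Lasergene was configured, so this does not reproduce the 1.021 value.

```python
# Kyte-Doolittle hydropathy values per amino acid (one-letter codes).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def gravy(sequence: str) -> float:
    """Return the grand average of hydropathy for a protein sequence.

    Raises KeyError on non-standard residue codes and ValueError on
    an empty sequence.
    """
    residues = sequence.strip().upper()
    if not residues:
        raise ValueError("empty sequence")
    total = sum(KYTE_DOOLITTLE[aa] for aa in residues)
    return total / len(residues)


if __name__ == "__main__":
    # Hypothetical toy peptide, not the Msa sequence.
    toy = "MLLAVIFGLK"
    score = gravy(toy)
    label = "hydrophobic" if score > 0 else "hydrophilic"
    print(f"GRAVY = {score:.3f} ({label})")
```

A score is only a coarse hint; on the slides it is combined with localization and topology predictions before any conclusion is drawn.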
  • 23. Solubility and localization: ProtFun – envelope, non-enzyme; SVMProt – transmembrane; CELLO – membrane; PSORT – membrane; ProtCompB – membrane; PSLPred – membrane. Msa – insoluble, non-enzyme, membrane. List of Observations: First 40 – pattern.. (DUF?); Transporter/efflux/membrane..?; Hydrophobic/insoluble/non-enzyme
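One way to read the localization results on this slide is as a simple vote across tools. The sketch below tallies the labels exactly as listed on the slide; the function name `consensus` and the idea of reporting an agreement fraction are assumptions for illustration, not part of the original analysis.

```python
from collections import Counter

# Localization calls copied from the slide, one per tool.
PREDICTIONS = {
    "ProtFun": "envelope",
    "SVMProt": "transmembrane",
    "CELLO": "membrane",
    "PSORT": "membrane",
    "ProtCompB": "membrane",
    "PSLPred": "membrane",
}


def consensus(predictions):
    """Return the most common label and the fraction of tools that agree."""
    counts = Counter(predictions.values())
    label, votes = counts.most_common(1)[0]
    return label, votes / len(predictions)


if __name__ == "__main__":
    label, agreement = consensus(PREDICTIONS)
    print(f"consensus: {label} ({agreement:.0%} of tools)")
```

Note that "envelope" and "transmembrane" are compatible with "membrane", so a plain vote understates the real agreement between the tools.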
  • 24. Signals/Domains/Patterns/Motifs. SP – Signal Peptide; CP – Cleavage Position. If membrane, what next…?
  • 25. SP – Signal Peptide; CP – Cleavage Position. More about membrane
  • 27. Positive-Inside rule and Charge bias (Von Heijne, J Mol Biol. 225(2): 487-494, 1992). Figure labels: charges +1, 0, +5, -1; N and C termini. List of Observations: First 40 – pattern.. (DUF?); Transporter/efflux/membrane..?; Hydrophobic/insoluble/non-enzyme; Signal peptide 1-20; TM1, TM2, TM3 – N-inside ✔
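The positive-inside rule can be sketched as a charge count over the loops between transmembrane helices: loops rich in Lys and Arg tend to face the cytoplasm. The sketch below is a simplified illustration, assuming helix boundaries are already known from a topology predictor. The function names, the toy sequence, and the helix coordinates are hypothetical; full implementations typically also weigh loop length and flanking residues, which this sketch ignores.

```python
def loop_segments(seq_len, tm_helices):
    """Return non-membrane segments (1-based, inclusive) around the helices."""
    loops, start = [], 1
    for tm_start, tm_end in sorted(tm_helices):
        loops.append((start, tm_start - 1))
        start = tm_end + 1
    loops.append((start, seq_len))
    return loops


def positive_count(seq, segment):
    """Count Lys (K) and Arg (R) residues in a 1-based inclusive segment."""
    start, end = segment
    return sum(seq[i - 1] in "KR" for i in range(start, end + 1))


def predict_n_terminus(seq, tm_helices):
    """Guess whether the N-terminus is inside, using K/R counts only."""
    loops = loop_segments(len(seq), tm_helices)
    # Loops alternate sides of the membrane: loops 0, 2, 4, ... are on the
    # same side as the N-terminus; loops 1, 3, 5, ... are on the other side.
    same_side = sum(positive_count(seq, loop) for loop in loops[0::2])
    other_side = sum(positive_count(seq, loop) for loop in loops[1::2])
    if same_side > other_side:
        return "N-inside"
    if same_side < other_side:
        return "N-outside"
    return "undetermined"


if __name__ == "__main__":
    # Hypothetical two-helix toy protein, not Msa.
    toy = "MKRK" + "L" * 20 + "DE" + "L" * 20 + "KKR"
    helices = [(5, 24), (27, 46)]
    print(predict_n_terminus(toy, helices))
```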
  • 28. Network Protein Sequence Analysis (http://pbil.ibcp.fr/htm/index.php). List of Observations: First 40 – pattern.. (DUF?); Transporter/efflux/membrane..?; Hydrophobic/insoluble/non-enzyme; Signal peptide 1-20; TM1, TM2, TM3 – N-inside ✔; Three Transmembrane HELIX
  • 30. Functional sites: PreATP-grasp domain in Msa. Structural Classification of Proteins (SCOP) entry: d1gsa_1. Usually precedes the ATP-grasp domain and could contain a substrate-binding function. Located between the 85th and 116th residues. Interestingly, this location is predicted to be in a cytoplasmic loop region of Msa. List of Observations: First 40 – pattern.. (DUF?); Transporter/efflux/membrane..?; Hydrophobic/insoluble/non-enzyme; Signal peptide 1-20; TM1, TM2, TM3 – N-inside ✔; Three Transmembrane HELIX; PreATP-grasp domain
  • 33. Any interactions…? (http://www.ebi.ac.uk/intact/site/index.jsf). List of Observations: First 40 – pattern.. (DUF?); Transporter/efflux/membrane..?; Hydrophobic/insoluble/non-enzyme; Signal peptide 1-20; TM1, TM2, TM3 – N-inside ✔; Three Transmembrane HELIX; PreATP-grasp domain; Responds to environment…?
  • 34. Gene expression (http://www.bioinformatics.org/sammd/). List of Observations: First 40 – pattern.. (DUF?); Transporter/efflux/membrane..?; Hydrophobic/insoluble/non-enzyme; Signal peptide 1-20; TM1, TM2, TM3 – N-inside ✔; Three Transmembrane HELIX; PreATP-grasp domain; Responds to environment ✔
  • 36. Patterns. List of Observations: First 40 – pattern.. (DUF?); Transporter/efflux/membrane..?; Hydrophobic/insoluble/non-enzyme; Signal peptide 1-20; TM1, TM2, TM3 – N-inside ✔; Three Transmembrane HELIX; PreATP-grasp domain; Responds to environment ✔; Phosphorylation sites 48, 49, 99
  • 38. Structure based predictions: homology modeling (no homologous structures in PDB); fold recognition (Phyre)
  • 39. Model of Msa: Swiss-Pdb Viewer (energy minimization, PHI/PSI angle, loop); structure validation (Verify3D)
  • 40. Quality of the model: Ramachandran plot for the predicted tertiary structure of the Msa protein pre (A) and post (B) refinement
  • 41. Predicted 3D structure of Msa. List of Observations: First 40 – pattern.. (DUF?); Transporter/efflux/membrane..?; Hydrophobic/insoluble/non-enzyme; Signal peptide 1-20; TM1, TM2, TM3 – N-inside ✔; Three Transmembrane HELIX; PreATP-grasp domain; Responds to environment ✔; Phosphorylation sites 48, 49, 99
  • 42. Binding sites: binding site predictions for the Msa protein. (A) ProFunc predicted binding site (red); (B) PINUP predicted binding site (interface in green); (C) Q-SiteFinder predicted binding site and binding residues (pink). List of Observations: First 40 – pattern.. (DUF?); Transporter/efflux/membrane..?; Hydrophobic/insoluble/non-enzyme; Signal peptide 1-20; TM1, TM2, TM3 – N-inside ✔; Three Transmembrane HELIX; PreATP-grasp domain; Responds to environment ✔; Phosphorylation sites 48, 49, 99; Binding sites - cytoplasmic region?
  • 43. Predicted “nest” (ProFunc): “nest” near the putative phosphorylation site (47-50); residues 47-50 predicted outside membrane; all residues conserved at the “nest”; “nest” shows features of an anion-binding site; “nest”: characteristic functional motif in ATP- or GTP-binding proteins. List of Observations: First 40 – pattern.. (DUF?); Transporter/efflux/membrane..?; Hydrophobic/insoluble/non-enzyme; Signal peptide 1-20; TM1, TM2, TM3 – N-inside ✔; Three Transmembrane HELIX; PreATP-grasp domain; Responds to environment ✔; Phosphorylation sites 48, 49, 99; Binding sites - cytoplasmic region?; Function site (“nest”) – outside..?
  • 44. Function motifs: conserved…? Multiple sequence alignment (ClustalW) of the Msa protein sequence from 11 different strains. Strain RF122: 12 variations (1 replacement, 11 substitutions), with 1 substitution in the pre-ATP grasp domain. Strain MRSA252: 7 variations (2 replacements, 5 substitutions), with 1 replacement in the pre-ATP grasp domain. Replacement: hydrophilic -> hydrophobic, e.g. Ser -> Gly. Substitution: hydrophilic -> hydrophilic, e.g. Ser -> Glu
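The replacement/substitution distinction on this slide can be sketched as a comparison of residue classes at each aligned position. The grouping below is an assumption chosen so that Gly counts as hydrophobic, as the slide's Ser -> Gly example implies; other hydropathy scales classify Gly differently. The function names and the toy aligned sequences are hypothetical and do not come from the ClustalW alignment in the talk.

```python
# Assumed nonpolar (hydrophobic) class; every other residue is treated as
# hydrophilic. Gly is included to match the slide's Ser -> Gly example.
NONPOLAR = set("AVLIMFWPG")


def classify_variation(ref, alt):
    """Label a residue change as in the slide's terminology."""
    if ref == alt:
        return "identical"
    same_class = (ref in NONPOLAR) == (alt in NONPOLAR)
    return "substitution" if same_class else "replacement"


def compare_to_reference(reference, variant):
    """List (position, ref, alt, label) for each differing aligned column.

    Both inputs must be rows of the same alignment (equal length).
    Columns containing a gap ('-') are skipped.
    """
    if len(reference) != len(variant):
        raise ValueError("aligned sequences must have equal length")
    changes = []
    for pos, (ref, alt) in enumerate(zip(reference, variant), start=1):
        if ref == alt or "-" in (ref, alt):
            continue
        changes.append((pos, ref, alt, classify_variation(ref, alt)))
    return changes


if __name__ == "__main__":
    # Hypothetical aligned fragments, not real strain sequences.
    for change in compare_to_reference("MSSKL-A", "MGEKLQA"):
        print(change)
```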
  • 45. Function motifs: conserved…? Variation at aa positions 111, 131, 133 common; none in phosphorylation sites, signal peptide, “nest”. List of Observations: First 40 – pattern.. (DUF?); Transporter/efflux/membrane..?; Hydrophobic/insoluble/non-enzyme; Signal peptide 1-20; TM1, TM2, TM3 – N-inside ✔; Three Transmembrane HELIX; PreATP-grasp domain; Responds to environment ✔; Phosphorylation sites 48, 49, 99; Binding sites - cytoplasmic region?; Function site (“nest”) – outside..?; Functional sites highly conserved..?
  • 46. Msa – a putative signal transducer. List of Observations: First 40 – pattern.. (DUF?); Transporter/efflux/membrane..?; Hydrophobic/insoluble/non-enzyme; Signal peptide 1-20; TM1, TM2, TM3 – N-inside ✔; Three Transmembrane HELIX; PreATP-grasp domain; Responds to environment ✔; Phosphorylation sites 48, 49, 99; Binding sites - cytoplasmic region?; Function site (“nest”) – outside..?; Functional sites highly conserved..?
  • 47. Msa – a putative signal transducer
  • 48. msa and sarA transcriptome
  • 51. ✔ The function prediction problem; ✔ Methods and approaches; ✔ MSA case study; M46: a different direction; How good are the predictions…? Any questions…?
  • 52. M46 – a different direction: mold specific; Histoplasma capsulatum; only in mold, not in yeast
  • 54. M46 – the clues: nucleotide binding (DNA/RNA; S-S bonds); look for motifs (predict motifs, build HMM, search for similar); localization (secreted, ER signal, ER modifications)
  • 55. ✔ The function prediction problem; ✔ Methods and approaches; ✔ MSA case study; ✔ M46: a different direction; How good are the predictions…?
  • 57. How good are the predictions…? Only good if they make biological sense; only good if they can be supported by follow-up experimental evidence. No hits: reduce the threshold and expect worse E-values/p-values. Cut-off value paradox (if <0.05 is significant, what about 0.051…?). It is OK to look at hits with a poor E-value/p-value: you cannot assign homology, but you can pick up clues (your only hope)
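The cut-off paradox above suggests ranking hits into tiers rather than discarding everything past a single threshold. The sketch below is one possible way to do that; the function name `triage_hits`, the tier names, the `borderline_factor` parameter, and the example hits are all hypothetical and not taken from the talk.

```python
def triage_hits(hits, cutoff=0.05, borderline_factor=10.0):
    """Sort (name, evalue) hits into tiers instead of one hard cut-off.

    - significant: evalue below the cutoff
    - borderline:  evalue within borderline_factor times the cutoff
    - clue_only:   everything else; not evidence of homology, but
                   possibly worth a look when nothing better exists
    """
    tiers = {"significant": [], "borderline": [], "clue_only": []}
    for name, evalue in sorted(hits, key=lambda hit: hit[1]):
        if evalue < cutoff:
            tiers["significant"].append(name)
        elif evalue < cutoff * borderline_factor:
            tiers["borderline"].append(name)
        else:
            tiers["clue_only"].append(name)
    return tiers


if __name__ == "__main__":
    # Hypothetical hits; names and values are made up for illustration.
    example = [("hit_C", 0.051), ("hit_A", 1e-8), ("hit_D", 3.2), ("hit_B", 0.049)]
    for tier, names in triage_hits(example).items():
        print(tier, names)
```

With these example values, a hit at 0.051 lands in the borderline tier next to one at 0.049 in the significant tier, which makes the arbitrariness of the boundary visible instead of hiding it.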
  • 58. Questions…? nagarajanv@mail.nih.gov; ScienceApps@niaid.nih.gov; http://bioinformatics.niaid.nih.gov