AlphaFold
An overview
AlphaFold
MGAFGHGFG
TYHKLAALED
GTLKHHAKLQ
PHLSLLCMF…
What is it?
- AlphaFold (AF) is an artificial intelligence program
- Developed by Google’s DeepMind
The Goal:
- Predicting the three-dimensional
structure that a protein will adopt
based solely on its amino acid sequence
It “solves” two main problems:
1. Sequence-Structure gap
2. Protein folding
Why solve these problems?
Jumper, J., Evans, R., Pritzel, A. et al. Nature 596, 583–589 (2021).
Sequence-Structure gap
- 1958: determination of
the first protein
structure.
- John Kendrew & Max
Perutz
- Structure determination
(experimental):
- NMR
- X-ray crystallography
- Cryo-Electron
microscopy
- Protein Data Bank:
- Total: ~170,000
- Unique: ~100,000
AlphaFold 1
The protein folding problem
- 1972: Christian Anfinsen, Nobel Prize in
Chemistry.
- “It should be possible to determine a
protein’s three-dimensional shape based
solely on its sequence”
- A typical protein could adopt
10^300 different configurations
- Sampling them one by one would take longer than the age of the universe
- However, in nature, proteins spontaneously fold
into their functional shape.
- Cyrus Levinthal’s paradox (1969)
- An open research problem for 50 years
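The scale of Levinthal's argument can be checked with a few lines of arithmetic. The sketch below takes the 10^300 figure from the slide and assumes, purely for illustration, a sampling rate of 10^13 configurations per second; that rate and the function name are not from the source.

```python
# Back-of-the-envelope version of Levinthal's argument.
N_CONFIGURATIONS = 10**300        # figure quoted on the slide
SAMPLES_PER_SECOND = 10**13       # ASSUMPTION: illustrative sampling rate
AGE_OF_UNIVERSE_YEARS = 13.8e9    # approximate age of the universe
SECONDS_PER_YEAR = 365.25 * 24 * 3600


def exhaustive_search_years(n_configurations: int, samples_per_second: int) -> float:
    """Years needed to visit every configuration once at a fixed rate."""
    seconds = n_configurations // samples_per_second  # exact integer arithmetic
    return seconds / SECONDS_PER_YEAR


years = exhaustive_search_years(N_CONFIGURATIONS, SAMPLES_PER_SECOND)
ratio = years / AGE_OF_UNIVERSE_YEARS
print(f"search time: {years:.2e} years, {ratio:.2e} times the age of the universe")
```

Whatever plausible rate is plugged in, the search time exceeds the age of the universe by hundreds of orders of magnitude, which is why spontaneous folding cannot be a random search.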
The protein folding problem
CASP
Critical Assessment of
Techniques for Protein
Structure prediction
• The protein folding Olympics
• The state of the art in
protein structure prediction
- The competition:
- Since 1994
- Takes place every two years
- Last competition: CASP14 – 2020
- Organizers:
- Know both the sequence and the
structure
- Participants:
- Receive only the protein’s
sequence
- Must blindly predict the
structure of the proteins
- Predictions: compared with
the experimental data
CASP before AlphaFold
Homology modeling
- INPUT: query sequence Q
- INPUT: Database of protein structures
1. Find a protein P with high sequence similarity to Q
2. Return P’s structure as an approximation to Q’s structure
Threading & Fragment assembly
- INPUT: query sequence Q
- INPUT: Database of known folds or structure fragments
1. Find a set of fragments F that Q can be aligned with
2. Return F as an approximation to Q’s structure
Molecular dynamics
- INPUT: query sequence Q
• Force field
• Molecular mechanics
1. Use the laws of physics to simulate the folding of Q
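Step 1 of homology modeling (finding a known protein P similar to the query Q) can be sketched as a search by sequence identity. This is a minimal illustration, assuming the sequences are already aligned to equal length; the toy database, its names, and both helper functions are hypothetical, and real pipelines use alignment tools rather than a position-by-position comparison.

```python
from typing import Dict, Tuple


def sequence_identity(a: str, b: str) -> float:
    """Fraction of identical positions between two pre-aligned, equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    matches = sum(1 for x, y in zip(a, b) if x == y and x != "-")
    return matches / len(a)


def pick_template(query: str, database: Dict[str, str]) -> Tuple[str, float]:
    """Return the name and identity of the database entry most similar to the query."""
    best = max(database, key=lambda name: sequence_identity(query, database[name]))
    return best, sequence_identity(query, database[best])


# Hypothetical toy database: names and sequences are made up for illustration.
toy_db = {
    "template_A": "MGAFGHGFGT",
    "template_B": "MKAFGHAFGT",
    "template_C": "PHLSLLCMFA",
}
query = "MGAFGHGFGS"
name, identity = pick_template(query, toy_db)
print(name, identity)
```

In step 2 the structure of the selected template would be returned as the approximation to Q's structure.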
The metric:
- How well does the prediction agree
with the experimental data?
GDT: Global Distance Test
- Compares two structures
- From 0 to 100 (%)
- Greater is better
- Uses distance cutoffs
- Uses alpha Carbons
- More robust than RMSD (less sensitive to outlier regions)
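A GDT-style score can be sketched in a few lines. The version below follows the common GDT_TS convention (average over the 1, 2, 4 and 8 Å cutoffs) and assumes the two alpha-carbon traces are already superimposed; the full GDT procedure also searches over superpositions, which this sketch leaves out. The function name and the toy coordinates are illustrative.

```python
import math
from typing import Sequence, Tuple

Point = Tuple[float, float, float]


def gdt_ts(pred_ca: Sequence[Point], ref_ca: Sequence[Point],
           cutoffs: Sequence[float] = (1.0, 2.0, 4.0, 8.0)) -> float:
    """GDT_TS-style score (0-100) for two already-superimposed C-alpha traces."""
    if len(pred_ca) != len(ref_ca) or not ref_ca:
        raise ValueError("both structures need the same, non-zero number of residues")
    # Per-residue distance between predicted and experimental C-alpha atoms.
    dist = [math.dist(p, r) for p, r in zip(pred_ca, ref_ca)]
    # Fraction of residues within each cutoff, then the average over cutoffs.
    fractions = [sum(d <= c for d in dist) / len(dist) for c in cutoffs]
    return 100.0 * sum(fractions) / len(fractions)


# Toy example: four residues displaced by 0.5, 1.5, 3.0 and 10.0 angstroms.
ref = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0), (11.4, 0.0, 0.0)]
shifts = [0.5, 1.5, 3.0, 10.0]
pred = [(x, y + s, z) for (x, y, z), s in zip(ref, shifts)]
score = gdt_ts(pred, ref)
print(score)  # 56.25
```

The single residue that is 10 Å off lowers the score but cannot dominate it, which is the practical difference from RMSD.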
CASP and AlphaFold
CASP14: 152 targets
Jumper, J., Evans, R., Pritzel, A. et al. Nature 596, 583–589 (2021).
How does it work? AlphaFold uses Deep Learning
[Diagram: Artificial Intelligence > Machine Learning > Deep Learning (nested fields)]
Machine learning:
Learn from data
“The field of study that gives computers
the ability to learn without
being explicitly programmed”
Traditional Approach: Data + Algorithm → Computer → Results
Machine Learning Approach: Data + Results → Computer → Algorithm
Grokking Deep Learning, by Andrew W. Trask, Manning Publications, 2019
$f: X \to y$
ML: approximates f using data (X, y)
$f \approx \hat{f} + \varepsilon$
- $f$: a true relationship between two variables
- $\hat{f}$: The ML model
Grokking Deep Learning, by Andrew W. Trask, Manning Publications, 2019
Machine Learning
Data = (X, y)
$\hat{f}(X) = \hat{y}$
ML model:
1. The ML model (blueprint)
2. A training algorithm
   1. Data (training set)
   2. Loss function (error)
   3. Optimization algorithm
3. A validation and a test set
A linear regression model
$\hat{y} = w \cdot x + b$
The goal: Minimize the error
$\hat{y} \approx y$
Generalization
1. Training set
2. Test set (data never seen by the model)
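The ingredients listed above (a model, a loss function, an optimization algorithm, and a held-out test set) can be put together in a minimal sketch. It fits the linear model ŷ = w·x + b by gradient descent on the mean squared error. The synthetic data (true w = 2, b = 1), the learning rate and the 80/20 split are assumptions chosen for illustration.

```python
import random


def mse(w: float, b: float, xs, ys) -> float:
    """Loss function: mean squared error of the linear model on (xs, ys)."""
    return sum((w * x + b - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


def fit_linear(xs, ys, lr: float = 0.01, epochs: int = 5000):
    """Optimization algorithm: plain gradient descent on the MSE."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        grad_w = sum(2 * (w * x + b - y) * x for x, y in zip(xs, ys)) / n
        grad_b = sum(2 * (w * x + b - y) for x, y in zip(xs, ys)) / n
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b


# Synthetic data: y = 2x + 1 plus a little noise (assumed for the example).
rng = random.Random(0)
xs = [rng.uniform(0, 5) for _ in range(100)]
ys = [2.0 * x + 1.0 + rng.gauss(0, 0.1) for x in xs]

# Training set (80 points) and test set (20 points never seen during fitting).
train_x, test_x = xs[:80], xs[80:]
train_y, test_y = ys[:80], ys[80:]

w, b = fit_linear(train_x, train_y)
test_error = mse(w, b, test_x, test_y)
print(f"w={w:.3f} b={b:.3f} test MSE={test_error:.4f}")
```

A low error on the test set, not just on the training set, is what the slide calls generalization.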
Deep Learning
Machine Learning: a linear regression model
$\hat{y} = w \cdot x + b$
- A linear relationship between x and y
Deep Learning: a neural network (feed forward)
[Figure: feed-forward network with inputs $x_1, x_2, x_3$, weights $w_1, w_2, w_3$, node activations $a_k$, and the output prediction $\hat{y}$]
- A node with its input edges
- Activation function
- More complex models
- Learns non-linear relationships
- Learns interactions between features (X)
- Feature extraction
More than three hidden layers
Key: Feature Extraction
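What a node does with its input edges and activation function can be shown with a tiny feed-forward pass. The sketch below is illustrative only: the weights are set by hand rather than learned, and the network shape (two inputs, two hidden nodes, one output) is an assumption chosen because it reproduces XOR, a relationship that no single linear model ŷ = w·x + b can represent.

```python
from typing import Callable, List, Sequence, Tuple

Layer = Tuple[List[List[float]], List[float], Callable[[float], float]]


def relu(z: float) -> float:
    """Activation function: passes positive values, zeroes out negatives."""
    return max(0.0, z)


def identity(z: float) -> float:
    return z


def dense(inputs: Sequence[float], weights, biases, activation) -> List[float]:
    """One layer: each node sums its weighted input edges, adds a bias, applies the activation."""
    return [activation(sum(w * x for w, x in zip(node_w, inputs)) + b)
            for node_w, b in zip(weights, biases)]


def forward(x: Sequence[float], layers: Sequence[Layer]) -> List[float]:
    """Feed the input through every layer in order."""
    a = list(x)
    for weights, biases, activation in layers:
        a = dense(a, weights, biases, activation)
    return a


# Hand-set (not trained) weights that make the network compute XOR.
layers = [
    ([[1.0, 1.0], [1.0, 1.0]], [0.0, -1.0], relu),   # hidden layer, 2 nodes
    ([[1.0, -2.0]], [0.0], identity),                # output layer, 1 node
]
for x in ([0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [1.0, 1.0]):
    print(x, forward(x, layers))
```

The hidden nodes act as extracted features: the output is a linear combination of them, yet the overall mapping from inputs to output is non-linear.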
AlphaFold 1
The model:
• CASP13 (2018)
• Convolutional neural network (CNN)
Training:
• Structures: 31,247 domains
• Sequences: UniClust30
Senior, et al. (2020). Nature, 577(7792), 706–710.
X: Sequence → y: Structure
MGAFGHGFG
TYHKLAALED
GTLKHHAKLQ
PHLSLLCMF…
AlphaFold 1
• Input:
• Protein amino acid sequence
• Multiple Sequence Alignments (MSA):
• Profile features
MSA to Profiles - PSSM (position-specific scoring matrix):
Rows: amino acids. Columns: sequence positions.
     1    2    3    4    5    6    7    ..   n
A    -    -    0.2  -    -    -    -    -    -
R    -    -    0.3  -    -    -    -    -    -
F    -    -    0.5  -    -    -    -    -    -
G    -    -    0    -    -    -    -    -    -
..   -    -    0.8  -    -    -    -    -    -
Y    -    -    1.2  -    -    -    -    -    -
⚙ → Families and Domains
Senior, et al. (2020). Nature, 577(7792), 706–710.
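The step from an MSA to a position-specific scoring matrix can be sketched as counting residues per column and converting the frequencies to log-odds scores. This is a simplified illustration, not the profile construction used by AlphaFold: the uniform background frequency, the pseudocount of 1 and the toy alignment are all assumptions.

```python
import math
from collections import Counter
from typing import Dict, List

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def msa_to_pssm(msa: List[str], pseudocount: float = 1.0) -> List[Dict[str, float]]:
    """Return pssm[position][amino_acid] = log2(observed frequency / background)."""
    length = len(msa[0])
    if any(len(seq) != length for seq in msa):
        raise ValueError("all aligned sequences must have the same length")
    background = 1.0 / len(AMINO_ACIDS)  # ASSUMPTION: uniform background frequency
    pssm = []
    for pos in range(length):
        # Count residues in this column; gap characters are skipped.
        counts = Counter(seq[pos] for seq in msa if seq[pos] in AMINO_ACIDS)
        total = sum(counts.values()) + pseudocount * len(AMINO_ACIDS)
        column = {aa: math.log2(((counts[aa] + pseudocount) / total) / background)
                  for aa in AMINO_ACIDS}
        pssm.append(column)
    return pssm


# Toy alignment of four sequences (made up for illustration).
toy_msa = ["MGAFG", "MGAYG", "MGSFG", "M-AFG"]
pssm = msa_to_pssm(toy_msa)
print({aa: round(score, 2) for aa, score in pssm[3].items() if aa in "FYA"})
```

Positive scores mark residues that are over-represented at a position relative to the background, negative scores mark residues that are rarely or never seen there.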
Pipeline: MSA → Profile → Structure optimization
AlphaFold 1
Senior, et al. (2020). Nature, 577(7792), 706–710.
MGAFGHGFG
TYHKLAALED
GTLKHHAKLQ
PHLSLLCMF…
Input (X): Sequence → MSA → Profile
The ML model: Convolutional Neural Network
Output (y): The distogram
The central component:
• A convolutional neural network
• Trained on PDB structures
• It predicts the distances $d_{ij}$
between the Cβ atoms of every pair
of residues (i, j) in a protein.
AlphaFold 1
Senior, et al. (2020). Nature, 577(7792), 706–710.
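The target the network learns to predict can be illustrated by building distogram labels from a known structure: compute every pairwise Cβ distance, then assign each distance to a bin. The network itself outputs a probability distribution over such bins for each residue pair. The sketch below assumes 64 equal-width bins between 2 and 22 Å, which is my reading of Senior et al. (2020) and should be checked against the paper; the function names and coordinates are made up.

```python
import math
from typing import List, Sequence, Tuple

Point = Tuple[float, float, float]


def distance_matrix(cb_coords: Sequence[Point]) -> List[List[float]]:
    """All pairwise distances d_ij between C-beta atoms."""
    return [[math.dist(a, b) for b in cb_coords] for a in cb_coords]


def to_bin(d: float, d_min: float = 2.0, d_max: float = 22.0, n_bins: int = 64) -> int:
    """Index of the bin containing distance d; out-of-range distances are clamped."""
    width = (d_max - d_min) / n_bins
    idx = int((d - d_min) // width)
    return min(max(idx, 0), n_bins - 1)


def distogram_labels(cb_coords: Sequence[Point]) -> List[List[int]]:
    """Bin index for every residue pair (i, j)."""
    return [[to_bin(d) for d in row] for row in distance_matrix(cb_coords)]


# Toy C-beta coordinates in angstroms (illustrative values only).
coords = [(0.0, 0.0, 0.0), (5.0, 0.0, 0.0), (0.0, 12.0, 0.0), (30.0, 0.0, 0.0)]
labels = distogram_labels(coords)
for row in labels:
    print(row)
```

The label matrix is symmetric, and pairs further apart than the upper limit all fall into the last bin.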
AlphaFold 1
The distogram
Residue 29
The predicted probability distributions for the
distances from residue 29 to all other residues (41)
Senior, et al. (2020). Nature, 577(7792), 706–710.
Gradient descent:
• Rotate the phi and psi torsion angles
• Match the predicted Cβ-Cβ distances
AlphaFold 1 Protein folding
Senior, et al. (2020). Nature, 577(7792), 706–710.
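The idea of rotating angles until the distances match can be shown on a toy problem. The sketch below is not AlphaFold's procedure: it swaps the 3D backbone with phi and psi torsions for a 2D chain of unit-length bonds with one turn angle per bond, swaps the learned distance potential for a plain squared error against target distances, and uses finite-difference gradients. All names and values are illustrative.

```python
import math
from typing import List, Sequence


def chain_coords(angles: Sequence[float]):
    """2D chain with unit-length bonds; angles[k] is the turn applied at bond k."""
    x, y, heading = 0.0, 0.0, 0.0
    coords = [(x, y)]
    for turn in angles:
        heading += turn
        x += math.cos(heading)
        y += math.sin(heading)
        coords.append((x, y))
    return coords


def distance_loss(angles: Sequence[float], target: List[List[float]]) -> float:
    """Sum of squared differences between the chain's distances and the targets."""
    pts = chain_coords(angles)
    n = len(pts)
    return sum((math.dist(pts[i], pts[j]) - target[i][j]) ** 2
               for i in range(n) for j in range(i + 1, n))


def gradient_descent(angles, target, lr=0.002, steps=2000, eps=1e-6):
    """Update the angles along the negative (finite-difference) gradient of the loss."""
    angles = list(angles)
    for _ in range(steps):
        base = distance_loss(angles, target)
        grad = []
        for k in range(len(angles)):
            bumped = angles[:]
            bumped[k] += eps
            grad.append((distance_loss(bumped, target) - base) / eps)
        angles = [a - lr * g for a, g in zip(angles, grad)]
    return angles


# "Predicted" distances come from a hidden set of true angles (toy data).
true_angles = [0.0, 0.9, -0.7, 1.1, 0.4]
true_pts = chain_coords(true_angles)
target = [[math.dist(a, b) for b in true_pts] for a in true_pts]

start = [0.0, 0.5, -0.2, 0.6, 0.1]
initial_loss = distance_loss(start, target)
fitted = gradient_descent(start, target)
final_loss = distance_loss(fitted, target)
print(f"loss before: {initial_loss:.4f}, after: {final_loss:.6f}")
```

Because distances are unchanged by mirroring or rotating the chain, the fitted angles may differ from the hidden ones even when the distances agree.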
Jumper, J., Evans, R., Pritzel, A. et al. Nature 596, 583–589 (2021).
AlphaFold
References:
1. Senior, A. W., Evans, R., Jumper, J. et al. Improved
protein structure prediction using potentials from
deep learning. Nature 577, 706–710 (2020).
2. Jumper, J., Evans, R., Pritzel, A. et al. Highly
accurate protein structure prediction with
AlphaFold. Nature 596, 583–589 (2021).
Back to AlphaFold 2
X: Sequence → y: Structure
• Attention-based Neural Network
• Transformer-based
• Method inspired by biology, physics and
machine learning
• Trained with:
• ~170,000 PDB structures
• UniProt sequences
MGAFGHGFG
TYHKLAALED
GTLKHHAKLQ
PHLSLLCMF…

Protein folding prediction using AlphaFold 1

  • 3. AlphaFold MGAFGHGFG TYHKLAALED GTLKHHAKLQ PHLSLLCMF… What is it? - AF is an artificial intelligence program - Developed by Google’s DeepMind The Goal: - Predicting the three-dimensional structure that a protein will adopt based solely on its amino acid sequence It “solves” two main problems: 1. Sequence-Structure gap 2. Protein folding Why solve these problems? Jumper, J., Evans, R., Pritzel, A. et al. Nature 596, 583–589 (2021).
  • 4. Sequence-Structure gap - 1958: determination of the first protein structure. - John Kendrew & Max Perutz - Structure determination (experimental): - NMR - X-ray crystallography - Cryo-Electron microscopy - Protein Data Bank: - Total: ~170,000 - Unique: ~100,000
  • 5. AlphaFold 1 The protein folding problem - 1972: Christian Anfinsen, Nobel Prize in Chemistry. - “It should be possible to determine a protein’s three-dimensional shape based solely on its sequence” - A typical protein could adopt 10^300 different configurations - Sampling them all would take longer than the age of the universe - However, in nature, proteins spontaneously fold into their functional shape. - Cyrus Levinthal’s paradox (1969) - An open research problem for 50 years
  • 6. The protein folding problem CASP: Critical Assessment of Techniques for Protein Structure Prediction • The protein folding Olympics • The state of the art in protein structure prediction - The competition: - Since 1994 - Takes place every two years - Last competition: CASP14 – 2020 - Organizers: - Know both the sequence and the structure - Participants: - Receive only the protein’s sequence - Must blindly predict the structure of the proteins - Predictions: compared with the experimental data
  • 7. Homology modeling: INPUT: query sequence Q; database of protein structures. 1. Find a protein P with high sequence similarity to Q 2. Return P’s structure as an approximation to Q’s structure. Threading & Fragment assembly: INPUT: query sequence Q; database of known folds or structure fragments. 1. Find a set of fragments F that Q can be aligned with 2. Return F as an approximation to Q’s structure. Molecular dynamics: INPUT: query sequence Q. 1. Use the laws of physics to simulate the folding of Q • Force field • Molecular mechanics
  • 8. CASP before AlphaFold The metric: - How well does the prediction agree with the experimental data? GDT: Global Distance Test - Compares two structures - From 0 to 100 (%) - Greater is better - Uses distance cutoffs - Uses alpha carbons - More accurate than RMSD (Figure labels: Homology modeling, Threading & Fragment assembly, Molecular dynamics)
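The GDT idea on slide 8 can be sketched in a few lines. This is a simplified illustration, not the CASP implementation: it assumes the two structures are already superimposed (the real test searches over many superpositions) and uses the 1, 2, 4 and 8 Å cutoffs of the common GDT_TS variant. The function name `gdt_ts` and the toy coordinates are hypothetical.

```python
import math

def gdt_ts(predicted, reference, cutoffs=(1.0, 2.0, 4.0, 8.0)):
    """Simplified GDT_TS for two pre-superimposed lists of C-alpha coordinates.

    For each cutoff, compute the fraction of C-alpha atoms whose predicted
    position lies within that distance of the reference position, then
    average the fractions and scale to 0-100.
    """
    if len(predicted) != len(reference) or not predicted:
        raise ValueError("structures must be non-empty and of equal length")
    distances = [math.dist(p, r) for p, r in zip(predicted, reference)]
    fractions = [
        sum(d <= cutoff for d in distances) / len(distances)
        for cutoff in cutoffs
    ]
    return 100.0 * sum(fractions) / len(fractions)

# Toy example: four C-alpha atoms, displaced by 0, 1.5, 3 and 10 angstroms.
reference = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0), (11.4, 0.0, 0.0)]
predicted = [(0.0, 0.0, 0.0), (3.8, 1.5, 0.0), (7.6, 3.0, 0.0), (11.4, 10.0, 0.0)]
score = gdt_ts(predicted, reference)
print(f"GDT_TS = {score:.2f}")
```

Because every atom counts once per cutoff, one badly misplaced residue lowers the score only slightly, whereas it can dominate an RMSD.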
  • 9. CASP and AlphaFold CASP14: 152 targets Jumper, J., Evans, R., Pritzel, A. et al. Nature 596, 583–589 (2021).
  • 10. How does it work? AlphaFold uses Deep Learning. Artificial Intelligence ⊃ Machine Learning ⊃ Deep Learning. Machine learning: learn from data. “The field of study that gives computers the ability to learn without being explicitly programmed” Traditional approach: Data + Algorithm → Computer → Results. Machine learning approach: Data + Results → Computer → Algorithm. Grokking Deep Learning, by Andrew W. Trask, Manning Publications, 2019
  • 11. How does it work? AlphaFold uses Deep Learning. f: a true relationship between two variables, X → f → y. ML approximates f using data (X, y): $f \approx \hat{f} + \varepsilon$, where $\hat{f}$ is the ML model. Grokking Deep Learning, by Andrew W. Trask, Manning Publications, 2019
  • 12. Machine Learning. Data = (X, y); the model maps X to a prediction: $\hat{f}(X) = \hat{y}$. ML model: 1. The ML model (blueprint): a linear regression model, $\hat{y} = w \cdot x + b$ 2. A training algorithm: 1. Data (training set) 2. Loss function (error) 3. Optimization algorithm 3. A validation and a test set. The goal: minimize the error ($\hat{y} \approx y$) on 1. the training set 2. the test set (data never seen by the model): generalization
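The three ingredients listed on slide 12 (a model blueprint, a training algorithm with a loss and an optimizer, and held-out data) can be illustrated with a minimal sketch. The data, learning rate and function names below are assumptions chosen for illustration: a one-variable linear model is fitted by plain gradient descent on the mean squared error and then checked on points it never saw.

```python
def mean_squared_error(w, b, xs, ys):
    """Loss function: average squared difference between prediction and truth."""
    return sum((w * x + b - y) ** 2 for x, y in zip(xs, ys)) / len(xs)

def train_linear_regression(xs, ys, learning_rate=0.1, epochs=5000):
    """Optimization algorithm: gradient descent on the mean squared error."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        # Gradients of the loss with respect to w and b.
        grad_w = sum(2 * (w * x + b - y) * x for x, y in zip(xs, ys)) / n
        grad_b = sum(2 * (w * x + b - y) for x, y in zip(xs, ys)) / n
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b
    return w, b

# Hypothetical noise-free data generated from y = 2x + 1.
train_x = [0.0, 0.2, 0.4, 0.6, 0.8, 1.0]
train_y = [2 * x + 1 for x in train_x]
test_x = [0.1, 0.5, 0.9]            # data never seen during training
test_y = [2 * x + 1 for x in test_x]

w, b = train_linear_regression(train_x, train_y)
test_error = mean_squared_error(w, b, test_x, test_y)
print(f"w = {w:.3f}, b = {b:.3f}, test MSE = {test_error:.2e}")
```

A low error on the test points, not only on the training points, is what the slide calls generalization.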
  • 13. Deep Learning. Machine learning: a linear regression model, $\hat{y} = w \cdot x + b$ - A linear relationship between x and y. Deep learning: a feed-forward neural network with more than three hidden layers - More complex models - Learns non-linear relationships - Learns interactions between features (X) - Feature extraction. (Figure: a node $a_k$ with its input edges $x_1, x_2, x_3$, weights $w_1, w_2, w_3$ and an activation function; a network of hidden nodes in successive layers producing the prediction $\hat{y}$)
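The node drawn on slide 13 (weighted input edges followed by an activation function) and the way nodes stack into layers can be sketched as follows. The weights, the ReLU activation and the helper names are hypothetical choices for illustration; a trained network would learn its weights from data.

```python
def relu(z):
    """A common activation function: passes positive values, zeroes the rest."""
    return max(0.0, z)

def identity(z):
    return z

def node(inputs, weights, bias, activation):
    """One node: weighted sum of its input edges, then the activation function."""
    return activation(sum(w * x for w, x in zip(weights, inputs)) + bias)

def layer(inputs, weight_rows, biases, activation):
    """One layer: every node sees the same inputs but has its own weights."""
    return [node(inputs, w, b, activation) for w, b in zip(weight_rows, biases)]

def feed_forward(x, layers):
    """Pass the input through each layer in turn."""
    activations = x
    for weight_rows, biases, activation in layers:
        activations = layer(activations, weight_rows, biases, activation)
    return activations

# Hypothetical network: 3 inputs -> 2 hidden nodes (ReLU) -> 1 output (linear).
network = [
    ([[0.5, -1.0, 0.5], [1.0, 1.0, -1.0]], [0.0, 0.5], relu),
    ([[2.0, 4.0]], [1.0], identity),
]
prediction = feed_forward([1.0, 2.0, 3.0], network)
print(prediction)
```

Without the activation function the stacked layers would collapse into a single linear model, which is why it is needed for learning non-linear relationships.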
  • 15. AlphaFold 1 The model: • CASP13 (2018) • Convolution-based neural network Training: • Structures: 31,247 domains • Sequences: UniClust30 Senior, et al. (2020). Nature, 577(7792), 706–710. X: Sequence (MGAFGHGFG TYHKLAALED GTLKHHAKLQ PHLSLLCMF…) → y: Structure
  • 16. AlphaFold 1 • Input: • Protein amino acid sequence • Multiple Sequence Alignments (MSA): • Profile features. MSA to Profiles - PSSM (position-specific scoring matrix): rows are amino acids, columns are sequence positions; only the column for position 3 is filled in on the slide:
      amino acid   1    2    3    4    5    6    7    ..   n
      A            -    -    0.2  -    -    -    -    -    -
      R            -    -    0.3  -    -    -    -    -    -
      F            -    -    0.5  -    -    -    -    -    -
      G            -    -    0    -    -    -    -    -    -
      ..           -    -    0.8  -    -    -    -    -    -
      Y            -    -    1.2  -    -    -    -    -    -
    ⚙ → Families and Domains. Pipeline: MSA → Profile → Structure optimization. Senior, et al. (2020). Nature, 577(7792), 706–710.
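The step from an MSA to a profile on slide 16 can be sketched as a simple log-odds PSSM. This is a textbook-style illustration under stated assumptions, not the feature pipeline of AlphaFold 1: gaps are ignored, a pseudocount of 1 is added to every amino acid, and the background distribution is taken to be uniform. The toy alignment and the function names are hypothetical.

```python
import math
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

def column_frequencies(msa, pseudocount=1.0):
    """Per-position amino acid frequencies of an aligned set of sequences."""
    length = len(msa[0])
    if any(len(seq) != length for seq in msa):
        raise ValueError("all sequences in an MSA must have the same length")
    profile = []
    for pos in range(length):
        # Count residues in this column, skipping gaps and unknown symbols.
        counts = Counter(seq[pos] for seq in msa if seq[pos] in AMINO_ACIDS)
        total = sum(counts.values()) + pseudocount * len(AMINO_ACIDS)
        profile.append(
            {aa: (counts[aa] + pseudocount) / total for aa in AMINO_ACIDS}
        )
    return profile

def pssm(msa, pseudocount=1.0):
    """Log-odds scores against a uniform background (an assumption)."""
    background = 1.0 / len(AMINO_ACIDS)
    return [
        {aa: math.log2(freq / background) for aa, freq in column.items()}
        for column in column_frequencies(msa, pseudocount)
    ]

# Hypothetical four-sequence alignment; '-' marks a gap.
toy_msa = ["MGAF", "MGSF", "MAAF", "M-AY"]
scores = pssm(toy_msa)
print(f"score of M at position 1: {scores[0]['M']:.2f}")
print(f"score of W at position 1: {scores[0]['W']:.2f}")
```

Conserved residues receive positive scores and residues never observed at a position receive negative ones, which is the signal the profile passes on to the network.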
  • 17. AlphaFold 1 Senior, et al. (2020). Nature, 577(7792), 706–710.
  • 18. AlphaFold 1 Pipeline: input sequence (MGAFGHGFG TYHKLAALED GTLKHHAKLQ PHLSLLCMF…) → MSA → Profile (X) → the ML model (convolutional neural network) → the distogram (y). The central component: • A convolutional neural network • Trained on PDB structures • It predicts the distances $d_{ij}$ between the Cβ atoms of pairs, ij, of residues of a protein. Senior, et al. (2020). Nature, 577(7792), 706–710.
  • 19. AlphaFold 1 The distogram. Residue 29: the predicted probability distributions for distances of residue 29 to all other residues (41) Senior, et al. (2020). Nature, 577(7792), 706–710.
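A distogram stores, for every residue pair, a probability distribution over distance bins rather than a single distance. The sketch below shows the two bookkeeping steps this implies: turning known Cβ coordinates into bin indices (the kind of target a network could be trained on) and reading an expected distance back from a predicted distribution. The bin range of 2 to 22 Å, the 64 bins and all function names are illustrative assumptions and are not taken from the slides.

```python
import math

def bin_index(distance, min_dist=2.0, max_dist=22.0, num_bins=64):
    """Map a distance in angstroms to the index of its histogram bin."""
    if distance <= min_dist:
        return 0
    if distance >= max_dist:
        return num_bins - 1
    width = (max_dist - min_dist) / num_bins
    return min(num_bins - 1, int((distance - min_dist) / width))

def distance_bins(coords, **bin_args):
    """Binned pairwise distance matrix for a list of C-beta coordinates."""
    n = len(coords)
    return [
        [bin_index(math.dist(coords[i], coords[j]), **bin_args) for j in range(n)]
        for i in range(n)
    ]

def expected_distance(probabilities, min_dist=2.0, max_dist=22.0):
    """Mean distance implied by one residue pair's bin probabilities."""
    width = (max_dist - min_dist) / len(probabilities)
    centers = [min_dist + (k + 0.5) * width for k in range(len(probabilities))]
    return sum(p * c for p, c in zip(probabilities, centers))

# Hypothetical coordinates for three C-beta atoms.
toy_coords = [(0.0, 0.0, 0.0), (12.0, 0.0, 0.0), (0.0, 30.0, 0.0)]
bins = distance_bins(toy_coords)
print(bins)

# A distribution that puts all its mass in bin 32.
one_hot = [0.0] * 64
one_hot[32] = 1.0
print(expected_distance(one_hot))
```

Predicting a full distribution lets the model express uncertainty: a sharp peak means a confident distance, a flat distribution means the pair is poorly constrained.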
  • 20. AlphaFold 1 Protein folding. Pipeline: input sequence (MGAFGHGFG TYHKLAALED GTLKHHAKLQ PHLSLLCMF…) → MSA → Profile (X) → the ML model (convolutional neural network) → the distogram (y). Gradient descent: • Rotate the phi and psi angles • Match the predicted Cβ–Cβ distances. Senior, et al. (2020). Nature, 577(7792), 706–710.
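The folding step on slide 20 adjusts backbone angles by gradient descent until the structure agrees with the predicted distances. The toy below is a deliberately reduced analogue, not the AlphaFold 1 procedure: the chain lives in 2D with unit-length bonds, one turning angle per joint stands in for the phi and psi angles, the loss is a plain squared error against target distances, and the gradient is taken numerically. All names and values are hypothetical.

```python
import math

def chain_coords(angles):
    """2D chain with unit bonds; each angle turns the heading at one joint."""
    points = [(0.0, 0.0), (1.0, 0.0)]   # first bond fixed along the x axis
    heading = 0.0
    for angle in angles:
        heading += angle
        x, y = points[-1]
        points.append((x + math.cos(heading), y + math.sin(heading)))
    return points

def pairwise_distances(points):
    n = len(points)
    return {
        (i, j): math.dist(points[i], points[j])
        for i in range(n) for j in range(i + 1, n)
    }

def loss(angles, target):
    """Squared disagreement between the chain's distances and the targets."""
    current = pairwise_distances(chain_coords(angles))
    return sum((current[pair] - target[pair]) ** 2 for pair in target)

def numerical_gradient(angles, target, eps=1e-6):
    grad = []
    for k in range(len(angles)):
        up, down = list(angles), list(angles)
        up[k] += eps
        down[k] -= eps
        grad.append((loss(up, target) - loss(down, target)) / (2 * eps))
    return grad

def fold(angles, target, step=0.1, iterations=500):
    """Gradient descent that halves the step whenever a move does not help."""
    angles = list(angles)
    current = loss(angles, target)
    for _ in range(iterations):
        grad = numerical_gradient(angles, target)
        trial = [a - step * g for a, g in zip(angles, grad)]
        trial_loss = loss(trial, target)
        if trial_loss < current:
            angles, current = trial, trial_loss
        else:
            step *= 0.5
    return angles, current

# Target distances come from a hypothetical "true" set of angles.
target = pairwise_distances(chain_coords([0.8, -0.6, 1.0]))
start = [0.1, 0.1, 0.1]
initial_loss = loss(start, target)
folded_angles, final_loss = fold(start, target)
print(f"loss before: {initial_loss:.4f}, after: {final_loss:.6f}")
```

Distances alone cannot tell a structure from its mirror image, so a procedure of this kind can converge to either one; like any gradient descent it can also stop in a local minimum.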
  • 21. Senior, et al. (2020). Nature, 577(7792), 706–710.
  • 22. Jumper, J., Evans, R., Pritzel, A. et al. Nature 596, 583–589 (2021).
  • 23. AlphaFold References: 1. Senior, et al. (2020). Improved protein structure prediction using potentials from deep learning. Nature, 577(7792), 706–710. 2. Jumper, J., Evans, R., Pritzel, A. et al. Highly accurate protein structure prediction with AlphaFold. Nature 596, 583–589 (2021).
  • 25. Back to AlphaFold 2. X: Sequence (MGAFGHGFG TYHKLAALED GTLKHHAKLQ PHLSLLCMF…) → y: Structure • Attention-based neural network • Transformer-based • Method inspired by biology, physics and machine learning • Trained with: • ~170,000 PDB structures • UniProt sequences