SlideShare a Scribd company logo
fast and flexible probabilistic modeling in python
jmschreiber91
@jmschrei
@jmschreiber91
Jacob Schreiber
PhD student, Paul G. Allen School of
Computer Science, University of
Washington, Seattle WA
Maxwell Libbrecht
Postdoc, Department of Genome Sciences,
University of Washington, Seattle WA
Assistant Professor, Simon Fraser University,
Vancouver BC
noble.gs.washington.edu/~maxwl/
fast and flexible probabilistic modeling in python
jmschreiber91
@jmschrei
@jmschreiber91
Jacob Schreiber
PhD student, Paul G. Allen School of
Computer Science, University of
Washington, Seattle WA
Maxwell Libbrecht
Postdoc, Department of Genome Sciences,
University of Washington, Seattle WA
Assistant Professor, Simon Fraser University,
Vancouver BC
noble.gs.washington.edu/~maxwl/
Follow along with the Jupyter tutorial
3
goo.gl/cbE7rs
What is probabilistic modeling?
4goo.gl/cbE7rs
Tasks:
• Sample from a distribution.
• Learn parameters of a
distribution from data.
• Predict which distribution
generated a data example.
When should you use pomegranate?
5
complexity of probabilistic modeling task
simple complexz }| {
• logistic regression
• sample from a Gaussian
distribution
• speed is not important
Use scipy or
scikit-learn
z }| {
• distributions not implemented in
pomegranate
• problem-specific inference
algorithms
Use custom
packages (in R or
stand-alone)
z }| {
Use
pomegranate
goo.gl/cbE7rs
When should you use pomegranate?
6
z }| {
Use pomegranate
probabilistic modeling tasks
orthogonaltasks
• Scientific computing: Use numpy, scipy
• Non-probabilistic supervised learning (SVMs,
neural networks): Use scikit-learn
• Visualizing probabilistic models: Use matplotlib
• Bayesian statistics: Check out PyMC3
goo.gl/cbE7rs
Overview: this talk
7
Overview
Major Models/Model Stacks
1. General Mixture Models
2. Hidden Markov Models
3. Bayesian Networks
4. Bayes Classifiers
Finale: Train a mixture of HMMs in parallel
The API is common to all models
8
All models have these methods!
All models composed of
distributions (like GMM, HMM...)
have these methods too!
model.log_probability(X) / model.probability(X)
model.sample()
model.fit(X, weights, inertia)
model.summarize(X, weights)
model.from_summaries(inertia)
model.predict(X)
model.predict_proba(X)
model.predict_log_proba(X)
Model.from_samples(X, weights)
goo.gl/cbE7rs
Overview: supported models
Six Main Models:
1. Probability Distributions
2. General Mixture Models
3. Markov Chains
4. Hidden Markov Models
5. Bayes Classifiers / Naive Bayes
6. Bayesian Networks
9
Two Helper Models:
1. k-means++/kmeans||
2. Factor Graphs
pomegranate supports many distributions
10
Univariate Distributions
1. UniformDistribution
2. BernoulliDistribution
3. NormalDistribution
4. LogNormalDistribution
5. ExponentialDistribution
6. BetaDistribution
7. GammaDistribution
8. DiscreteDistribution
9. PoissonDistribution
Kernel Densities
1. GaussianKernelDensity
2. UniformKernelDensity
3. TriangleKernelDensity
Multivariate Distributions
1. IndependentComponentsDistributio
n
2. MultivariateGaussianDistribution
3. DirichletDistribution
4. ConditionalProbabilityTable
5. JointProbabilityTable
11
mu,	sig	=	0,	2	
a	=	NormalDistribution(mu,	sig)
X	=	[0,	1,	1,	2,	1.5,	6,	7,	8,	7]	
a	=	GaussianKernelDensity(X)
Models can be created from known values
12
Models can be learned from data
X	=	numpy.random.normal(0,	1,	100)	
a	=	NormalDistribution.from_samples(X)
Overview: model stacking in pomegranate
13
Distributions
Bayes Classifiers
Markov Chains
General Mixture Models
Hidden Markov Models
Bayesian Networks
D BC MC GMM HMM BN
Sub-model
Wrapper model
Overview: model stacking in pomegranate
14
Distributions
Bayes Classifiers
Markov Chains
General Mixture Models
Hidden Markov Models
Bayesian Networks
D BC MC GMM HMM BN
Sub-model
Wrapper model
Overview: model stacking in pomegranate
15
Distributions
Bayes Classifiers
Markov Chains
General Mixture Models
Hidden Markov Models
Bayesian Networks
D BC MC GMM HMM BN
Sub-model
Wrapper model
Overview: model stacking in pomegranate
16
Distributions
Bayes Classifiers
Markov Chains
General Mixture Models
Hidden Markov Models
Bayesian Networks
D BC MC GMM HMM BN
17
pomegranate can be faster than numpy
Fitting a Normal Distribution to 1,000 samples
18
pomegranate can be faster than numpy
Fitting Multivariate Gaussian to 10,000,000 samples of 10
dimensions
19
pomegranate uses BLAS internally
20
pomegranate just merged GPU support
21
pomegranate uses additive summarization
pomegranate reduces data to sufficient statistics for updates
and so only has to go datasets once (for all models).
Here is an example of the Normal Distribution sufficient
statistics
22
pomegranate supports out-of-core learning
Batches from a dataset can be reduced to additive summary
statistics, enabling exact updates from data that can’t fit in memory.
23
Parallelization exploits additive summaries
Extract summaries
+
+
+
+
New Parameters
24
pomegranate supports semisupervised learning
Summary statistics from supervised models can be added to
summary statistics from unsupervised models to train a single model
on a mixture of labeled and unlabeled data.
25
pomegranate supports semisupervised learning
Supervised Accuracy: 0.93 Semisupervised Accuracy: 0.96
26
pomegranate can be faster than scipy
27
pomegranate uses aggressive caching
29
30
Example ‘blast’ from Gossip Girl
Spotted: Lonely Boy. Can't believe the love of his life has
returned. If only she knew who he was. But everyone knows
Serena. And everyone is talking. Wonder what Blair Waldorf
thinks. Sure, they're BFF's, but we always thought Blair's
boyfriend Nate had a thing for Serena.
31
How do we encode these ‘blasts’?
Better lock it down with Nate, B. Clock's ticking.
+1 Nate
-1 Blair
32
How do we encode these ‘blasts’?
This just in: S and B committing a crime of fashion. Who
doesn't love a five-finger discount. Especially if it's the middle
one.
-1 Blair
-1 Serena
33
Simple summations don’t work well
34
Beta distributions can model uncertainty
35
Beta distributions can model uncertainty
36
Beta distributions can model uncertainty
Overview: this talk
37
Overview
Major Models/Model Stacks
1. General Mixture Models
2. Hidden Markov Models
3. Bayesian Networks
4. Bayes Classifiers
Finale: Train a mixture of HMMs in parallel
GMMs can model complex distributions
38
GMMs can model complex distributions
39
model = GeneralMixtureModel.from_samples(NormalDistribution, 2, X)
GMMs can model complex distributions
40
An exponential distribution is not right
41
model = ExponentialDistribution.from_samples(X)
A mixture of exponentials is better
42
model = GeneralMixtureModel.from_samples(ExponentialDistribution, 2, X)
Heterogeneous mixtures natively supported
43
model = GeneralMixtureModel.from_samples([ExponentialDistribution, UniformDistribution], 2, X)
GMMs faster than sklearn
44
Overview: this talk
45
Overview
Major Models/Model Stacks
1. General Mixture Models
2. Hidden Markov Models
3. Bayesian Networks
4. Bayes Classifiers
Finale: Train a mixture of HMMs in parallel
CG enrichment detection HMM
46
GACTACGACTCGCGCTCGCACGTCGCTCGACATCATCGACA
CG enrichment detection HMM
GACTACGACTCGCGCTCGCACGTCGCTCGACATCATCGACA
47
pomegranate HMMs are feature rich
48
GMM-HMM easy to define
49
HMMs are faster than hmmlearn
50
Overview: this talk
51
Overview
Major Models/Model Stacks
1. General Mixture Models
2. Hidden Markov Models
3. Bayesian Networks
4. Bayes Classifiers
Finale: Train a mixture of HMMs in parallel
Bayesian networks
52
Bayesian networks are powerful inference tools which define a
dependency structure between variables.
Sprinkler
Wet Grass
Rain
Bayesian networks
53
Sprinkler
Wet Grass
Rain
Two main difficult tasks:
(1) Inference given incomplete information
(2) Learning the dependency structure from data
Bayesian network inference
54
Inference is done using Belief Bropagation on a factor graph.
Messages are sent from one side to the other until convergence.
Rain
Sprinkler
Grass
Wet
R/S/GW
Factor
Grass
Wet
Rain
Sprinkler
Marginals Factors
Inference in a medical diagnostic network
55
BRCA 2 BRCA 1 LCT
BLOATLE LOA VOM AC
PREGLIOC
Inference in a medical diagnostic network
56
BRCA 2 BRCA 1 LCT
BLOATLE LOA VOM AC
PREGLIOC
Bayesian network structure learning
57
???
Three primary ways:
● “Search and score” / Exact
● “Constraint Learning” / PC
● Heuristics
Bayesian network structure learning
58
???
pomegranate supports:
● “Search and score” / Exact
● “Constraint Learning” / PC
● Heuristics
Exact structure learning is intractable
???
59
Chow-Liu trees are fast approximations
60
pomegranate supports four algorithms
61
Constraint graphs merge data + knowledge
62
BRCA 2 BRCA 1 LCT
BLOATLE LOA VOM AC
PREGLIOC
genetic conditions
conditions
symptoms
Constraint graphs merge data + knowledge
63
genetic conditions
diseases
symptoms
Modeling the global stock market
64
Constraint graph published in PeerJ CS
65
Overview: this talk
66
Overview
Major Models/Model Stacks
1. General Mixture Models
2. Hidden Markov Models
3. Bayesian Networks
4. Bayes Classifiers
Finale: Train a mixture of HMMs in parallel
Bayes classifiers rely on Bayes’ rule
67
68
Let’s build a classifier on this data
69
Likelihood function alone is not good
70
Priors can model class imbalance
71
The posterior is a good model of the data
Naive Bayes assumes independent features
72
Naive Bayes produces ellipsoid boundaries
73model = NaiveBayes.from_samples(NormalDistribution, X, y)

Naive Bayes can be heterogeneous
74
Data can fall under different distributions
75
Using appropriate distributions is better
76
model = NaiveBayes.from_samples(NormalDistribution, X_train, y_train)
print "Gaussian Naive Bayes: ", (model.predict(X_test) == y_test).mean()

clf = GaussianNB().fit(X_train, y_train)

print "sklearn Gaussian Naive Bayes: ", (clf.predict(X_test) == y_test).mean()

model = NaiveBayes.from_samples([NormalDistribution, LogNormalDistribution,
ExponentialDistribution], X_train, y_train

print "Heterogeneous Naive Bayes: ", (model.predict(X_test) == y_test).mean()
Gaussian Naive Bayes: 0.798

sklearn Gaussian Naive Bayes: 0.798

Heterogeneous Naive Bayes: 0.844
This additional flexibility is just as fast
77
Bayes classifiers don’t require independence
78
naive accuracy: 0.929 bayes classifier accuracy: 0.966
Gaussian mixture model Bayes classifier
79
Creating complex Bayes classifiers is easy
80
gmm_a = GeneralMixtureModel.from_samples(MultivariateGaussianDistribution, 2, X[y == 0])

gmm_b = GeneralMixtureModel.from_samples(MultivariateGaussianDistribution, 2, X[y == 1])

model_b = BayesClassifier([gmm_a, gmm_b], weights=numpy.array([1-y.mean(), y.mean()]))

Creating complex Bayes classifiers is easy
81
mc_a = MarkovChain.from_samples(X[y == 0])
mc_b = MarkovChain.from_samples(X[y == 1])

model_b = BayesClassifier([mc_a, mc_b], weights=numpy.array([1-y.mean(), y.mean()]))
hmm_a = HiddenMarkovModel.from_samples(X[y == 0])
hmm_b = HiddenMarkovModel.from_samples(X[y == 1])

model_b = BayesClassifier([hmm_a, hmm_b], weights=numpy.array([1-y.mean(), y.mean()]))
bn_a = BayesianNetwork.from_samples(X[y == 0])
bn_b = BayesianNetwork.from_samples(X[y == 1])

model_b = BayesClassifier([bn_a, bn_b], weights=numpy.array([1-y.mean(), y.mean()]))



Overview: this talk
82
Overview
Major Models/Model Stacks
1. General Mixture Models
2. Hidden Markov Models
3. Bayesian Networks
4. Bayes Classifiers
Finale: Train a mixture of HMMs in parallel
Training a mixture of HMMs in parallel
83
Creating a mixture of HMMs is just as simple as passing the
HMMs into a GMM as if it were any other distribution
Training a mixture of HMMs in parallel
84fit(model, X, n_jobs=n)
Documentation available at Readthedocs
85
Tutorials available on github
86
https://github.com/jmschrei/pomegranate/tree/master/tutorials
Acknowledgements
87
fast and flexible probabilistic modeling in python
jmschreiber91
@jmschrei
@jmschreiber91
Jacob Schreiber
PhD student, Paul G. Allen School of
Computer Science, University of
Washington, Seattle WA
Maxwell Libbrecht
Postdoc, Department of Genome Sciences,
University of Washington, Seattle WA
Assistant Professor, Simon Fraser University,
Vancouver BC
noble.gs.washington.edu/~maxwl/

More Related Content

What's hot

CSC446: Pattern Recognition (LN6)
CSC446: Pattern Recognition (LN6)CSC446: Pattern Recognition (LN6)
CSC446: Pattern Recognition (LN6)
Mostafa G. M. Mostafa
 
LDM_ImageSythesis.pptx
LDM_ImageSythesis.pptxLDM_ImageSythesis.pptx
LDM_ImageSythesis.pptx
AkankshaRawat53
 
[IJCAI 2023 - Poster] SemiGNN-PPI: Self-Ensembling Multi-Graph Neural Network...
[IJCAI 2023 - Poster] SemiGNN-PPI: Self-Ensembling Multi-Graph Neural Network...[IJCAI 2023 - Poster] SemiGNN-PPI: Self-Ensembling Multi-Graph Neural Network...
[IJCAI 2023 - Poster] SemiGNN-PPI: Self-Ensembling Multi-Graph Neural Network...
Ziyuan Zhao
 
Machine Learning Models in Production
Machine Learning Models in ProductionMachine Learning Models in Production
Machine Learning Models in Production
DataWorks Summit
 
Genetic algorithm raktim
Genetic algorithm raktimGenetic algorithm raktim
Genetic algorithm raktim
Raktim Halder
 
Demystifying Xgboost
Demystifying XgboostDemystifying Xgboost
Demystifying Xgboost
halifaxchester
 
From neural networks to deep learning
From neural networks to deep learningFrom neural networks to deep learning
From neural networks to deep learning
Viet-Trung TRAN
 
Data Analysis: Evaluation Metrics for Supervised Learning Models of Machine L...
Data Analysis: Evaluation Metrics for Supervised Learning Models of Machine L...Data Analysis: Evaluation Metrics for Supervised Learning Models of Machine L...
Data Analysis: Evaluation Metrics for Supervised Learning Models of Machine L...
Md. Main Uddin Rony
 
Introduction to deep learning
Introduction to deep learningIntroduction to deep learning
Introduction to deep learning
Amr Rashed
 
Deep learning tutorial 9/2019
Deep learning tutorial 9/2019Deep learning tutorial 9/2019
Deep learning tutorial 9/2019
Amr Rashed
 
Automated Machine Learning
Automated Machine LearningAutomated Machine Learning
Automated Machine Learning
safa cimenli
 
Few shot learning/ one shot learning/ machine learning
Few shot learning/ one shot learning/ machine learningFew shot learning/ one shot learning/ machine learning
Few shot learning/ one shot learning/ machine learning
ﺁﺻﻒ ﻋﻠﯽ ﻣﯿﺮ
 
Relational knowledge distillation
Relational knowledge distillationRelational knowledge distillation
Relational knowledge distillation
NAVER Engineering
 
An Introduction to TensorFlow architecture
An Introduction to TensorFlow architectureAn Introduction to TensorFlow architecture
An Introduction to TensorFlow architecture
Mani Goswami
 
Introduction to Machine Learning (case studies)
Introduction to Machine Learning (case studies)Introduction to Machine Learning (case studies)
Introduction to Machine Learning (case studies)
Dmitry Efimov
 
LLM presentation final
LLM presentation finalLLM presentation final
LLM presentation finalRuth Griffin
 
Build, Train & Deploy Machine Learning Models at Scale
Build, Train & Deploy Machine Learning Models at ScaleBuild, Train & Deploy Machine Learning Models at Scale
Build, Train & Deploy Machine Learning Models at Scale
Amazon Web Services
 
Lecture 11: ML Deployment & Monitoring (Full Stack Deep Learning - Spring 2021)
Lecture 11: ML Deployment & Monitoring (Full Stack Deep Learning - Spring 2021)Lecture 11: ML Deployment & Monitoring (Full Stack Deep Learning - Spring 2021)
Lecture 11: ML Deployment & Monitoring (Full Stack Deep Learning - Spring 2021)
Sergey Karayev
 
Training Neural Networks
Training Neural NetworksTraining Neural Networks
Training Neural Networks
Databricks
 
Stable Diffusion path
Stable Diffusion pathStable Diffusion path
Stable Diffusion path
Vitaly Bondar
 

What's hot (20)

CSC446: Pattern Recognition (LN6)
CSC446: Pattern Recognition (LN6)CSC446: Pattern Recognition (LN6)
CSC446: Pattern Recognition (LN6)
 
LDM_ImageSythesis.pptx
LDM_ImageSythesis.pptxLDM_ImageSythesis.pptx
LDM_ImageSythesis.pptx
 
[IJCAI 2023 - Poster] SemiGNN-PPI: Self-Ensembling Multi-Graph Neural Network...
[IJCAI 2023 - Poster] SemiGNN-PPI: Self-Ensembling Multi-Graph Neural Network...[IJCAI 2023 - Poster] SemiGNN-PPI: Self-Ensembling Multi-Graph Neural Network...
[IJCAI 2023 - Poster] SemiGNN-PPI: Self-Ensembling Multi-Graph Neural Network...
 
Machine Learning Models in Production
Machine Learning Models in ProductionMachine Learning Models in Production
Machine Learning Models in Production
 
Genetic algorithm raktim
Genetic algorithm raktimGenetic algorithm raktim
Genetic algorithm raktim
 
Demystifying Xgboost
Demystifying XgboostDemystifying Xgboost
Demystifying Xgboost
 
From neural networks to deep learning
From neural networks to deep learningFrom neural networks to deep learning
From neural networks to deep learning
 
Data Analysis: Evaluation Metrics for Supervised Learning Models of Machine L...
Data Analysis: Evaluation Metrics for Supervised Learning Models of Machine L...Data Analysis: Evaluation Metrics for Supervised Learning Models of Machine L...
Data Analysis: Evaluation Metrics for Supervised Learning Models of Machine L...
 
Introduction to deep learning
Introduction to deep learningIntroduction to deep learning
Introduction to deep learning
 
Deep learning tutorial 9/2019
Deep learning tutorial 9/2019Deep learning tutorial 9/2019
Deep learning tutorial 9/2019
 
Automated Machine Learning
Automated Machine LearningAutomated Machine Learning
Automated Machine Learning
 
Few shot learning/ one shot learning/ machine learning
Few shot learning/ one shot learning/ machine learningFew shot learning/ one shot learning/ machine learning
Few shot learning/ one shot learning/ machine learning
 
Relational knowledge distillation
Relational knowledge distillationRelational knowledge distillation
Relational knowledge distillation
 
An Introduction to TensorFlow architecture
An Introduction to TensorFlow architectureAn Introduction to TensorFlow architecture
An Introduction to TensorFlow architecture
 
Introduction to Machine Learning (case studies)
Introduction to Machine Learning (case studies)Introduction to Machine Learning (case studies)
Introduction to Machine Learning (case studies)
 
LLM presentation final
LLM presentation finalLLM presentation final
LLM presentation final
 
Build, Train & Deploy Machine Learning Models at Scale
Build, Train & Deploy Machine Learning Models at ScaleBuild, Train & Deploy Machine Learning Models at Scale
Build, Train & Deploy Machine Learning Models at Scale
 
Lecture 11: ML Deployment & Monitoring (Full Stack Deep Learning - Spring 2021)
Lecture 11: ML Deployment & Monitoring (Full Stack Deep Learning - Spring 2021)Lecture 11: ML Deployment & Monitoring (Full Stack Deep Learning - Spring 2021)
Lecture 11: ML Deployment & Monitoring (Full Stack Deep Learning - Spring 2021)
 
Training Neural Networks
Training Neural NetworksTraining Neural Networks
Training Neural Networks
 
Stable Diffusion path
Stable Diffusion pathStable Diffusion path
Stable Diffusion path
 

Similar to Maxwell W Libbrecht - pomegranate: fast and flexible probabilistic modeling in python

Distance-based bias in model-directed optimization of additively decomposable...
Distance-based bias in model-directed optimization of additively decomposable...Distance-based bias in model-directed optimization of additively decomposable...
Distance-based bias in model-directed optimization of additively decomposable...
Martin Pelikan
 
Dependency patterns for latent variable discovery
Dependency patterns for latent variable discovery Dependency patterns for latent variable discovery
Dependency patterns for latent variable discovery
Bayesian Intelligence
 
Data driven model optimization [autosaved]
Data driven model optimization [autosaved]Data driven model optimization [autosaved]
Data driven model optimization [autosaved]
Russell Jarvis
 
Neural Nets Deconstructed
Neural Nets DeconstructedNeural Nets Deconstructed
Neural Nets Deconstructed
Paul Sterk
 
Higher-order Link Prediction GraphEx
Higher-order Link Prediction GraphExHigher-order Link Prediction GraphEx
Higher-order Link Prediction GraphEx
Austin Benson
 
The application of artificial intelligence
The application of artificial intelligenceThe application of artificial intelligence
The application of artificial intelligencePallavi Vashistha
 
NIPS2007: deep belief nets
NIPS2007: deep belief netsNIPS2007: deep belief nets
NIPS2007: deep belief netszukun
 
Mining Maximally Banded Matrices in Binary Data
 Mining Maximally Banded Matrices in Binary Data Mining Maximally Banded Matrices in Binary Data
Mining Maximally Banded Matrices in Binary DataFaris Alqadah
 
Cornell Pbsb 20090126 Nets
Cornell Pbsb 20090126 NetsCornell Pbsb 20090126 Nets
Cornell Pbsb 20090126 Nets
Mark Gerstein
 
Simplicial closure and higher-order link prediction (SIAMNS18)
Simplicial closure and higher-order link prediction (SIAMNS18)Simplicial closure and higher-order link prediction (SIAMNS18)
Simplicial closure and higher-order link prediction (SIAMNS18)
Austin Benson
 
Composing graphical models with neural networks for structured representatio...
Composing graphical models with  neural networks for structured representatio...Composing graphical models with  neural networks for structured representatio...
Composing graphical models with neural networks for structured representatio...
Jeongmin Cha
 
OpenCL applications in genomics
OpenCL applications in genomicsOpenCL applications in genomics
OpenCL applications in genomicsUSC
 
Magpie
MagpieMagpie
Magpie
Jan Stypka
 
Alignment Approaches II: Long Reads
Alignment Approaches II: Long ReadsAlignment Approaches II: Long Reads
Alignment Approaches II: Long Reads
Genome Reference Consortium
 
Are you sure about that?! Uncertainty Quantification in AI
Are you sure about that?! Uncertainty Quantification in AIAre you sure about that?! Uncertainty Quantification in AI
Are you sure about that?! Uncertainty Quantification in AI
inovex GmbH
 
Uncertainty Quantification in AI
Uncertainty Quantification in AIUncertainty Quantification in AI
Uncertainty Quantification in AI
Florian Wilhelm
 
PPT - Deep and Confident Prediction For Time Series at Uber
PPT - Deep and Confident Prediction For Time Series at UberPPT - Deep and Confident Prediction For Time Series at Uber
PPT - Deep and Confident Prediction For Time Series at Uber
Jisang Yoon
 
Dynamic Programming Algorithm for the Prediction for Gene Structure
Dynamic Programming Algorithm for the Prediction for Gene StructureDynamic Programming Algorithm for the Prediction for Gene Structure
Dynamic Programming Algorithm for the Prediction for Gene StructureMarilyn Arceo
 
Jillian ms defense-4-14-14-ja-novid3
Jillian ms defense-4-14-14-ja-novid3Jillian ms defense-4-14-14-ja-novid3
Jillian ms defense-4-14-14-ja-novid3Jillian Aurisano
 

Similar to Maxwell W Libbrecht - pomegranate: fast and flexible probabilistic modeling in python (20)

Distance-based bias in model-directed optimization of additively decomposable...
Distance-based bias in model-directed optimization of additively decomposable...Distance-based bias in model-directed optimization of additively decomposable...
Distance-based bias in model-directed optimization of additively decomposable...
 
Dependency patterns for latent variable discovery
Dependency patterns for latent variable discovery Dependency patterns for latent variable discovery
Dependency patterns for latent variable discovery
 
Data driven model optimization [autosaved]
Data driven model optimization [autosaved]Data driven model optimization [autosaved]
Data driven model optimization [autosaved]
 
Neural Nets Deconstructed
Neural Nets DeconstructedNeural Nets Deconstructed
Neural Nets Deconstructed
 
Higher-order Link Prediction GraphEx
Higher-order Link Prediction GraphExHigher-order Link Prediction GraphEx
Higher-order Link Prediction GraphEx
 
The application of artificial intelligence
The application of artificial intelligenceThe application of artificial intelligence
The application of artificial intelligence
 
NIPS2007: deep belief nets
NIPS2007: deep belief netsNIPS2007: deep belief nets
NIPS2007: deep belief nets
 
Mining Maximally Banded Matrices in Binary Data
 Mining Maximally Banded Matrices in Binary Data Mining Maximally Banded Matrices in Binary Data
Mining Maximally Banded Matrices in Binary Data
 
Cornell Pbsb 20090126 Nets
Cornell Pbsb 20090126 NetsCornell Pbsb 20090126 Nets
Cornell Pbsb 20090126 Nets
 
Simplicial closure and higher-order link prediction (SIAMNS18)
Simplicial closure and higher-order link prediction (SIAMNS18)Simplicial closure and higher-order link prediction (SIAMNS18)
Simplicial closure and higher-order link prediction (SIAMNS18)
 
Composing graphical models with neural networks for structured representatio...
Composing graphical models with  neural networks for structured representatio...Composing graphical models with  neural networks for structured representatio...
Composing graphical models with neural networks for structured representatio...
 
OpenCL applications in genomics
OpenCL applications in genomicsOpenCL applications in genomics
OpenCL applications in genomics
 
Magpie
MagpieMagpie
Magpie
 
Alignment Approaches II: Long Reads
Alignment Approaches II: Long ReadsAlignment Approaches II: Long Reads
Alignment Approaches II: Long Reads
 
Are you sure about that?! Uncertainty Quantification in AI
Are you sure about that?! Uncertainty Quantification in AIAre you sure about that?! Uncertainty Quantification in AI
Are you sure about that?! Uncertainty Quantification in AI
 
Uncertainty Quantification in AI
Uncertainty Quantification in AIUncertainty Quantification in AI
Uncertainty Quantification in AI
 
Biological Network Inference via Gaussian Graphical Models
Biological Network Inference via Gaussian Graphical ModelsBiological Network Inference via Gaussian Graphical Models
Biological Network Inference via Gaussian Graphical Models
 
PPT - Deep and Confident Prediction For Time Series at Uber
PPT - Deep and Confident Prediction For Time Series at UberPPT - Deep and Confident Prediction For Time Series at Uber
PPT - Deep and Confident Prediction For Time Series at Uber
 
Dynamic Programming Algorithm for the Prediction for Gene Structure
Dynamic Programming Algorithm for the Prediction for Gene StructureDynamic Programming Algorithm for the Prediction for Gene Structure
Dynamic Programming Algorithm for the Prediction for Gene Structure
 
Jillian ms defense-4-14-14-ja-novid3
Jillian ms defense-4-14-14-ja-novid3Jillian ms defense-4-14-14-ja-novid3
Jillian ms defense-4-14-14-ja-novid3
 

More from PyData

Michal Mucha: Build and Deploy an End-to-end Streaming NLP Insight System | P...
Michal Mucha: Build and Deploy an End-to-end Streaming NLP Insight System | P...Michal Mucha: Build and Deploy an End-to-end Streaming NLP Insight System | P...
Michal Mucha: Build and Deploy an End-to-end Streaming NLP Insight System | P...
PyData
 
Unit testing data with marbles - Jane Stewart Adams, Leif Walsh
Unit testing data with marbles - Jane Stewart Adams, Leif WalshUnit testing data with marbles - Jane Stewart Adams, Leif Walsh
Unit testing data with marbles - Jane Stewart Adams, Leif Walsh
PyData
 
The TileDB Array Data Storage Manager - Stavros Papadopoulos, Jake Bolewski
The TileDB Array Data Storage Manager - Stavros Papadopoulos, Jake BolewskiThe TileDB Array Data Storage Manager - Stavros Papadopoulos, Jake Bolewski
The TileDB Array Data Storage Manager - Stavros Papadopoulos, Jake Bolewski
PyData
 
Using Embeddings to Understand the Variance and Evolution of Data Science... ...
Using Embeddings to Understand the Variance and Evolution of Data Science... ...Using Embeddings to Understand the Variance and Evolution of Data Science... ...
Using Embeddings to Understand the Variance and Evolution of Data Science... ...
PyData
 
Deploying Data Science for Distribution of The New York Times - Anne Bauer
Deploying Data Science for Distribution of The New York Times - Anne BauerDeploying Data Science for Distribution of The New York Times - Anne Bauer
Deploying Data Science for Distribution of The New York Times - Anne Bauer
PyData
 
Graph Analytics - From the Whiteboard to Your Toolbox - Sam Lerma
Graph Analytics - From the Whiteboard to Your Toolbox - Sam LermaGraph Analytics - From the Whiteboard to Your Toolbox - Sam Lerma
Graph Analytics - From the Whiteboard to Your Toolbox - Sam Lerma
PyData
 
Do Your Homework! Writing tests for Data Science and Stochastic Code - David ...
Do Your Homework! Writing tests for Data Science and Stochastic Code - David ...Do Your Homework! Writing tests for Data Science and Stochastic Code - David ...
Do Your Homework! Writing tests for Data Science and Stochastic Code - David ...
PyData
 
RESTful Machine Learning with Flask and TensorFlow Serving - Carlo Mazzaferro
RESTful Machine Learning with Flask and TensorFlow Serving - Carlo MazzaferroRESTful Machine Learning with Flask and TensorFlow Serving - Carlo Mazzaferro
RESTful Machine Learning with Flask and TensorFlow Serving - Carlo Mazzaferro
PyData
 
Mining dockless bikeshare and dockless scootershare trip data - Stefanie Brod...
Mining dockless bikeshare and dockless scootershare trip data - Stefanie Brod...Mining dockless bikeshare and dockless scootershare trip data - Stefanie Brod...
Mining dockless bikeshare and dockless scootershare trip data - Stefanie Brod...
PyData
 
Avoiding Bad Database Surprises: Simulation and Scalability - Steven Lott
Avoiding Bad Database Surprises: Simulation and Scalability - Steven LottAvoiding Bad Database Surprises: Simulation and Scalability - Steven Lott
Avoiding Bad Database Surprises: Simulation and Scalability - Steven Lott
PyData
 
Words in Space - Rebecca Bilbro
Words in Space - Rebecca BilbroWords in Space - Rebecca Bilbro
Words in Space - Rebecca Bilbro
PyData
 
End-to-End Machine learning pipelines for Python driven organizations - Nick ...
End-to-End Machine learning pipelines for Python driven organizations - Nick ...End-to-End Machine learning pipelines for Python driven organizations - Nick ...
End-to-End Machine learning pipelines for Python driven organizations - Nick ...
PyData
 
Pydata beautiful soup - Monica Puerto
Pydata beautiful soup - Monica PuertoPydata beautiful soup - Monica Puerto
Pydata beautiful soup - Monica Puerto
PyData
 
1D Convolutional Neural Networks for Time Series Modeling - Nathan Janos, Jef...
1D Convolutional Neural Networks for Time Series Modeling - Nathan Janos, Jef...1D Convolutional Neural Networks for Time Series Modeling - Nathan Janos, Jef...
1D Convolutional Neural Networks for Time Series Modeling - Nathan Janos, Jef...
PyData
 
Extending Pandas with Custom Types - Will Ayd
Extending Pandas with Custom Types - Will AydExtending Pandas with Custom Types - Will Ayd
Extending Pandas with Custom Types - Will Ayd
PyData
 
Measuring Model Fairness - Stephen Hoover
Measuring Model Fairness - Stephen HooverMeasuring Model Fairness - Stephen Hoover
Measuring Model Fairness - Stephen Hoover
PyData
 
What's the Science in Data Science? - Skipper Seabold
What's the Science in Data Science? - Skipper SeaboldWhat's the Science in Data Science? - Skipper Seabold
What's the Science in Data Science? - Skipper Seabold
PyData
 
Applying Statistical Modeling and Machine Learning to Perform Time-Series For...
Applying Statistical Modeling and Machine Learning to Perform Time-Series For...Applying Statistical Modeling and Machine Learning to Perform Time-Series For...
Applying Statistical Modeling and Machine Learning to Perform Time-Series For...
PyData
 
Solving very simple substitution ciphers algorithmically - Stephen Enright-Ward
Solving very simple substitution ciphers algorithmically - Stephen Enright-WardSolving very simple substitution ciphers algorithmically - Stephen Enright-Ward
Solving very simple substitution ciphers algorithmically - Stephen Enright-Ward
PyData
 
The Face of Nanomaterials: Insightful Classification Using Deep Learning - An...
The Face of Nanomaterials: Insightful Classification Using Deep Learning - An...The Face of Nanomaterials: Insightful Classification Using Deep Learning - An...
The Face of Nanomaterials: Insightful Classification Using Deep Learning - An...
PyData
 

More from PyData (20)

Michal Mucha: Build and Deploy an End-to-end Streaming NLP Insight System | P...
Michal Mucha: Build and Deploy an End-to-end Streaming NLP Insight System | P...Michal Mucha: Build and Deploy an End-to-end Streaming NLP Insight System | P...
Michal Mucha: Build and Deploy an End-to-end Streaming NLP Insight System | P...
 
Unit testing data with marbles - Jane Stewart Adams, Leif Walsh
Unit testing data with marbles - Jane Stewart Adams, Leif WalshUnit testing data with marbles - Jane Stewart Adams, Leif Walsh
Unit testing data with marbles - Jane Stewart Adams, Leif Walsh
 
The TileDB Array Data Storage Manager - Stavros Papadopoulos, Jake Bolewski
The TileDB Array Data Storage Manager - Stavros Papadopoulos, Jake BolewskiThe TileDB Array Data Storage Manager - Stavros Papadopoulos, Jake Bolewski
The TileDB Array Data Storage Manager - Stavros Papadopoulos, Jake Bolewski
 
Using Embeddings to Understand the Variance and Evolution of Data Science... ...
Using Embeddings to Understand the Variance and Evolution of Data Science... ...Using Embeddings to Understand the Variance and Evolution of Data Science... ...
Using Embeddings to Understand the Variance and Evolution of Data Science... ...
 
Deploying Data Science for Distribution of The New York Times - Anne Bauer
Deploying Data Science for Distribution of The New York Times - Anne BauerDeploying Data Science for Distribution of The New York Times - Anne Bauer
Deploying Data Science for Distribution of The New York Times - Anne Bauer
 
Graph Analytics - From the Whiteboard to Your Toolbox - Sam Lerma
Graph Analytics - From the Whiteboard to Your Toolbox - Sam LermaGraph Analytics - From the Whiteboard to Your Toolbox - Sam Lerma
Graph Analytics - From the Whiteboard to Your Toolbox - Sam Lerma
 
Do Your Homework! Writing tests for Data Science and Stochastic Code - David ...
Do Your Homework! Writing tests for Data Science and Stochastic Code - David ...Do Your Homework! Writing tests for Data Science and Stochastic Code - David ...
Do Your Homework! Writing tests for Data Science and Stochastic Code - David ...
 
RESTful Machine Learning with Flask and TensorFlow Serving - Carlo Mazzaferro
RESTful Machine Learning with Flask and TensorFlow Serving - Carlo MazzaferroRESTful Machine Learning with Flask and TensorFlow Serving - Carlo Mazzaferro
RESTful Machine Learning with Flask and TensorFlow Serving - Carlo Mazzaferro
 
Mining dockless bikeshare and dockless scootershare trip data - Stefanie Brod...
Mining dockless bikeshare and dockless scootershare trip data - Stefanie Brod...Mining dockless bikeshare and dockless scootershare trip data - Stefanie Brod...
Mining dockless bikeshare and dockless scootershare trip data - Stefanie Brod...
 
Avoiding Bad Database Surprises: Simulation and Scalability - Steven Lott
Avoiding Bad Database Surprises: Simulation and Scalability - Steven LottAvoiding Bad Database Surprises: Simulation and Scalability - Steven Lott
Avoiding Bad Database Surprises: Simulation and Scalability - Steven Lott
 
Words in Space - Rebecca Bilbro
Words in Space - Rebecca BilbroWords in Space - Rebecca Bilbro
Words in Space - Rebecca Bilbro
 
End-to-End Machine learning pipelines for Python driven organizations - Nick ...
End-to-End Machine learning pipelines for Python driven organizations - Nick ...End-to-End Machine learning pipelines for Python driven organizations - Nick ...
End-to-End Machine learning pipelines for Python driven organizations - Nick ...
 
Pydata beautiful soup - Monica Puerto
Pydata beautiful soup - Monica PuertoPydata beautiful soup - Monica Puerto
Pydata beautiful soup - Monica Puerto
 
1D Convolutional Neural Networks for Time Series Modeling - Nathan Janos, Jef...
1D Convolutional Neural Networks for Time Series Modeling - Nathan Janos, Jef...1D Convolutional Neural Networks for Time Series Modeling - Nathan Janos, Jef...
1D Convolutional Neural Networks for Time Series Modeling - Nathan Janos, Jef...
 
Extending Pandas with Custom Types - Will Ayd
Extending Pandas with Custom Types - Will AydExtending Pandas with Custom Types - Will Ayd
Extending Pandas with Custom Types - Will Ayd
 
Measuring Model Fairness - Stephen Hoover
Measuring Model Fairness - Stephen HooverMeasuring Model Fairness - Stephen Hoover
Measuring Model Fairness - Stephen Hoover
 
What's the Science in Data Science? - Skipper Seabold
What's the Science in Data Science? - Skipper SeaboldWhat's the Science in Data Science? - Skipper Seabold
What's the Science in Data Science? - Skipper Seabold
 
Applying Statistical Modeling and Machine Learning to Perform Time-Series For...
Applying Statistical Modeling and Machine Learning to Perform Time-Series For...Applying Statistical Modeling and Machine Learning to Perform Time-Series For...
Applying Statistical Modeling and Machine Learning to Perform Time-Series For...
 
Solving very simple substitution ciphers algorithmically - Stephen Enright-Ward
Solving very simple substitution ciphers algorithmically - Stephen Enright-WardSolving very simple substitution ciphers algorithmically - Stephen Enright-Ward
Solving very simple substitution ciphers algorithmically - Stephen Enright-Ward
 
The Face of Nanomaterials: Insightful Classification Using Deep Learning - An...
The Face of Nanomaterials: Insightful Classification Using Deep Learning - An...The Face of Nanomaterials: Insightful Classification Using Deep Learning - An...
The Face of Nanomaterials: Insightful Classification Using Deep Learning - An...
 

Recently uploaded

FIDO Alliance Osaka Seminar: The WebAuthn API and Discoverable Credentials.pdf
FIDO Alliance Osaka Seminar: The WebAuthn API and Discoverable Credentials.pdfFIDO Alliance Osaka Seminar: The WebAuthn API and Discoverable Credentials.pdf
FIDO Alliance Osaka Seminar: The WebAuthn API and Discoverable Credentials.pdf
FIDO Alliance
 
DevOps and Testing slides at DASA Connect
DevOps and Testing slides at DASA ConnectDevOps and Testing slides at DASA Connect
DevOps and Testing slides at DASA Connect
Kari Kakkonen
 
Free Complete Python - A step towards Data Science
Free Complete Python - A step towards Data ScienceFree Complete Python - A step towards Data Science
Free Complete Python - A step towards Data Science
RinaMondal9
 
20240607 QFM018 Elixir Reading List May 2024
20240607 QFM018 Elixir Reading List May 202420240607 QFM018 Elixir Reading List May 2024
20240607 QFM018 Elixir Reading List May 2024
Matthew Sinclair
 
Elizabeth Buie - Older adults: Are we really designing for our future selves?
Elizabeth Buie - Older adults: Are we really designing for our future selves?Elizabeth Buie - Older adults: Are we really designing for our future selves?
Elizabeth Buie - Older adults: Are we really designing for our future selves?
Nexer Digital
 
GridMate - End to end testing is a critical piece to ensure quality and avoid...
GridMate - End to end testing is a critical piece to ensure quality and avoid...GridMate - End to end testing is a critical piece to ensure quality and avoid...
GridMate - End to end testing is a critical piece to ensure quality and avoid...
ThomasParaiso2
 
National Security Agency - NSA mobile device best practices
National Security Agency - NSA mobile device best practicesNational Security Agency - NSA mobile device best practices
National Security Agency - NSA mobile device best practices
Quotidiano Piemontese
 
Introduction to CHERI technology - Cybersecurity
Introduction to CHERI technology - CybersecurityIntroduction to CHERI technology - Cybersecurity
Introduction to CHERI technology - Cybersecurity
mikeeftimakis1
 
PCI PIN Basics Webinar from the Controlcase Team
PCI PIN Basics Webinar from the Controlcase TeamPCI PIN Basics Webinar from the Controlcase Team
PCI PIN Basics Webinar from the Controlcase Team
ControlCase
 
Encryption in Microsoft 365 - ExpertsLive Netherlands 2024
Encryption in Microsoft 365 - ExpertsLive Netherlands 2024Encryption in Microsoft 365 - ExpertsLive Netherlands 2024
Encryption in Microsoft 365 - ExpertsLive Netherlands 2024
Albert Hoitingh
 
LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...
LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...
LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...
DanBrown980551
 
GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...
GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...
GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...
James Anderson
 
Securing your Kubernetes cluster_ a step-by-step guide to success !
Securing your Kubernetes cluster_ a step-by-step guide to success !Securing your Kubernetes cluster_ a step-by-step guide to success !
Securing your Kubernetes cluster_ a step-by-step guide to success !
KatiaHIMEUR1
 
GraphRAG is All You need? LLM & Knowledge Graph
GraphRAG is All You need? LLM & Knowledge GraphGraphRAG is All You need? LLM & Knowledge Graph
GraphRAG is All You need? LLM & Knowledge Graph
Guy Korland
 
The Future of Platform Engineering
The Future of Platform EngineeringThe Future of Platform Engineering
The Future of Platform Engineering
Jemma Hussein Allen
 
Uni Systems Copilot event_05062024_C.Vlachos.pdf
Uni Systems Copilot event_05062024_C.Vlachos.pdfUni Systems Copilot event_05062024_C.Vlachos.pdf
Uni Systems Copilot event_05062024_C.Vlachos.pdf
Uni Systems S.M.S.A.
 
FIDO Alliance Osaka Seminar: Passkeys and the Road Ahead.pdf
FIDO Alliance Osaka Seminar: Passkeys and the Road Ahead.pdfFIDO Alliance Osaka Seminar: Passkeys and the Road Ahead.pdf
FIDO Alliance Osaka Seminar: Passkeys and the Road Ahead.pdf
FIDO Alliance
 
Pushing the limits of ePRTC: 100ns holdover for 100 days
Pushing the limits of ePRTC: 100ns holdover for 100 daysPushing the limits of ePRTC: 100ns holdover for 100 days
Pushing the limits of ePRTC: 100ns holdover for 100 days
Adtran
 
SAP Sapphire 2024 - ASUG301 building better apps with SAP Fiori.pdf
SAP Sapphire 2024 - ASUG301 building better apps with SAP Fiori.pdfSAP Sapphire 2024 - ASUG301 building better apps with SAP Fiori.pdf
SAP Sapphire 2024 - ASUG301 building better apps with SAP Fiori.pdf
Peter Spielvogel
 
Alt. GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using ...
Alt. GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using ...Alt. GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using ...
Alt. GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using ...
James Anderson
 

Recently uploaded (20)

FIDO Alliance Osaka Seminar: The WebAuthn API and Discoverable Credentials.pdf
FIDO Alliance Osaka Seminar: The WebAuthn API and Discoverable Credentials.pdfFIDO Alliance Osaka Seminar: The WebAuthn API and Discoverable Credentials.pdf
FIDO Alliance Osaka Seminar: The WebAuthn API and Discoverable Credentials.pdf
 
DevOps and Testing slides at DASA Connect
DevOps and Testing slides at DASA ConnectDevOps and Testing slides at DASA Connect
DevOps and Testing slides at DASA Connect
 
Free Complete Python - A step towards Data Science
Free Complete Python - A step towards Data ScienceFree Complete Python - A step towards Data Science
Free Complete Python - A step towards Data Science
 
20240607 QFM018 Elixir Reading List May 2024
20240607 QFM018 Elixir Reading List May 202420240607 QFM018 Elixir Reading List May 2024
20240607 QFM018 Elixir Reading List May 2024
 
Elizabeth Buie - Older adults: Are we really designing for our future selves?
Elizabeth Buie - Older adults: Are we really designing for our future selves?Elizabeth Buie - Older adults: Are we really designing for our future selves?
Elizabeth Buie - Older adults: Are we really designing for our future selves?
 
GridMate - End to end testing is a critical piece to ensure quality and avoid...
GridMate - End to end testing is a critical piece to ensure quality and avoid...GridMate - End to end testing is a critical piece to ensure quality and avoid...
GridMate - End to end testing is a critical piece to ensure quality and avoid...
 
National Security Agency - NSA mobile device best practices
National Security Agency - NSA mobile device best practicesNational Security Agency - NSA mobile device best practices
National Security Agency - NSA mobile device best practices
 
Introduction to CHERI technology - Cybersecurity
Introduction to CHERI technology - CybersecurityIntroduction to CHERI technology - Cybersecurity
Introduction to CHERI technology - Cybersecurity
 
PCI PIN Basics Webinar from the Controlcase Team
PCI PIN Basics Webinar from the Controlcase TeamPCI PIN Basics Webinar from the Controlcase Team
PCI PIN Basics Webinar from the Controlcase Team
 
Encryption in Microsoft 365 - ExpertsLive Netherlands 2024
Encryption in Microsoft 365 - ExpertsLive Netherlands 2024Encryption in Microsoft 365 - ExpertsLive Netherlands 2024
Encryption in Microsoft 365 - ExpertsLive Netherlands 2024
 
LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...
LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...
LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...
 
GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...
GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...
GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...
 
Securing your Kubernetes cluster_ a step-by-step guide to success !
Securing your Kubernetes cluster_ a step-by-step guide to success !Securing your Kubernetes cluster_ a step-by-step guide to success !
Securing your Kubernetes cluster_ a step-by-step guide to success !
 
GraphRAG is All You need? LLM & Knowledge Graph
GraphRAG is All You need? LLM & Knowledge GraphGraphRAG is All You need? LLM & Knowledge Graph
GraphRAG is All You need? LLM & Knowledge Graph
 
The Future of Platform Engineering
The Future of Platform EngineeringThe Future of Platform Engineering
The Future of Platform Engineering
 
Uni Systems Copilot event_05062024_C.Vlachos.pdf
Uni Systems Copilot event_05062024_C.Vlachos.pdfUni Systems Copilot event_05062024_C.Vlachos.pdf
Uni Systems Copilot event_05062024_C.Vlachos.pdf
 
FIDO Alliance Osaka Seminar: Passkeys and the Road Ahead.pdf
FIDO Alliance Osaka Seminar: Passkeys and the Road Ahead.pdfFIDO Alliance Osaka Seminar: Passkeys and the Road Ahead.pdf
FIDO Alliance Osaka Seminar: Passkeys and the Road Ahead.pdf
 
Pushing the limits of ePRTC: 100ns holdover for 100 days
Pushing the limits of ePRTC: 100ns holdover for 100 daysPushing the limits of ePRTC: 100ns holdover for 100 days
Pushing the limits of ePRTC: 100ns holdover for 100 days
 
SAP Sapphire 2024 - ASUG301 building better apps with SAP Fiori.pdf
SAP Sapphire 2024 - ASUG301 building better apps with SAP Fiori.pdfSAP Sapphire 2024 - ASUG301 building better apps with SAP Fiori.pdf
SAP Sapphire 2024 - ASUG301 building better apps with SAP Fiori.pdf
 
Alt. GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using ...
Alt. GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using ...Alt. GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using ...
Alt. GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using ...
 

Maxwell W Libbrecht - pomegranate: fast and flexible probabilistic modeling in python

  • 1. fast and flexible probabilistic modeling in python jmschreiber91 @jmschrei @jmschreiber91 Jacob Schreiber PhD student, Paul G. Allen School of Computer Science, University of Washington, Seattle WA Maxwell Libbrecht Postdoc, Department of Genome Sciences, University of Washington, Seattle WA Assistant Professor, Simon Fraser University, Vancouver BC noble.gs.washington.edu/~maxwl/
  • 2. fast and flexible probabilistic modeling in python jmschreiber91 @jmschrei @jmschreiber91 Jacob Schreiber PhD student, Paul G. Allen School of Computer Science, University of Washington, Seattle WA Maxwell Libbrecht Postdoc, Department of Genome Sciences, University of Washington, Seattle WA Assistant Professor, Simon Fraser University, Vancouver BC noble.gs.washington.edu/~maxwl/
  • 3. Follow along with the Jupyter tutorial 3 goo.gl/cbE7rs
  • 4. What is probabilistic modeling? 4goo.gl/cbE7rs Tasks: • Sample from a distribution. • Learn parameters of a distribution from data. • Predict which distribution generated a data example.
  • 5. When should you use pomegranate? 5 complexity of probabilistic modeling task simple complexz }| { • logistic regression • sample from a Gaussian distribution • speed is not important Use scipy or scikit-learn z }| { • distributions not implemented in pomegranate • problem-specific inference algorithms Use custom packages (in R or stand-alone) z }| { Use pomegranate goo.gl/cbE7rs
  • 6. When should you use pomegranate? 6 z }| { Use pomegranate probabilistic modeling tasks orthogonaltasks • Scientific computing: Use numpy, scipy • Non-probabilistic supervised learning (SVMs, neural networks): Use scikit-learn • Visualizing probabilistic models: Use matplotlib • Bayesian statistics: Check out PyMC3 goo.gl/cbE7rs
  • 7. Overview: this talk 7 Overview Major Models/Model Stacks 1. General Mixture Models 2. Hidden Markov Models 3. Bayesian Networks 4. Bayes Classifiers Finale: Train a mixture of HMMs in parallel
  • 8. The API is common to all models 8 All models have these methods! All models composed of distributions (like GMM, HMM...) have these methods too! model.log_probability(X) / model.probability(X) model.sample() model.fit(X, weights, inertia) model.summarize(X, weights) model.from_summaries(inertia) model.predict(X) model.predict_proba(X) model.predict_log_proba(X) Model.from_samples(X, weights) goo.gl/cbE7rs
  • 9. Overview: supported models Six Main Models: 1. Probability Distributions 2. General Mixture Models 3. Markov Chains 4. Hidden Markov Models 5. Bayes Classifiers / Naive Bayes 6. Bayesian Networks 9 Two Helper Models: 1. k-means++/kmeans|| 2. Factor Graphs
  • 10. pomegranate supports many distributions 10 Univariate Distributions 1. UniformDistribution 2. BernoulliDistribution 3. NormalDistribution 4. LogNormalDistribution 5. ExponentialDistribution 6. BetaDistribution 7. GammaDistribution 8. DiscreteDistribution 9. PoissonDistribution Kernel Densities 1. GaussianKernelDensity 2. UniformKernelDensity 3. TriangleKernelDensity Multivariate Distributions 1. IndependentComponentsDistributio n 2. MultivariateGaussianDistribution 3. DirichletDistribution 4. ConditionalProbabilityTable 5. JointProbabilityTable
  • 12. 12 Models can be learned from data X = numpy.random.normal(0, 1, 100) a = NormalDistribution.from_samples(X)
  • 13. Overview: model stacking in pomegranate 13 Distributions Bayes Classifiers Markov Chains General Mixture Models Hidden Markov Models Bayesian Networks D BC MC GMM HMM BN Sub-model Wrapper model
  • 14. Overview: model stacking in pomegranate 14 Distributions Bayes Classifiers Markov Chains General Mixture Models Hidden Markov Models Bayesian Networks D BC MC GMM HMM BN Sub-model Wrapper model
  • 15. Overview: model stacking in pomegranate 15 Distributions Bayes Classifiers Markov Chains General Mixture Models Hidden Markov Models Bayesian Networks D BC MC GMM HMM BN Sub-model Wrapper model
  • 16. Overview: model stacking in pomegranate 16 Distributions Bayes Classifiers Markov Chains General Mixture Models Hidden Markov Models Bayesian Networks D BC MC GMM HMM BN
  • 17. 17 pomegranate can be faster than numpy Fitting a Normal Distribution to 1,000 samples
  • 18. 18 pomegranate can be faster than numpy Fitting Multivariate Gaussian to 10,000,000 samples of 10 dimensions
  • 21. 21 pomegranate uses additive summarization pomegranate reduces data to sufficient statistics for updates and so only has to go datasets once (for all models). Here is an example of the Normal Distribution sufficient statistics
  • 22. 22 pomegranate supports out-of-core learning Batches from a dataset can be reduced to additive summary statistics, enabling exact updates from data that can’t fit in memory.
  • 23. 23 Parallelization exploits additive summaries Extract summaries + + + + New Parameters
  • 24. 24 pomegranate supports semisupervised learning Summary statistics from supervised models can be added to summary statistics from unsupervised models to train a single model on a mixture of labeled and unlabeled data.
  • 25. 25 pomegranate supports semisupervised learning Supervised Accuracy: 0.93 Semisupervised Accuracy: 0.96
  • 26. 26 pomegranate can be faster than scipy
  • 28.
  • 29. 29
  • 30. 30 Example ‘blast’ from Gossip Girl Spotted: Lonely Boy. Can't believe the love of his life has returned. If only she knew who he was. But everyone knows Serena. And everyone is talking. Wonder what Blair Waldorf thinks. Sure, they're BFF's, but we always thought Blair's boyfriend Nate had a thing for Serena.
  • 31. 31 How do we encode these ‘blasts’? Better lock it down with Nate, B. Clock's ticking. +1 Nate -1 Blair
  • 32. 32 How do we encode these ‘blasts’? This just in: S and B committing a crime of fashion. Who doesn't love a five-finger discount. Especially if it's the middle one. -1 Blair -1 Serena
  • 34. 34 Beta distributions can model uncertainty
  • 35. 35 Beta distributions can model uncertainty
  • 36. 36 Beta distributions can model uncertainty
  • 37. Overview: this talk 37 Overview Major Models/Model Stacks 1. General Mixture Models 2. Hidden Markov Models 3. Bayesian Networks 4. Bayes Classifiers Finale: Train a mixture of HMMs in parallel
  • 38. GMMs can model complex distributions 38
  • 39. GMMs can model complex distributions 39 model = GeneralMixtureModel.from_samples(NormalDistribution, 2, X)
  • 40. GMMs can model complex distributions 40
  • 41. An exponential distribution is not right 41 model = ExponentialDistribution.from_samples(X)
  • 42. A mixture of exponentials is better 42 model = GeneralMixtureModel.from_samples(ExponentialDistribution, 2, X)
  • 43. Heterogeneous mixtures natively supported 43 model = GeneralMixtureModel.from_samples([ExponentialDistribution, UniformDistribution], 2, X)
  • 44. GMMs faster than sklearn 44
  • 45. Overview: this talk 45 Overview Major Models/Model Stacks 1. General Mixture Models 2. Hidden Markov Models 3. Bayesian Networks 4. Bayes Classifiers Finale: Train a mixture of HMMs in parallel
  • 46. CG enrichment detection HMM 46 GACTACGACTCGCGCTCGCACGTCGCTCGACATCATCGACA
  • 47. CG enrichment detection HMM GACTACGACTCGCGCTCGCACGTCGCTCGACATCATCGACA 47
  • 48. pomegranate HMMs are feature rich 48
  • 49. GMM-HMM easy to define 49
  • 50. HMMs are faster than hmmlearn 50
  • 51. Overview: this talk 51 Overview Major Models/Model Stacks 1. General Mixture Models 2. Hidden Markov Models 3. Bayesian Networks 4. Bayes Classifiers Finale: Train a mixture of HMMs in parallel
  • 52. Bayesian networks 52 Bayesian networks are powerful inference tools which define a dependency structure between variables. Sprinkler Wet Grass Rain
  • 53. Bayesian networks 53 Sprinkler Wet Grass Rain Two main difficult tasks: (1) Inference given incomplete information (2) Learning the dependency structure from data
  • 54. Bayesian network inference 54 Inference is done using Belief Bropagation on a factor graph. Messages are sent from one side to the other until convergence. Rain Sprinkler Grass Wet R/S/GW Factor Grass Wet Rain Sprinkler Marginals Factors
  • 55. Inference in a medical diagnostic network 55 BRCA 2 BRCA 1 LCT BLOATLE LOA VOM AC PREGLIOC
  • 56. Inference in a medical diagnostic network 56 BRCA 2 BRCA 1 LCT BLOATLE LOA VOM AC PREGLIOC
  • 57. Bayesian network structure learning 57 ??? Three primary ways: ● “Search and score” / Exact ● “Constraint Learning” / PC ● Heuristics
  • 58. Bayesian network structure learning 58 ??? pomegranate supports: ● “Search and score” / Exact ● “Constraint Learning” / PC ● Heuristics
  • 59. Exact structure learning is intractable ??? 59
  • 60. Chow-Liu trees are fast approximations 60
  • 61. pomegranate supports four algorithms 61
  • 62. Constraint graphs merge data + knowledge 62 BRCA 2 BRCA 1 LCT BLOATLE LOA VOM AC PREGLIOC genetic conditions conditions symptoms
  • 63. Constraint graphs merge data + knowledge 63 genetic conditions diseases symptoms
  • 64. Modeling the global stock market 64
  • 65. Constraint graph published in PeerJ CS 65
  • 66. Overview: this talk 66 Overview Major Models/Model Stacks 1. General Mixture Models 2. Hidden Markov Models 3. Bayesian Networks 4. Bayes Classifiers Finale: Train a mixture of HMMs in parallel
  • 67. Bayes classifiers rely on Bayes’ rule 67
  • 68. 68 Let’s build a classifier on this data
  • 70. 70 Priors can model class imbalance
  • 71. 71 The posterior is a good model of the data
  • 72. Naive Bayes assumes independent features 72
  • 73. Naive Bayes produces ellipsoid boundaries 73model = NaiveBayes.from_samples(NormalDistribution, X, y)

  • 74. Naive Bayes can be heterogeneous 74
  • 75. Data can fall under different distributions 75
  • 76. Using appropriate distributions is better 76 model = NaiveBayes.from_samples(NormalDistribution, X_train, y_train) print "Gaussian Naive Bayes: ", (model.predict(X_test) == y_test).mean()
 clf = GaussianNB().fit(X_train, y_train)
 print "sklearn Gaussian Naive Bayes: ", (clf.predict(X_test) == y_test).mean()
 model = NaiveBayes.from_samples([NormalDistribution, LogNormalDistribution, ExponentialDistribution], X_train, y_train
 print "Heterogeneous Naive Bayes: ", (model.predict(X_test) == y_test).mean() Gaussian Naive Bayes: 0.798
 sklearn Gaussian Naive Bayes: 0.798
 Heterogeneous Naive Bayes: 0.844
  • 77. This additional flexibility is just as fast 77
  • 78. Bayes classifiers don’t require independence 78 naive accuracy: 0.929 bayes classifier accuracy: 0.966
  • 79. Gaussian mixture model Bayes classifier 79
  • 80. Creating complex Bayes classifiers is easy 80 gmm_a = GeneralMixtureModel.from_samples(MultivariateGaussianDistribution, 2, X[y == 0])
 gmm_b = GeneralMixtureModel.from_samples(MultivariateGaussianDistribution, 2, X[y == 1])
 model_b = BayesClassifier([gmm_a, gmm_b], weights=numpy.array([1-y.mean(), y.mean()]))

  • 81. Creating complex Bayes classifiers is easy 81 mc_a = MarkovChain.from_samples(X[y == 0]) mc_b = MarkovChain.from_samples(X[y == 1])
 model_b = BayesClassifier([mc_a, mc_b], weights=numpy.array([1-y.mean(), y.mean()])) hmm_a = HiddenMarkovModel.from_samples(X[y == 0]) hmm_b = HiddenMarkovModel.from_samples(X[y == 1])
 model_b = BayesClassifier([hmm_a, hmm_b], weights=numpy.array([1-y.mean(), y.mean()])) bn_a = BayesianNetwork.from_samples(X[y == 0]) bn_b = BayesianNetwork.from_samples(X[y == 1])
 model_b = BayesClassifier([bn_a, bn_b], weights=numpy.array([1-y.mean(), y.mean()]))
 

  • 82. Overview: this talk 82 Overview Major Models/Model Stacks 1. General Mixture Models 2. Hidden Markov Models 3. Bayesian Networks 4. Bayes Classifiers Finale: Train a mixture of HMMs in parallel
  • 83. Training a mixture of HMMs in parallel 83 Creating a mixture of HMMs is just as simple as passing the HMMs into a GMM as if it were any other distribution
  • 84. Training a mixture of HMMs in parallel 84fit(model, X, n_jobs=n)
  • 85. Documentation available at Readthedocs 85
  • 86. Tutorials available on github 86 https://github.com/jmschrei/pomegranate/tree/master/tutorials
  • 88. fast and flexible probabilistic modeling in python jmschreiber91 @jmschrei @jmschreiber91 Jacob Schreiber PhD student, Paul G. Allen School of Computer Science, University of Washington, Seattle WA Maxwell Libbrecht Postdoc, Department of Genome Sciences, University of Washington, Seattle WA Assistant Professor, Simon Fraser University, Vancouver BC noble.gs.washington.edu/~maxwl/