DS
RC

Data Science
Research Center

Complex Models for Big Data

Max Welling
UvA

The Four Paradigms

We have added big data to computer simulation, experiment and theory.

It has not replaced them.

Big Simulation

Computer simulations have
become increasingly complex
(e.g. weather, earthquake models)

The Computational Wall: If a model has hundreds of parameters, how can we:

1) Find the parameter values that match the observations best?
2) Determine if we underfit (model too simple) or overfit (model too complex)?
3) Compare two models?

Parameter Inference
Parameter Update

Parameters

Simulation

Observations
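
The loop on this slide (parameters, simulation, comparison with observations, parameter update) can be sketched as follows. This is only an illustration under stated assumptions: the toy `simulate` function, the squared-error `mismatch` and the accept-if-better update rule are hypothetical stand-ins, not taken from the slides.

```python
import random


def simulate(params):
    """Toy deterministic simulator (assumption): a line evaluated at x = 0..9."""
    slope, intercept = params
    return [slope * x + intercept for x in range(10)]


def mismatch(simulated, observed):
    """Sum of squared differences between simulation output and observations."""
    return sum((s - o) ** 2 for s, o in zip(simulated, observed))


def infer(observed, n_iter=5000, step=0.1, seed=0):
    """Propose a small parameter update and keep it if the simulation fits better."""
    rng = random.Random(seed)
    params = [0.0, 0.0]
    best = mismatch(simulate(params), observed)
    for _ in range(n_iter):
        candidate = [p + rng.gauss(0.0, step) for p in params]
        score = mismatch(simulate(candidate), observed)
        if score < best:
            params, best = candidate, score
    return params, best


observed = simulate([2.0, -1.0])  # pretend observations with known parameters
params, best = infer(observed)
```

With hundreds of parameters and an expensive simulator, a blind search like this one quickly becomes impractical, which is the computational wall the previous slide describes.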

Challenge I

The “posterior probability” cannot be computed in closed form.

Solution: Markov Chain Monte Carlo Sampling (MCMC)
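
A minimal random-walk Metropolis sketch of the MCMC idea. The target here is an assumption (a standard normal log-density, known only up to a constant), and `log_target`, `metropolis` and the step size are illustrative names and values that do not come from the slides.

```python
import math
import random


def log_target(theta):
    """Unnormalised log posterior of a toy model (assumption: standard normal)."""
    return -0.5 * theta * theta


def metropolis(log_p, theta0, n_samples, step=1.0, seed=0):
    """Random-walk Metropolis: only needs log_p up to an additive constant."""
    rng = random.Random(seed)
    theta, lp = theta0, log_p(theta0)
    samples = []
    for _ in range(n_samples):
        proposal = theta + rng.gauss(0.0, step)
        lp_prop = log_p(proposal)
        # Accept with probability min(1, p(proposal) / p(theta)).
        # 1 - random() lies in (0, 1], so the logarithm is always defined.
        if math.log(1.0 - rng.random()) < lp_prop - lp:
            theta, lp = proposal, lp_prop
        samples.append(theta)
    return samples


samples = metropolis(log_target, theta0=0.0, n_samples=20000)
mean = sum(samples) / len(samples)
var = sum((s - mean) ** 2 for s in samples) / len(samples)
```

The sampler never needs the normalising constant of the posterior, only ratios of probabilities, which is why it sidesteps the closed-form problem.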

Challenge II

We cannot run MCMC because the likelihood
is not given in closed form (but rather as a simulation)

Solution: Likelihood Free MCMC (or Approximate Bayesian Computation)

Run many simulations and compare the samples with the observations.

Source: Csilléry, Katalin, et al. "Approximate Bayesian computation (ABC) in practice." Trends in Ecology & Evolution 25.7 (2010): 410-418.
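
The "run many simulations and compare with observations" idea can be sketched with rejection ABC, the simplest likelihood-free variant (plain rejection sampling, not the MCMC version named on the slide). Everything specific below is an assumption made for illustration: the Gaussian `simulate` stand-in, the sample mean as summary statistic, the uniform prior on [-5, 5] and the tolerance `eps`.

```python
import random


def simulate(theta, n, rng):
    """Stand-in simulator (assumption): n noisy draws centred on theta."""
    return [rng.gauss(theta, 1.0) for _ in range(n)]


def summary(data):
    """Summary statistic used to compare simulated and observed data."""
    return sum(data) / len(data)


def abc_rejection(observed, n_accept, eps, seed=0):
    """Keep prior draws whose simulated summary lands within eps of the observed one."""
    rng = random.Random(seed)
    s_obs = summary(observed)
    accepted = []
    while len(accepted) < n_accept:
        theta = rng.uniform(-5.0, 5.0)              # draw from the prior
        fake = simulate(theta, len(observed), rng)  # run the simulator
        if abs(summary(fake) - s_obs) < eps:        # compare with observations
            accepted.append(theta)
    return accepted


rng = random.Random(42)
observed = simulate(2.0, 50, rng)  # pretend these are the real observations
posterior = abc_rejection(observed, n_accept=300, eps=0.1)
posterior_mean = sum(posterior) / len(posterior)
```

No likelihood is ever evaluated: the simulator is only run forward. The price is that most simulations are rejected, so the number of simulator runs is far larger than the number of posterior samples kept.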

Challenge III

We need thousands of simulations to infer the posterior
(infeasible if every simulation takes a day or so)
Ted Meeds

Surrogate of log(P):
If surrogate ≈ log(P) with high confidence, use the surrogate to draw a sample.
If not, simulate until the confidence is high enough.

Solution: Learn log(P) using Gaussian Process Surrogate functions (GPS)
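
The surrogate rule above can be sketched with a very small Gaussian process regressor. This is an illustration of the decision rule only, not the authors' algorithm: the RBF kernel, its fixed length-scale, the confidence threshold `max_std`, and the quadratic `expensive_log_p` standing in for a slow simulation are all assumptions.

```python
import numpy as np


def rbf(a, b, length=1.0):
    """Squared-exponential kernel between two 1-D arrays of inputs."""
    d = a[:, None] - b[None, :]
    return np.exp(-0.5 * (d / length) ** 2)


class Surrogate:
    """Tiny GP regressor for log P(theta); hyperparameters are fixed assumptions."""

    def __init__(self, noise=1e-6):
        self.x = np.empty(0)
        self.y = np.empty(0)
        self.noise = noise

    def add(self, theta, log_p):
        """Store one (parameter, simulated log P) pair as training data."""
        self.x = np.append(self.x, theta)
        self.y = np.append(self.y, log_p)

    def predict(self, theta):
        """Return the GP posterior mean and standard deviation at theta."""
        if self.x.size == 0:
            return 0.0, 1.0  # prior: zero mean, unit variance
        t = np.array([theta])
        K = rbf(self.x, self.x) + self.noise * np.eye(self.x.size)
        k = rbf(self.x, t)[:, 0]
        mean = k @ np.linalg.solve(K, self.y)
        var = 1.0 - k @ np.linalg.solve(K, k)
        return float(mean), float(np.sqrt(max(var, 0.0)))


def expensive_log_p(theta):
    """Stand-in for a day-long simulation (assumption: a simple quadratic)."""
    return -0.5 * (theta - 1.0) ** 2


def log_p_with_surrogate(theta, gp, max_std=0.05):
    """Use the surrogate when it is confident; otherwise simulate and learn."""
    mean, std = gp.predict(theta)
    if std <= max_std:
        return mean
    value = expensive_log_p(theta)
    gp.add(theta, value)
    return value


gp = Surrogate()
rng = np.random.default_rng(0)
thetas = rng.uniform(-2.0, 4.0, size=500)
values = [log_p_with_surrogate(t, gp) for t in thetas]
n_simulations = gp.x.size  # how many of the 500 queries needed a real simulation
```

Only queries that fall in poorly covered regions trigger a simulation, so `n_simulations` ends up much smaller than the number of queries. In a real setting the kernel hyperparameters would have to be learned rather than fixed.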
Two Kinds of Complex Model

Machine
Learning

Computational
Science
Model Capacity

“Let the model speak”

“Let the data speak”

3x Exponential Growth
in Machine Learning

Computer Power

Data Volume

Model Capacity
Growth in Model Capacity
Model Capacity over Time

1943: First NN (N ≈ 10)
1988: NetTalk (N ≈ 20K)
2009: Hinton’s Deep Belief Net (N ≈ 10M)
2013: Google/Y! (N ≈ 10B)
2020-2050: Human Brain (N ≈ 100T)?
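
As a rough check on the growth rate, the arithmetic below takes the approximate parameter counts quoted on this slide and computes the average doubling time between 1943 and 2013, then extrapolates at that same rate to the brain-scale figure. Assuming a constant rate is a simplification made here, not a claim from the slides.

```python
import math

# Approximate parameter counts quoted on the slide.
capacity = {1943: 10, 1988: 20_000, 2009: 10_000_000, 2013: 10_000_000_000}

first_year, last_year = 1943, 2013
doublings = math.log2(capacity[last_year] / capacity[first_year])
doubling_time = (last_year - first_year) / doublings  # years per doubling

# Extrapolate at the same average rate to the slide's brain-scale figure (N ~ 100T).
brain = 100e12
years_needed = math.log2(brain / capacity[last_year]) * doubling_time
brain_year = last_year + years_needed

print(f"doubling time: {doubling_time:.2f} years, brain scale around {brain_year:.0f}")
# prints "doubling time: 2.34 years, brain scale around 2044"
```

Under that assumption the extrapolated date falls inside the 2020-2050 window given on the slide.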
Deep Learning: Neural Nets Strike Back (again)
1943: NN invented (McCulloch & Pitts)
1970: NN discredited (Minsky & Papert)
1986: Backpropagation (Rumelhart, Hinton & Williams)
1995: SVM (Vapnik)
2009: Deep Learning (Hinton)

Figure labels: 2 layers, 3 layers, many layers

-Model Size: 10B parameters
-Used by: Yahoo!, Google, Microsoft, Baidu, IBM, Scyfer

Paradox
Why does model capacity grow exponentially?
Raw Information: O(N)

Predictive Information: log(N)

Noise
?

Big Challenges from Industry
Scyfer connects industry to academia:
-inspire academia with relevant problems
-deliver ML products to industry
-host student projects
-provide employment for our students
= VALORISATION

What industry needs.

What academics are
interested in.

Intelligent Autonomous Systems Lab - UvA

People:
Shimon Whiteson (Reinforcement Learning & Planning)
Leo Dorst (Geometric Algebra)
Joris Mooij (Causality)
Ben Kröse (Ambient Robotics)
Dariu Gavrila (Human-aware Intelligent Systems)
Max Welling (Machine Learning)

Diagram labels: Data; Store and process; Analyze and model; Understand and decide

Fields: Visual Analytics, Business Analytics, Decision Theory, Reasoning, Knowledge Representation, Distributed Processing, Large Scale Databases, Software Eng., System / Network Eng., Multimedia Retrieval, Modeling and Simulation, Information Retrieval, Machine Learning

Our Future Need

(Same diagram of people and fields as on the Intelligent Autonomous Systems Lab slide.)

Questions?

Freshworks Rethinks NoSQL for Rapid Scaling & Cost-EfficiencyFreshworks Rethinks NoSQL for Rapid Scaling & Cost-Efficiency
Freshworks Rethinks NoSQL for Rapid Scaling & Cost-Efficiency
 
inQuba Webinar Mastering Customer Journey Management with Dr Graham Hill
inQuba Webinar Mastering Customer Journey Management with Dr Graham HillinQuba Webinar Mastering Customer Journey Management with Dr Graham Hill
inQuba Webinar Mastering Customer Journey Management with Dr Graham Hill
 
zkStudyClub - LatticeFold: A Lattice-based Folding Scheme and its Application...
zkStudyClub - LatticeFold: A Lattice-based Folding Scheme and its Application...zkStudyClub - LatticeFold: A Lattice-based Folding Scheme and its Application...
zkStudyClub - LatticeFold: A Lattice-based Folding Scheme and its Application...
 
GNSS spoofing via SDR (Criptored Talks 2024)
GNSS spoofing via SDR (Criptored Talks 2024)GNSS spoofing via SDR (Criptored Talks 2024)
GNSS spoofing via SDR (Criptored Talks 2024)
 
GraphRAG for LifeSciences Hands-On with the Clinical Knowledge Graph
GraphRAG for LifeSciences Hands-On with the Clinical Knowledge GraphGraphRAG for LifeSciences Hands-On with the Clinical Knowledge Graph
GraphRAG for LifeSciences Hands-On with the Clinical Knowledge Graph
 
Main news related to the CCS TSI 2023 (2023/1695)
Main news related to the CCS TSI 2023 (2023/1695)Main news related to the CCS TSI 2023 (2023/1695)
Main news related to the CCS TSI 2023 (2023/1695)
 
Mutation Testing for Task-Oriented Chatbots
Mutation Testing for Task-Oriented ChatbotsMutation Testing for Task-Oriented Chatbots
Mutation Testing for Task-Oriented Chatbots
 

Complex Models for Big Data

  • 1. DS RC Data Science Research Center Complex Models for Big Data Max Welling UvA
  • 2. The Four Paradigms. We have added big data to computer simulation, experiment and theory, not replaced them…
  • 3. Big Simulation. Computer simulations have become increasingly complex (e.g. weather, earthquake models). The Computational Wall: if a model has hundreds of parameters, how can we: 1) find the parameter values that best match the observations? 2) determine whether we underfit (model too simple) or overfit (model too complex)? 3) compare two models?
  • 5. Challenge I. The “posterior probability” cannot be computed in closed form. Solution: Markov Chain Monte Carlo (MCMC) sampling.
  • 6. Challenge II. We cannot run MCMC because the likelihood is not given in closed form (but rather as a simulation). Solution: Likelihood-Free MCMC (or Approximate Bayesian Computation). Run many simulations and compare samples with observations. Source: Csillery, Katalin, et al. "Approximate Bayesian computation (ABC) in practice." Trends in Ecology & Evolution 25.7 (2010): 410-418.
  • 7. Challenge III (Ted Meeds). We need thousands of simulations to infer the posterior (infeasible if every simulation takes a day or so). Solution: learn log(P) using Gaussian Process Surrogate functions (GPS). Surrogate of log(P): if surrogate ~ log(P) with high confidence, then use the surrogate to draw a sample. If not: simulate until there is enough confidence.
  • 8. Two Kinds of Complex Model. Machine Learning; Computational Science. Model Capacity. “Let the model speak”; “Let the data speak”.
  • 9. 3x Exponential Growth in Machine Learning: Computer Power, Data Volume, Model Capacity.
  • 10. Growth in Model Capacity (Model Capacity over Time). 1943: First NN (N = +/- 10). 1988: NetTalk (N = +/- 20K). 2009: Hinton’s Deep Belief Net (N = +/- 10M). 2013: Google/Y! (N = +/- 10B). 2020-2050: Human Brain (N = +/- 100T)?
  • 11. Deep Learning: Neural Nets Strike Back (again). 1943: NN invented (McCulloch & Pitts). 1970: NN discredited (Minsky & Papert). 1986: Backpropagation (Rumelhart, Hinton & Williams). 1995: SVM (Vapnik). 2009: Deep Learning (Hinton). Depth labels on the timeline: 2 layers, 3 layers, many layers. Model size: 10B parameters. Used by: Yahoo!, Google, Microsoft, Baidu, IBM, Scyfer.
  • 12. Paradox. Why does model capacity grow exponentially? Raw Information: O(N). Predictive Information: log(N). Noise?
  • 13. Big Challenges from Industry. Scyfer connects industry to academia: inspire academia with relevant problems; deliver ML products to industry; host student projects; provide employment for our students (= VALORISATION). What industry needs. What academics are interested in.
  • 14. Intelligent Autonomous Systems Lab - UvA. [Diagram] Layers: Understand and decide; Analyze and model; Store and process. Topics: Visual Analytics, Business Analytics, Decision Theory, Data Reasoning, Knowledge representation, Multimedia Retrieval, Modeling and simulation, Information Retrieval, Machine Learning, Distributed Processing, Large Scale Databases, Software Eng., System / Network Eng. People: Shimon Whiteson (Reinforcement Learning & Planning), Leo Dorst (Geometric Algebra), Joris Mooij (Causality), Ben Kröse (Ambient Robotics), Dariu Gavrilla (Human-aware Intelligent Systems), Max Welling (Machine Learning).
  • 15. Our Future Need. [Diagram] Layers: Understand and decide; Analyze and model; Store and process. Topics: Visual Analytics, Business Analytics, Decision Theory, Data Reasoning, Knowledge representation, Multimedia Retrieval, Modeling and simulation, Information Retrieval, Machine Learning, Distributed Processing, Large Scale Databases, Software Eng., System / Network Eng. People: Shimon Whiteson (Reinforcement Learning & Planning), Leo Dorst (Geometric Algebra), Joris Mooij (Causality), Ben Kröse (Ambient Robotics), Dariu Gavrilla (Human-aware Intelligent Systems), Max Welling (Machine Learning).
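
The likelihood-free recipe on slide 6 ("run many simulations and compare samples with observations") can be illustrated with a minimal sketch of plain ABC rejection sampling. This is not the likelihood-free MCMC variant or the Gaussian process surrogate from slide 7, and nothing here comes from the talk itself: the toy Gaussian simulator, the uniform prior, the sample-mean summary statistic and the tolerance `epsilon` are all assumptions chosen for illustration.

```python
import random
import statistics


def simulate(theta, n, rng):
    """Hypothetical simulator: n noisy draws centred on the parameter theta."""
    return [rng.gauss(theta, 1.0) for _ in range(n)]


def abc_rejection(observed, prior_sample, n_accept, epsilon, rng):
    """Collect prior draws whose simulated data resemble the observations.

    A draw is kept when the simulated sample mean lies within epsilon
    of the observed sample mean (the assumed summary statistic).
    """
    obs_summary = statistics.fmean(observed)
    accepted = []
    while len(accepted) < n_accept:
        theta = prior_sample(rng)                  # propose from the prior
        sim = simulate(theta, len(observed), rng)  # run the simulator
        if abs(statistics.fmean(sim) - obs_summary) <= epsilon:
            accepted.append(theta)                 # close enough: keep it
    return accepted


rng = random.Random(0)

# Synthetic "observations" generated with a known parameter (assumed: 2.0).
observed = simulate(2.0, 50, rng)

# Assumed prior: uniform on [-5, 5].
posterior = abc_rejection(
    observed,
    prior_sample=lambda r: r.uniform(-5.0, 5.0),
    n_accept=200,
    epsilon=0.1,
    rng=rng,
)
posterior_mean = statistics.fmean(posterior)
```

The accepted values approximate the posterior without ever evaluating a likelihood. Most proposals are rejected, so the number of simulator runs grows quickly as `epsilon` shrinks, which is the cost that slide 7 raises when each simulation takes a day or so.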