Neural Networks with Python
Tom Dierickx
Data Services Team Knowledge-Sharing
December 7, 2018
42
Today’s Agenda
● What is “learning”?
● What is “machine learning”?
○ demo: decision tree (using scikit-learn and XGBoost inside PowerBI)
○ demo: logistic regression (using statsmodels w/ AWS SageMaker)
○ demo: neural network (using scikit-learn w/ Azure Notebooks)
● What is “deep learning”?
○ demo: neural network (using TensorFlow via Keras w/Google Colab)
● Where you can do it online today … for free!
● Resources and links
What is “Learning”?
● We typically think of learning in terms of the act, or subjective
experience, of becoming aware of some new fact (which usually has
a “feel” to it) or chunk of information (which generally has a larger “feel” to it)
● Note how this is a very personal, human-centered interpretation,
as it’s implicitly defined in terms of our own “consciousness”
● In truth, our brains give themselves a juicy hit of “dopamine” as a
reward for each novel fact/information/news acquired … and it’s
well-known that “emotional-to-us” things tend to be stored more
strongly, and for longer, in our long-term memories
● … BUT, a more objective viewpoint might be to define “learning” in
terms of acquiring new skills that, hopefully, lead to increased
accuracy, efficiency, and speed of recall for us in some subject
area … after all, that’s why we humans evolved to learn, right?
● So, let’s define learning as “becoming more proficient” in
something (e.g. spotting mistakes quicker, making fewer errors, connecting dots, etc.)
What is “Machine Learning”?
● Continuous improvement in accurately “predicting” output values
(called “labels”) from input values (called “features”)
● More specifically, automatically improving “on its own” against some
pre-defined metric (typically, some “cost function”) that compares
predicted values versus actual values, across all observations [e.g. want
minimal error rate % (in classification problems) or minimal RMSE (in regression problems)]
● Various ML algorithms are used to “train” (i.e. learn rules) against some
“training data” until accuracy is judged “good enough”
● The learned rules (usually in the form of weighted coefficients; sometimes
tens of thousands, or even more, of them!) are then applied against some unseen
“test data” (usually just some data, like 20%, that was held out from being trained on) to
check that the accuracy holds up on “new” data
● Want minimal under-fitting (aka, “high bias”)
and minimal over-fitting (aka, “high variance”)
● Under-fitting can be improved with better
features or a more flexible algorithm
● Over-fitting can be improved with more data, simpler models,
and/or adding regularization parameters
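To make the two metrics named above concrete, here is a minimal sketch in plain Python (standard library only). The helper names `error_rate`, `rmse`, and `holdout_split`, and the toy data, are illustrative assumptions and are not taken from the talk's demos.

```python
import math
import random

def error_rate(actual, predicted):
    """Fraction of observations whose predicted label differs from the actual label."""
    wrong = sum(1 for a, p in zip(actual, predicted) if a != p)
    return wrong / len(actual)

def rmse(actual, predicted):
    """Root mean squared error between actual and predicted numeric values."""
    squared = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared) / len(squared))

def holdout_split(rows, test_fraction=0.2, seed=0):
    """Shuffle a copy of the rows and hold out a fraction as unseen test data."""
    shuffled = rows[:]
    random.Random(seed).shuffle(shuffled)
    n_test = int(round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]

# Toy usage: 10 observations, 20% held out for testing.
train_rows, test_rows = holdout_split(list(range(10)))
```

A model is fitted on `train_rows` only; the chosen metric is then computed on `test_rows` to see whether accuracy holds up on data the model has never seen.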
What is “Machine Learning”? (Cont.)
Q: Why is ML such a “big deal” and so hyped today?
A: b/c it’s transforming our world as we speak! Instead of
a programmer having to (somehow!) know all the conditional
logic rules needed upfront to produce the desired output,
internal rules that “just work” are “magically” inferred from example data
Note: “learned rules” may not always be directly accessible or even interpretable
What is “Machine Learning”? (Cont.)
● Some popular tools of the trade (2018 edition)
○ Python (esp. Anaconda distro) is soaring (R is fading)
○ SQL (the “language”, in general) still essential
○ scikit-learn and TensorFlow (w/ Keras wrapper) very popular ML libraries
○ Apache Spark (big data backend) remains preferred over classic Hadoop
Example: Decision Tree
● Attempt to find “natural splits” in data (usually by minimizing “entropy”
and, thus, maximizing “information gain”; i.e., find the most homogeneous branches)
● Tend to overfit (thus, under-perform on unseen data); but can improve with ensemble methods:
○ Bagged trees: (i.e. multiple trees, each built on a random sample of the data; use aggregate “consensus”)
■ “Random forest” most common algorithm
○ Boosted trees: (i.e. build trees sequentially, each one “learning” from the errors of the prior trees; “snowball”)
■ “XGBoost” most popular algorithm
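The entropy and information-gain idea behind a single split can be sketched in a few lines of plain Python. The function names and the four-row toy labels below are assumptions made for illustration; real libraries such as scikit-learn evaluate many candidate splits this way (or with a similar impurity measure) and keep the best one.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total) for n in Counter(labels).values())

def information_gain(parent, left, right):
    """Entropy of the parent minus the size-weighted entropy of the two branches."""
    total = len(parent)
    weighted = (len(left) / total) * entropy(left) + (len(right) / total) * entropy(right)
    return entropy(parent) - weighted

parent = ["yes", "yes", "no", "no"]
# A split that yields fully homogeneous branches recovers all the entropy.
perfect = information_gain(parent, ["yes", "yes"], ["no", "no"])
# A split whose branches are as mixed as the parent gains nothing.
useless = information_gain(parent, ["yes", "no"], ["yes", "no"])
```

Here `perfect` works out to 1 bit and `useless` to 0 bits, which is why a tree builder would prefer the first split.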
Live Demo: Decision Tree
● PowerBI Desktop to:
○ fetch dataset on 80k+ UFO sightings from the interwebs via URL
○ serve as an interactive, GUI reporting container to slice & dice things
● Python scripts inside PowerBI to:
○ download historical lunar phase data from external website
○ combine everything using Pandas
○ predict most likely times for UFO sightings using:
■ scikit-learn module to build a “simple” decision tree
■ XGBoost module to gain even better results.
Example: Logistic Regression
● Good for predicting binary output (e.g. 1/0, yes/no, true/false, win/loss, pass/fail, in/out)
● Models the “probability” [0 ≤ p ≤ 1] of a “Y/N” response; good for binary classification
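As a rough sketch of how a logistic model turns an input into a probability, the snippet below applies the logistic (sigmoid) function to a linear score. The function `win_probability` and its `intercept` and `slope` values are made-up placeholders, not coefficients fitted in the talk's demo; a library such as statsmodels would estimate them from data.

```python
import math

def sigmoid(z):
    """Logistic function: squashes any real number into the interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def win_probability(turnover_margin, intercept=0.0, slope=0.8):
    """P(win) for a given turnover margin; the coefficients are hypothetical."""
    return sigmoid(intercept + slope * turnover_margin)

for margin in (-2, 0, 2):
    print(margin, round(win_probability(margin), 3))
```

With a positive slope, a better turnover margin pushes the predicted probability of a win above 0.5, and a worse one pushes it below.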
Live Demo: Logistic Regression
● AWS SageMaker platform to:
○ Create jupyter notebook in the cloud
○ Look at NFL turnover +/- margin vs win/loss for week 13 games
○ Use statsmodels library to perform logistic regression
○ Use seaborn plotting library for creating nice visuals
Example: Neural Network
● A neural network is similar to logistic regression in some ways, but:
○ Has a hidden layer in the middle, with multiple nodes, rather than mapping inputs straight to a single output
○ These nodes (called “neurons”) in the middle each generate their own 0 ≤ value ≤ 1 (when the “sigmoid” function is used)
○ Activation functions other than the “sigmoid” function can also be used to introduce non-linearity
○ Output layer can support multiple predicted output values (though not shown below)
● Technical notes:
○ Weights are learned by the gradient descent optimization algorithm (i.e. stepping against the partial derivatives of the cost function)
○ The partial derivatives are computed by a process called backpropagation, which applies the chain rule backwards from the output layer
○ Can take many iterations for the cost function to be minimized and converge
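The interplay of the two pieces named in the technical notes can be sketched in plain Python: backpropagation supplies the partial derivatives, and gradient descent uses them to nudge each weight. Everything below is an illustrative assumption rather than the talk's demo code: the function names, the squared-error cost, the per-example (stochastic) updates, and the OR-gate toy data standing in for a real dataset.

```python
import math
import random

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def forward(x, w_hidden, b_hidden, w_out, b_out):
    """Return (hidden activations, output) for one input vector."""
    hidden = [sigmoid(sum(w * xi for w, xi in zip(ws, x)) + b)
              for ws, b in zip(w_hidden, b_hidden)]
    output = sigmoid(sum(w * h for w, h in zip(w_out, hidden)) + b_out)
    return hidden, output

def train(data, n_hidden=3, lr=0.5, epochs=2000, seed=0):
    """Train a one-hidden-layer network; returns the weights and the loss per epoch."""
    rng = random.Random(seed)
    n_in = len(data[0][0])
    w_hidden = [[rng.uniform(-1, 1) for _ in range(n_in)] for _ in range(n_hidden)]
    b_hidden = [0.0] * n_hidden
    w_out = [rng.uniform(-1, 1) for _ in range(n_hidden)]
    b_out = 0.0
    losses = []
    for _ in range(epochs):
        total = 0.0
        for x, y in data:
            hidden, out = forward(x, w_hidden, b_hidden, w_out, b_out)
            total += (out - y) ** 2
            # Backpropagation: chain rule from the cost back to each layer's input.
            d_out = 2 * (out - y) * out * (1 - out)
            d_hidden = [d_out * w_out[j] * hidden[j] * (1 - hidden[j])
                        for j in range(n_hidden)]
            # Gradient descent: move every weight a small step against its gradient.
            for j in range(n_hidden):
                w_out[j] -= lr * d_out * hidden[j]
                b_hidden[j] -= lr * d_hidden[j]
                for i in range(n_in):
                    w_hidden[j][i] -= lr * d_hidden[j] * x[i]
            b_out -= lr * d_out
        losses.append(total / len(data))
    return (w_hidden, b_hidden, w_out, b_out), losses

# Toy data: the logical OR of two inputs.
or_gate = [([0, 0], 0), ([0, 1], 1), ([1, 0], 1), ([1, 1], 1)]
params, losses = train(or_gate)
predictions = [forward(x, *params)[1] for x, _ in or_gate]
```

Comparing `losses[0]` with `losses[-1]` shows the cost shrinking over the iterations, and calling `forward` on one input reproduces the by-hand arithmetic of a single prediction.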
Live Demo: Neural Network
● Microsoft Azure Notebooks platform to:
○ Create jupyter notebook in the cloud
○ Look at tic-tac-toe board configurations (https://datahub.io/machine-learning/tic-tac-toe-endgame)
○ Use scikit-learn library to train a “simple” neural network to “learn” what combination
of moves equates to winning or losing for X
○ Validate a prediction by hand to show how the math works
What is “Deep Learning”?
● There really is no exact definition, but the term commonly refers to a subset
of machine learning methods that focus mostly on very “deep” neural networks
(i.e. many hidden layers and nodes)
● Some evolving variants, like recurrent neural networks (RNN) for sequential
data (think: speech recognition) or convolutional neural networks (CNN) for image
data (think: image recognition), perform clever, custom calculations and connect the
hidden layers together in slightly different ways to perform even better (i.e.
faster, more accurately, with fewer calculations needed) than a “classic” feed-forward
network (FFN)
What is “Deep Learning”? (Cont.)
[Figure: GoogLeNet (Inception v1), winner of ILSVRC 2014 (image classification)]
Current Landscape
● This is the best picture I have seen depicting how the various pieces of the
data and analytics landscape relate to each other (... and my “problem” is I find every
piece so interesting in its own right that I feel like I never know enough about any of them!!)
Source: https://www.machinecurve.com/index.php/2017/09/30/the-differences-between-artificial-intelligence-machine-learning-more/
Example: Deep Neural Network
● Neural network having multiple hidden layers and more nodes, so can “learn”
more complex patterns … but requires much more data to do so, of course
● Newer architectures even employ different types of hidden layer nodes
● More complicated networks even stitch together multiple networks into one
larger network in a pipeline fashion
● It’s common to plug in pre-trained networks, especially in audio and/or vision
applications, so you don’t have to train from scratch; this is called transfer learning
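One way to see why deeper networks need more data is to count what has to be learned. The sketch below, in plain Python, counts the weights and biases of a fully connected network and runs a forward pass through any number of layers. The layer widths, the choice of ReLU for the hidden layers, and the function names are assumptions for illustration; they are not the architecture used in the talk's demo.

```python
import math

def count_parameters(layer_sizes):
    """Weights plus biases in a fully connected network with the given layer widths."""
    return sum(n_in * n_out + n_out for n_in, n_out in zip(layer_sizes, layer_sizes[1:]))

def relu(z):
    """Rectified linear unit, a common non-sigmoid activation."""
    return max(0.0, z)

def forward_deep(x, layers):
    """Pass x through each (weights, biases) layer: ReLU on hidden layers, sigmoid on the last."""
    activations = x
    for index, (weights, biases) in enumerate(layers):
        z = [sum(w * a for w, a in zip(row, activations)) + b
             for row, b in zip(weights, biases)]
        if index == len(layers) - 1:
            activations = [1.0 / (1.0 + math.exp(-v)) for v in z]
        else:
            activations = [relu(v) for v in z]
    return activations

# Hypothetical widths: 9 inputs and 1 output, with one versus three hidden layers.
shallow = count_parameters([9, 16, 1])
deep = count_parameters([9, 64, 64, 64, 1])
```

In this example the shallow network has 177 learnable values and the deeper one has 9,025, so the deeper model has far more to fit from the same observations.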
Live Demo: Deep Neural Network
● Google Colab platform to:
○ Create jupyter-based notebook in the cloud
○ Look at 60 years’ worth of daily weather data for Rockford, IL
(generated from https://www.ncdc.noaa.gov)
○ Upload raw file to Google Drive
○ Use Keras wrapper library on top of TensorFlow library to
train a “deep” neural network to “learn” for us whether it will rain or snow on the
upcoming Saturday, given that today’s weather is X
Where you can do it online - for free!
There’s a bit of a “space race” to take over the world through AI and ML. With
cloud-based computing now a ubiquitous commodity resource, typically
metered by the hour, there are lots of 100% free (for now, anyway) places to learn and
practice ML (generally) and neural networks (specifically) beyond just your own laptop
● Google Colab
(Python; 20GB RAM, free GPU/TPU hardware)
● Kaggle
(Python or R; 17GB RAM; acquired by Google in 2017; compete for prizes!)
● Azure Notebooks
(Python or R or F#; 4GB RAM)
● Amazon SageMaker
(Python, R, Scala; 4GB RAM, access to AWS ecosystem, free tier = 250 hours limit)
● IBM Watson Studio
(Python, R, Scala; 4GB RAM, feature-rich options)
● Many more out there popping up every day...
Resources and links
● “The differences between AI, machine learning & more”
https://www.machinecurve.com/index.php/2017/09/30/the-differences-between-artificial-intelligence-machine-learning-more/
● “Introduction to Data Science”
https://www.saedsayad.com/data_mining_map.htm
● “Definitions of common machine learning terms”
https://ml-cheatsheet.readthedocs.io/en/latest/glossary.html
● “Decision Trees and Boosting, XGBoost | Two Minute Papers #55”
https://www.youtube.com/watch?v=0Xc9LIb_HTw
● “Logistic Regression - Fun and Easy Machine Learning”
https://www.youtube.com/watch?v=7qJ7GksOXoA
● “3 Blue, 1 Brown: Neural networks”
https://www.youtube.com/playlist?list=PLZHQObOWTQDNU6R1_67000Dx_ZCJB-3pi
● “An introduction to Machine Learning (and a little bit of Deep Learning)”
https://www.slideshare.net/ThomasDaSilvaPaula/an-introduction-to-machine-learning-and-a-little-bit-of-deep-learning
● “Modern Convolutional Neural Network techniques for image segmentation”
https://www.slideshare.net/GioeleCiaparrone/modern-convolutional-neural-network-techniques-for-image-segmentation
● “Neural Networks and Deep Learning” free online course
https://www.coursera.org/learn/neural-networks-deep-learning
● “NUFORC geolocated and time standardized UFO reports”
https://github.com/planetsig/ufo-reports

More Related Content

What's hot

Getting started with TensorFlow
Getting started with TensorFlowGetting started with TensorFlow
Getting started with TensorFlow
ElifTech
 
Metta Innovations - Introdução ao Deep Learning aplicado a vídeo analytics
Metta Innovations - Introdução ao Deep Learning aplicado a vídeo analyticsMetta Innovations - Introdução ao Deep Learning aplicado a vídeo analytics
Metta Innovations - Introdução ao Deep Learning aplicado a vídeo analytics
Eduardo Gaspar
 
How to apply deep learning to 3 d objects
How to apply deep learning to 3 d objectsHow to apply deep learning to 3 d objects
How to apply deep learning to 3 d objects
Ogushi Masaya
 
Approximate "Now" is Better Than Accurate "Later"
Approximate "Now" is Better Than Accurate "Later"Approximate "Now" is Better Than Accurate "Later"
Approximate "Now" is Better Than Accurate "Later"
NUS-ISS
 
CodeStock - Exploring .NET memory management - a trip down memory lane
CodeStock - Exploring .NET memory management - a trip down memory laneCodeStock - Exploring .NET memory management - a trip down memory lane
CodeStock - Exploring .NET memory management - a trip down memory lane
Maarten Balliauw
 
An Introduction to Supervised Machine Learning and Pattern Classification: Th...
An Introduction to Supervised Machine Learning and Pattern Classification: Th...An Introduction to Supervised Machine Learning and Pattern Classification: Th...
An Introduction to Supervised Machine Learning and Pattern Classification: Th...
Sebastian Raschka
 
Scalable high-dimensional indexing with Hadoop
Scalable high-dimensional indexing with HadoopScalable high-dimensional indexing with Hadoop
Scalable high-dimensional indexing with Hadoop
Denis Shestakov
 
How to win data science competitions with Deep Learning
How to win data science competitions with Deep LearningHow to win data science competitions with Deep Learning
How to win data science competitions with Deep Learning
Sri Ambati
 
H2O Open Source Deep Learning, Arno Candel 03-20-14
H2O Open Source Deep Learning, Arno Candel 03-20-14H2O Open Source Deep Learning, Arno Candel 03-20-14
H2O Open Source Deep Learning, Arno Candel 03-20-14
Sri Ambati
 
H2O Distributed Deep Learning by Arno Candel 071614
H2O Distributed Deep Learning by Arno Candel 071614H2O Distributed Deep Learning by Arno Candel 071614
H2O Distributed Deep Learning by Arno Candel 071614
Sri Ambati
 
A Scalable Implementation of Deep Learning on Spark (Alexander Ulanov)
A Scalable Implementation of Deep Learning on Spark (Alexander Ulanov)A Scalable Implementation of Deep Learning on Spark (Alexander Ulanov)
A Scalable Implementation of Deep Learning on Spark (Alexander Ulanov)
Alexander Ulanov
 
Training Neural Networks
Training Neural NetworksTraining Neural Networks
Training Neural Networks
Databricks
 
Intro to data oriented design
Intro to data oriented designIntro to data oriented design
Intro to data oriented design
Stoyan Nikolov
 
MLconf - Distributed Deep Learning for Classification and Regression Problems...
MLconf - Distributed Deep Learning for Classification and Regression Problems...MLconf - Distributed Deep Learning for Classification and Regression Problems...
MLconf - Distributed Deep Learning for Classification and Regression Problems...
Sri Ambati
 
Distributed Deep Learning on AWS with Apache MXNet
Distributed Deep Learning on AWS with Apache MXNetDistributed Deep Learning on AWS with Apache MXNet
Distributed Deep Learning on AWS with Apache MXNet
Amazon Web Services
 
Terabyte-scale image similarity search: experience and best practice
Terabyte-scale image similarity search: experience and best practiceTerabyte-scale image similarity search: experience and best practice
Terabyte-scale image similarity search: experience and best practice
Denis Shestakov
 
Deep Learning for AI (2)
Deep Learning for AI (2)Deep Learning for AI (2)
Deep Learning for AI (2)
Dongheon Lee
 
AWS re:Invent 2016: Using MXNet for Recommendation Modeling at Scale (MAC306)
AWS re:Invent 2016: Using MXNet for Recommendation Modeling at Scale (MAC306)AWS re:Invent 2016: Using MXNet for Recommendation Modeling at Scale (MAC306)
AWS re:Invent 2016: Using MXNet for Recommendation Modeling at Scale (MAC306)
Amazon Web Services
 
딥러닝프레임워크비교
딥러닝프레임워크비교딥러닝프레임워크비교
딥러닝프레임워크비교
Junyi Song
 
A Tale of Three Deep Learning Frameworks: TensorFlow, Keras, & PyTorch with B...
A Tale of Three Deep Learning Frameworks: TensorFlow, Keras, & PyTorch with B...A Tale of Three Deep Learning Frameworks: TensorFlow, Keras, & PyTorch with B...
A Tale of Three Deep Learning Frameworks: TensorFlow, Keras, & PyTorch with B...
Databricks
 

What's hot (20)

Getting started with TensorFlow
Getting started with TensorFlowGetting started with TensorFlow
Getting started with TensorFlow
 
Metta Innovations - Introdução ao Deep Learning aplicado a vídeo analytics
Metta Innovations - Introdução ao Deep Learning aplicado a vídeo analyticsMetta Innovations - Introdução ao Deep Learning aplicado a vídeo analytics
Metta Innovations - Introdução ao Deep Learning aplicado a vídeo analytics
 
How to apply deep learning to 3 d objects
How to apply deep learning to 3 d objectsHow to apply deep learning to 3 d objects
How to apply deep learning to 3 d objects
 
Approximate "Now" is Better Than Accurate "Later"
Approximate "Now" is Better Than Accurate "Later"Approximate "Now" is Better Than Accurate "Later"
Approximate "Now" is Better Than Accurate "Later"
 
CodeStock - Exploring .NET memory management - a trip down memory lane
CodeStock - Exploring .NET memory management - a trip down memory laneCodeStock - Exploring .NET memory management - a trip down memory lane
CodeStock - Exploring .NET memory management - a trip down memory lane
 
An Introduction to Supervised Machine Learning and Pattern Classification: Th...
An Introduction to Supervised Machine Learning and Pattern Classification: Th...An Introduction to Supervised Machine Learning and Pattern Classification: Th...
An Introduction to Supervised Machine Learning and Pattern Classification: Th...
 
Scalable high-dimensional indexing with Hadoop
Scalable high-dimensional indexing with HadoopScalable high-dimensional indexing with Hadoop
Scalable high-dimensional indexing with Hadoop
 
How to win data science competitions with Deep Learning
How to win data science competitions with Deep LearningHow to win data science competitions with Deep Learning
How to win data science competitions with Deep Learning
 
H2O Open Source Deep Learning, Arno Candel 03-20-14
H2O Open Source Deep Learning, Arno Candel 03-20-14H2O Open Source Deep Learning, Arno Candel 03-20-14
H2O Open Source Deep Learning, Arno Candel 03-20-14
 
H2O Distributed Deep Learning by Arno Candel 071614
H2O Distributed Deep Learning by Arno Candel 071614H2O Distributed Deep Learning by Arno Candel 071614
H2O Distributed Deep Learning by Arno Candel 071614
 
A Scalable Implementation of Deep Learning on Spark (Alexander Ulanov)
A Scalable Implementation of Deep Learning on Spark (Alexander Ulanov)A Scalable Implementation of Deep Learning on Spark (Alexander Ulanov)
A Scalable Implementation of Deep Learning on Spark (Alexander Ulanov)
 
Training Neural Networks
Training Neural NetworksTraining Neural Networks
Training Neural Networks
 
Intro to data oriented design
Intro to data oriented designIntro to data oriented design
Intro to data oriented design
 
MLconf - Distributed Deep Learning for Classification and Regression Problems...
MLconf - Distributed Deep Learning for Classification and Regression Problems...MLconf - Distributed Deep Learning for Classification and Regression Problems...
MLconf - Distributed Deep Learning for Classification and Regression Problems...
 
Distributed Deep Learning on AWS with Apache MXNet
Distributed Deep Learning on AWS with Apache MXNetDistributed Deep Learning on AWS with Apache MXNet
Distributed Deep Learning on AWS with Apache MXNet
 
Terabyte-scale image similarity search: experience and best practice
Terabyte-scale image similarity search: experience and best practiceTerabyte-scale image similarity search: experience and best practice
Terabyte-scale image similarity search: experience and best practice
 
Deep Learning for AI (2)
Deep Learning for AI (2)Deep Learning for AI (2)
Deep Learning for AI (2)
 
AWS re:Invent 2016: Using MXNet for Recommendation Modeling at Scale (MAC306)
AWS re:Invent 2016: Using MXNet for Recommendation Modeling at Scale (MAC306)AWS re:Invent 2016: Using MXNet for Recommendation Modeling at Scale (MAC306)
AWS re:Invent 2016: Using MXNet for Recommendation Modeling at Scale (MAC306)
 
딥러닝프레임워크비교
딥러닝프레임워크비교딥러닝프레임워크비교
딥러닝프레임워크비교
 
A Tale of Three Deep Learning Frameworks: TensorFlow, Keras, & PyTorch with B...
A Tale of Three Deep Learning Frameworks: TensorFlow, Keras, & PyTorch with B...A Tale of Three Deep Learning Frameworks: TensorFlow, Keras, & PyTorch with B...
A Tale of Three Deep Learning Frameworks: TensorFlow, Keras, & PyTorch with B...
 

Similar to Neural networks with python

Large Scale Deep Learning with TensorFlow
Large Scale Deep Learning with TensorFlow Large Scale Deep Learning with TensorFlow
Large Scale Deep Learning with TensorFlow
Jen Aman
 
The Flow of TensorFlow
The Flow of TensorFlowThe Flow of TensorFlow
The Flow of TensorFlow
Jeongkyu Shin
 
Statistics vs machine learning
Statistics vs machine learningStatistics vs machine learning
Statistics vs machine learning
Tom Dierickx
 
Neuromation.io AI Ukraine Presentation
Neuromation.io AI Ukraine PresentationNeuromation.io AI Ukraine Presentation
Neuromation.io AI Ukraine Presentation
Bohdan Klimenko
 
Fast and Scalable Python
Fast and Scalable PythonFast and Scalable Python
Fast and Scalable Python
Travis Oliphant
 
Apache Cassandra Lunch #54: Machine Learning with Spark + Cassandra Part 2
Apache Cassandra Lunch #54: Machine Learning with Spark + Cassandra Part 2Apache Cassandra Lunch #54: Machine Learning with Spark + Cassandra Part 2
Apache Cassandra Lunch #54: Machine Learning with Spark + Cassandra Part 2
Anant Corporation
 
Py tables
Py tablesPy tables
Py tables
Ali Hallaji
 
Large Data Analyze With PyTables
Large Data Analyze With PyTablesLarge Data Analyze With PyTables
Large Data Analyze With PyTables
Innfinision Cloud and BigData Solutions
 
PyTables
PyTablesPyTables
PyTables
Ali Hallaji
 
Monitoring Big Data Systems - "The Simple Way"
Monitoring Big Data Systems - "The Simple Way"Monitoring Big Data Systems - "The Simple Way"
Monitoring Big Data Systems - "The Simple Way"
Demi Ben-Ari
 
Apache Cassandra Lunch #50: Machine Learning with Spark + Cassandra
Apache Cassandra Lunch #50: Machine Learning with Spark + CassandraApache Cassandra Lunch #50: Machine Learning with Spark + Cassandra
Apache Cassandra Lunch #50: Machine Learning with Spark + Cassandra
Anant Corporation
 
Netflix machine learning
Netflix machine learningNetflix machine learning
Netflix machine learning
Amer Ather
 
Ml pipelines with Apache spark and Apache beam - Ottawa Reactive meetup Augus...
Ml pipelines with Apache spark and Apache beam - Ottawa Reactive meetup Augus...Ml pipelines with Apache spark and Apache beam - Ottawa Reactive meetup Augus...
Ml pipelines with Apache spark and Apache beam - Ottawa Reactive meetup Augus...
Holden Karau
 
Distributed machine learning 101 using apache spark from a browser devoxx.b...
Distributed machine learning 101 using apache spark from a browser   devoxx.b...Distributed machine learning 101 using apache spark from a browser   devoxx.b...
Distributed machine learning 101 using apache spark from a browser devoxx.b...
Andy Petrella
 
Big Data Analytics (ML, DL, AI) hands-on
Big Data Analytics (ML, DL, AI) hands-onBig Data Analytics (ML, DL, AI) hands-on
Big Data Analytics (ML, DL, AI) hands-on
Dony Riyanto
 
TensorfLow_Basic.pptx
TensorfLow_Basic.pptxTensorfLow_Basic.pptx
TensorfLow_Basic.pptx
TMUb202109065
 
Deep Learning
Deep LearningDeep Learning
Deep Learning
Büşra İçöz
 
Deep Dive on Deep Learning (June 2018)
Deep Dive on Deep Learning (June 2018)Deep Dive on Deep Learning (June 2018)
Deep Dive on Deep Learning (June 2018)
Julien SIMON
 
A Platform for Accelerating Machine Learning Applications
 A Platform for Accelerating Machine Learning Applications A Platform for Accelerating Machine Learning Applications
A Platform for Accelerating Machine Learning Applications
NVIDIA Taiwan
 
Python Machine Learning - Getting Started
Python Machine Learning - Getting StartedPython Machine Learning - Getting Started
Python Machine Learning - Getting Started
Rafey Iqbal Rahman
 

Similar to Neural networks with python (20)

Large Scale Deep Learning with TensorFlow
Large Scale Deep Learning with TensorFlow Large Scale Deep Learning with TensorFlow
Large Scale Deep Learning with TensorFlow
 
The Flow of TensorFlow
The Flow of TensorFlowThe Flow of TensorFlow
The Flow of TensorFlow
 
Statistics vs machine learning
Statistics vs machine learningStatistics vs machine learning
Statistics vs machine learning
 
Neuromation.io AI Ukraine Presentation
Neuromation.io AI Ukraine PresentationNeuromation.io AI Ukraine Presentation
Neuromation.io AI Ukraine Presentation
 
Fast and Scalable Python
Fast and Scalable PythonFast and Scalable Python
Fast and Scalable Python
 
Apache Cassandra Lunch #54: Machine Learning with Spark + Cassandra Part 2
Apache Cassandra Lunch #54: Machine Learning with Spark + Cassandra Part 2Apache Cassandra Lunch #54: Machine Learning with Spark + Cassandra Part 2
Apache Cassandra Lunch #54: Machine Learning with Spark + Cassandra Part 2
 
Py tables
Py tablesPy tables
Py tables
 
Large Data Analyze With PyTables
Large Data Analyze With PyTablesLarge Data Analyze With PyTables
Large Data Analyze With PyTables
 
PyTables
PyTablesPyTables
PyTables
 
Monitoring Big Data Systems - "The Simple Way"
Monitoring Big Data Systems - "The Simple Way"Monitoring Big Data Systems - "The Simple Way"
Monitoring Big Data Systems - "The Simple Way"
 
Apache Cassandra Lunch #50: Machine Learning with Spark + Cassandra
Apache Cassandra Lunch #50: Machine Learning with Spark + CassandraApache Cassandra Lunch #50: Machine Learning with Spark + Cassandra
Apache Cassandra Lunch #50: Machine Learning with Spark + Cassandra
 
Netflix machine learning
Netflix machine learningNetflix machine learning
Netflix machine learning
 
Ml pipelines with Apache spark and Apache beam - Ottawa Reactive meetup Augus...
Ml pipelines with Apache spark and Apache beam - Ottawa Reactive meetup Augus...Ml pipelines with Apache spark and Apache beam - Ottawa Reactive meetup Augus...
Ml pipelines with Apache spark and Apache beam - Ottawa Reactive meetup Augus...
 
Distributed machine learning 101 using apache spark from a browser devoxx.b...
Distributed machine learning 101 using apache spark from a browser   devoxx.b...Distributed machine learning 101 using apache spark from a browser   devoxx.b...
Distributed machine learning 101 using apache spark from a browser devoxx.b...
 
Big Data Analytics (ML, DL, AI) hands-on
Big Data Analytics (ML, DL, AI) hands-onBig Data Analytics (ML, DL, AI) hands-on
Big Data Analytics (ML, DL, AI) hands-on
 
TensorfLow_Basic.pptx
TensorfLow_Basic.pptxTensorfLow_Basic.pptx
TensorfLow_Basic.pptx
 
Deep Learning
Deep LearningDeep Learning
Deep Learning
 
Deep Dive on Deep Learning (June 2018)
Deep Dive on Deep Learning (June 2018)Deep Dive on Deep Learning (June 2018)
Deep Dive on Deep Learning (June 2018)
 
A Platform for Accelerating Machine Learning Applications
 A Platform for Accelerating Machine Learning Applications A Platform for Accelerating Machine Learning Applications
A Platform for Accelerating Machine Learning Applications
 
Python Machine Learning - Getting Started
Python Machine Learning - Getting StartedPython Machine Learning - Getting Started
Python Machine Learning - Getting Started
 

Recently uploaded

The Ipsos - AI - Monitor 2024 Report.pdf
The  Ipsos - AI - Monitor 2024 Report.pdfThe  Ipsos - AI - Monitor 2024 Report.pdf
The Ipsos - AI - Monitor 2024 Report.pdf
Social Samosa
 
Everything you wanted to know about LIHTC
Everything you wanted to know about LIHTCEverything you wanted to know about LIHTC
Everything you wanted to know about LIHTC
Roger Valdez
 
Learn SQL from basic queries to Advance queries
Learn SQL from basic queries to Advance queriesLearn SQL from basic queries to Advance queries
Learn SQL from basic queries to Advance queries
manishkhaire30
 
Global Situational Awareness of A.I. and where its headed
Global Situational Awareness of A.I. and where its headedGlobal Situational Awareness of A.I. and where its headed
Global Situational Awareness of A.I. and where its headed
vikram sood
 
一比一原版(UO毕业证)渥太华大学毕业证如何办理
一比一原版(UO毕业证)渥太华大学毕业证如何办理一比一原版(UO毕业证)渥太华大学毕业证如何办理
一比一原版(UO毕业证)渥太华大学毕业证如何办理
aqzctr7x
 
State of Artificial intelligence Report 2023
State of Artificial intelligence Report 2023State of Artificial intelligence Report 2023
State of Artificial intelligence Report 2023
kuntobimo2016
 
一比一原版(Chester毕业证书)切斯特大学毕业证如何办理
一比一原版(Chester毕业证书)切斯特大学毕业证如何办理一比一原版(Chester毕业证书)切斯特大学毕业证如何办理
一比一原版(Chester毕业证书)切斯特大学毕业证如何办理
74nqk8xf
 
一比一原版(UofS毕业证书)萨省大学毕业证如何办理
一比一原版(UofS毕业证书)萨省大学毕业证如何办理一比一原版(UofS毕业证书)萨省大学毕业证如何办理
一比一原版(UofS毕业证书)萨省大学毕业证如何办理
v3tuleee
 
一比一原版(牛布毕业证书)牛津布鲁克斯大学毕业证如何办理
一比一原版(牛布毕业证书)牛津布鲁克斯大学毕业证如何办理一比一原版(牛布毕业证书)牛津布鲁克斯大学毕业证如何办理
一比一原版(牛布毕业证书)牛津布鲁克斯大学毕业证如何办理
74nqk8xf
 
一比一原版(GWU,GW文凭证书)乔治·华盛顿大学毕业证如何办理
一比一原版(GWU,GW文凭证书)乔治·华盛顿大学毕业证如何办理一比一原版(GWU,GW文凭证书)乔治·华盛顿大学毕业证如何办理
一比一原版(GWU,GW文凭证书)乔治·华盛顿大学毕业证如何办理
bopyb
 
原版制作(Deakin毕业证书)迪肯大学毕业证学位证一模一样
原版制作(Deakin毕业证书)迪肯大学毕业证学位证一模一样原版制作(Deakin毕业证书)迪肯大学毕业证学位证一模一样
原版制作(Deakin毕业证书)迪肯大学毕业证学位证一模一样
u86oixdj
 
Intelligence supported media monitoring in veterinary medicine
Intelligence supported media monitoring in veterinary medicineIntelligence supported media monitoring in veterinary medicine
Intelligence supported media monitoring in veterinary medicine
AndrzejJarynowski
 
一比一原版(Dalhousie毕业证书)达尔豪斯大学毕业证如何办理
一比一原版(Dalhousie毕业证书)达尔豪斯大学毕业证如何办理一比一原版(Dalhousie毕业证书)达尔豪斯大学毕业证如何办理
一比一原版(Dalhousie毕业证书)达尔豪斯大学毕业证如何办理
mzpolocfi
 
一比一原版(BCU毕业证书)伯明翰城市大学毕业证如何办理
一比一原版(BCU毕业证书)伯明翰城市大学毕业证如何办理一比一原版(BCU毕业证书)伯明翰城市大学毕业证如何办理
一比一原版(BCU毕业证书)伯明翰城市大学毕业证如何办理
dwreak4tg
 
Predictably Improve Your B2B Tech Company's Performance by Leveraging Data
Predictably Improve Your B2B Tech Company's Performance by Leveraging DataPredictably Improve Your B2B Tech Company's Performance by Leveraging Data
Predictably Improve Your B2B Tech Company's Performance by Leveraging Data
Kiwi Creative
 
一比一原版(UMN文凭证书)明尼苏达大学毕业证如何办理
一比一原版(UMN文凭证书)明尼苏达大学毕业证如何办理一比一原版(UMN文凭证书)明尼苏达大学毕业证如何办理
一比一原版(UMN文凭证书)明尼苏达大学毕业证如何办理
nyfuhyz
 
06-04-2024 - NYC Tech Week - Discussion on Vector Databases, Unstructured Dat...
06-04-2024 - NYC Tech Week - Discussion on Vector Databases, Unstructured Dat...06-04-2024 - NYC Tech Week - Discussion on Vector Databases, Unstructured Dat...
06-04-2024 - NYC Tech Week - Discussion on Vector Databases, Unstructured Dat...
Timothy Spann
 
Challenges of Nation Building-1.pptx with more important
Challenges of Nation Building-1.pptx with more importantChallenges of Nation Building-1.pptx with more important
Challenges of Nation Building-1.pptx with more important
Sm321
 
原版制作(swinburne毕业证书)斯威本科技大学毕业证毕业完成信一模一样
原版制作(swinburne毕业证书)斯威本科技大学毕业证毕业完成信一模一样原版制作(swinburne毕业证书)斯威本科技大学毕业证毕业完成信一模一样
原版制作(swinburne毕业证书)斯威本科技大学毕业证毕业完成信一模一样
u86oixdj
 
End-to-end pipeline agility - Berlin Buzzwords 2024
End-to-end pipeline agility - Berlin Buzzwords 2024End-to-end pipeline agility - Berlin Buzzwords 2024
End-to-end pipeline agility - Berlin Buzzwords 2024
Lars Albertsson
 

Recently uploaded (20)

The Ipsos - AI - Monitor 2024 Report.pdf
Neural networks with python

  • 1. Neural Networks with Python Tom Dierickx Data Services Team Knowledge-Sharing December 7, 2018 42
  • 2. Today’s Agenda ● What is “learning” ? ● What is “machine learning”? ○ demo: decision tree (using scikit-learn and XGBoost inside PowerBI) ○ demo: logistic regression (using statsmodels w/ AWS SageMaker) ○ demo: neural network (using scikit-learn w/ Azure Notebooks) ● What is “deep learning”? ○ demo: neural network (using TensorFlow via Keras w/Google Colab) ● Where you can do it online today … for free! ● Resources and links
  • 3. What is “Learning” ? ● We typically think of learning in terms of the act, or subjective experience, of becoming aware of some new fact (which usually has a “feel” to it) or chunk of information (which generally has a larger “feel” to it) ● Note how this is a very personal, human-centered interpretation as it’s implicitly defined in terms of our own “consciousness” ● In truth, our brains give themselves a juicy hit of “dopamine” as a reward for each novel fact/information/news acquired … and it’s well-known that “emotional-to-us” things actually get packed away deeper into our long-term memories stronger and longer ● … BUT, a more objective viewpoint might be to define “learning” in terms of acquiring new skills that, hopefully, lead to increased accuracy, efficiency, and speed of recall for us in some subject area … after all, that’s why we humans evolved to learn, right? ● So, let’s define learning as “becoming more proficient” in something (e.g. spotting mistakes quicker, making fewer errors, connecting dots, etc.)
  • 4. What is “Machine Learning”? ● Continuous improvement in accurately “predicting” output values (called “labels”) from input values (called “features”) ● More specifically, automatically improving “on its own” against some pre-defined metric (usually some “cost function”) that compares predicted values versus actual values, across all observations [i.e. want minimal error rate% (in classification problems) or minimal RMSE (in regression problems)] ● Various ML algorithms are used to “train” (i.e. learn rules) against some “training data” until accuracy is thought “good enough”
  • 5. What is “Machine Learning”? (Cont.) ● The learned rules (usually in the form of weighted coefficients; sometimes tens of thousands of them, or even more!) are then applied against some unseen “test data” (i.e. usually just some data, like 20%, that was held out from being trained on) to validate that accuracy holds up on “new” data ● Want minimal under-fitting (aka, “high bias”) and minimal over-fitting (aka, “high variance”) ● Under-fitting can be improved with better features or a more flexible algorithm (more data alone rarely fixes it) ● Over-fitting can be improved with more training data, simpler models, and/or adding regularization parameters Q: Why is ML such a “big deal” and so hyped today? A: b/c it’s transforming our world as we speak! Instead of a programmer having to (somehow!) know all the conditional logic rules needed upfront to produce desired output, internal rules that “just work” are “magically” inferred Note: “learned rules” may not always be directly accessible or even interpretable
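The hold-out idea and the two metrics named on slides 4 and 5 can be sketched in plain Python. This is only a minimal illustration, not the code used in the talk's demos: the function names `train_test_split`, `error_rate`, and `rmse` are chosen here for readability, and the 20% hold-out and the toy data are assumptions.

```python
import math
import random


def train_test_split(rows, test_fraction=0.2, seed=42):
    """Shuffle a copy of the rows and hold out a fraction as unseen test data."""
    rng = random.Random(seed)
    shuffled = rows[:]
    rng.shuffle(shuffled)
    n_test = int(round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]


def error_rate(actual, predicted):
    """Fraction of misclassified observations (classification metric)."""
    wrong = sum(1 for a, p in zip(actual, predicted) if a != p)
    return wrong / len(actual)


def rmse(actual, predicted):
    """Root mean squared error (regression metric)."""
    squared = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared) / len(squared))


# Toy data, for illustration only.
rows = list(range(100))
train, test = train_test_split(rows)
print(len(train), len(test))                    # 80 rows to train on, 20 held out
print(error_rate([1, 0, 1, 1], [1, 1, 1, 0]))   # 2 of 4 labels wrong
print(rmse([3.0, 5.0], [2.0, 7.0]))
```

A model that scores well on `train` but much worse on `test` is over-fitting; one that scores poorly on both is under-fitting.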
  • 6. What is “Machine Learning”? (Cont.) ● Some popular tools of the trade (2018 edition) ○ Python (esp. Anaconda distro) is soaring (R is fading) ○ SQL (the “language”, in general) still essential ○ scikit-learn and TensorFlow (w/ Keras wrapper) very popular ML libraries ○ Apache Spark (big data backend) remains preferred over classic Hadoop
  • 7. Example: Decision Tree ● Attempt to find “natural splits” in data (usually by minimizing “entropy” and, thus, maximizing “information gain”; i.e., find the most homogeneous branches) ● Tend to overfit (thus, under-perform on unseen data); but can improve with ensemble methods: ○ Bagged trees: (i.e. multiple trees generated by random sampling; use aggregate “consensus”) ■ “Random forest” most common algorithm ○ Boosted trees: (i.e. build trees incrementally, each one “learning” from the errors of the prior trees; “snowball”) ■ “XGBoost” most popular algorithm
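To make “entropy” and “information gain” concrete, here is a small sketch that scores two candidate splits of the same labels. The helper names and the toy labels are assumptions made for illustration; real libraries such as scikit-learn do this search internally over every feature and threshold.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    counts = Counter(labels)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def information_gain(parent, branches):
    """Entropy of the parent minus the size-weighted entropy of its branches."""
    total = len(parent)
    weighted = sum(len(b) / total * entropy(b) for b in branches)
    return entropy(parent) - weighted


# Toy labels, for illustration only: 5 "yes" and 5 "no" observations.
parent = ["yes"] * 5 + ["no"] * 5
pure_split = [["yes"] * 5, ["no"] * 5]
mixed_split = [["yes", "yes", "no", "no", "yes"],
               ["no", "no", "yes", "yes", "no"]]

print(information_gain(parent, pure_split))   # perfectly homogeneous branches
print(information_gain(parent, mixed_split))  # branches barely better than the parent
```

The tree-building algorithm prefers the split with the higher gain, which here is the one that produces perfectly homogeneous branches.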
  • 8. Live Demo: Decision Tree ● PowerBI Desktop to: ○ fetch dataset on 80k+ UFO sightings from the interwebs via URL ○ serve as an interactive, GUI reporting container to slice & dice things ● Python scripts inside PowerBI to: ○ download historical lunar phase data from external website ○ combine everything using Pandas ○ predict most likely times for UFO sightings using: ■ scikit-learn module to build a “simple” decision tree ■ XGBoost module to gain even better results.
  • 9. Example: Logistic Regression ● Good for predicting binary output (i.e. 1/0, yes/no, true/false, win/loss, pass/fail, in/out) ● Models “probability” [0 ≤ p ≤ 1] of “Y/N” responses; good for binary classification
  • 10. Live Demo: Logistic Regression ● AWS SageMaker platform to: ○ Create a Jupyter notebook in the cloud ○ Look at NFL turnover +/- margin vs win/loss for week 13 games ○ Use statsmodels library to perform logistic regression ○ Use seaborn plotting library for creating nice visuals
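How logistic regression turns a number into a probability between 0 and 1 can be shown with the sigmoid function directly. The intercept and slope below are made-up values, not coefficients fitted to the NFL data from the demo, and `predict_probability` is a name chosen here for illustration.

```python
import math


def sigmoid(z):
    """Squash any real number into the open interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def predict_probability(x, intercept, slope):
    """Probability of the positive class (e.g. a win) for one input value."""
    return sigmoid(intercept + slope * x)


# Hypothetical coefficients, for illustration only (not fitted to real data).
intercept, slope = 0.0, 0.8

for turnover_margin in (-2, 0, 2):
    p = predict_probability(turnover_margin, intercept, slope)
    label = "win" if p >= 0.5 else "loss"
    print(turnover_margin, round(p, 3), label)
```

In the live demo, a library such as statsmodels estimates the intercept and slope from the data; the prediction step is then just this formula.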
  • 11. Example: Neural Network ● A neural network is similar to logistic regression in some ways, but: ○ Has a hidden layer in the middle, with multiple nodes, instead of a single output ○ These nodes (called “neurons”) in the middle each generate their own 0 ≤ value ≤ 1 (when a sigmoid activation is used) ○ Other activation functions to introduce non-linearity besides “sigmoid” function can be used ○ Output layer can support multiple, predicted output values (p.s. though not shown below) ● Technical notes: ○ Weights are learned by gradient descent, an optimization algorithm that follows the partial derivatives of the cost function ○ Those partial derivatives are computed by a process called backpropagation, and the weights are updated iteratively ○ Can take many iterations for the cost function to be minimized and converge
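The gradient descent loop from the technical notes can be sketched on the smallest possible case: a single sigmoid neuron (no hidden layer) learning the logical OR function. This is a deliberate simplification and not the network from the talk; the learning rate, epoch count, and function names are assumptions. With one neuron, the partial derivatives can be written out directly, whereas a real network needs backpropagation to pass them back through each layer.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def predict(weights, bias, x):
    """Output of one sigmoid neuron for one input row."""
    return sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias)


def cost(weights, bias, inputs, targets):
    """Average cross-entropy between predictions and actual labels."""
    total = 0.0
    for x, y in zip(inputs, targets):
        p = predict(weights, bias, x)
        total += -(y * math.log(p) + (1 - y) * math.log(1 - p))
    return total / len(inputs)


def train(inputs, targets, learning_rate=0.5, epochs=2000):
    """Plain batch gradient descent; returns weights, bias, and cost history."""
    weights = [0.0] * len(inputs[0])
    bias = 0.0
    history = []
    n = len(inputs)
    for _ in range(epochs):
        grad_w = [0.0] * len(weights)
        grad_b = 0.0
        for x, y in zip(inputs, targets):
            error = predict(weights, bias, x) - y  # d(cost)/dz for sigmoid + cross-entropy
            for i, xi in enumerate(x):
                grad_w[i] += error * xi
            grad_b += error
        # Step each parameter against its average partial derivative.
        weights = [w - learning_rate * g / n for w, g in zip(weights, grad_w)]
        bias -= learning_rate * grad_b / n
        history.append(cost(weights, bias, inputs, targets))
    return weights, bias, history


# Toy data: logical OR.
inputs = [(0, 0), (0, 1), (1, 0), (1, 1)]
targets = [0, 1, 1, 1]
weights, bias, history = train(inputs, targets)
print("cost after first and last epoch:", history[0], history[-1])
print([round(predict(weights, bias, x)) for x in inputs])
```

The cost history shrinks as the epochs go by, which is the “many iterations to converge” behaviour the slide describes.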
  • 12. Live Demo: Neural Network ● Microsoft Azure Notebooks platform to: ○ Create a Jupyter notebook in the cloud ○ Look at tic-tac-toe board configurations (https://datahub.io/machine-learning/tic-tac-toe-endgame) ○ Use scikit-learn library to train a “simple” neural network to “learn” what combination of moves equates to winning or losing for X ○ Validate a prediction by hand to show how the math works
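Validating a prediction “by hand” amounts to repeating one calculation per layer: a weighted sum per neuron, followed by an activation. The sketch below does that for one tic-tac-toe board. Everything specific in it is assumed for illustration: the board encoding (x = 1, o = -1, blank = 0), the two hidden neurons, and all weights are made up rather than taken from the trained model in the demo.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def dense_layer(inputs, weights, biases):
    """One fully connected layer: weighted sum per neuron, then sigmoid."""
    return [
        sigmoid(sum(w * x for w, x in zip(neuron_weights, inputs)) + b)
        for neuron_weights, b in zip(weights, biases)
    ]


# Board read row by row; assumed encoding: x = 1, o = -1, blank = 0.
board = [1, 1, 1,
         -1, -1, 0,
         0, 0, 0]

# Made-up parameters: 9 inputs -> 2 hidden neurons -> 1 output neuron.
hidden_weights = [[0.5] * 9, [-0.5] * 9]
hidden_biases = [0.0, 0.0]
output_weights = [[2.0, -2.0]]
output_biases = [0.0]

hidden = dense_layer(board, hidden_weights, hidden_biases)
output = dense_layer(hidden, output_weights, output_biases)

print("hidden activations:", hidden)
print("predicted probability that X wins:", output[0])
```

With a trained model, the same steps apply after swapping in the learned weights and whichever activation function the model was configured with.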
  • 13. What is “Deep Learning”? (Cont.) ● There really is no exact definition, but it is common to implicitly refer to a subset of machine learning types that focus mostly on very “deep” neural networks (aka, many hidden layers and nodes) ● Some evolving variants, like recurrent neural networks (RNN) for sequential data (think: speech recognition) or convolutional neural networks (CNN) for image data (think: image recognition) perform clever, custom calculations and connect the hidden layers together in slightly different ways to perform even better (i.e. faster, more accurately, with fewer calculations needed) than a “classic” feed forward network (FFN)
  • 14. What is “Deep Learning”? (Cont.) GoogLeNet (Inception v1)— Winner of ILSVRC 2014 (Image Classification)
  • 15. Current Landscape ● This is the best picture I have seen depicting how the various pieces of the data and analytics landscape relate to each other (... and my “problem” is I find every piece so interesting in its own right that I feel like I never know enough about any of them!!) Source: https://www.machinecurve.com/index.php/2017/09/30/the-differences-between-artificial-intelligence-machine-learning-more/
  • 16. Example: Deep Neural Network ● Neural network having multiple hidden layers and more nodes, so it can “learn” more complex patterns … but requires much more data to do so, of course ● Newer architectures even employ different types of hidden layer nodes ● More complicated networks even stitch together multiple networks into one larger network in a pipeline fashion ● It’s common to plug in pre-trained networks, especially in audio and/or vision applications, so you don’t have to train from scratch; this is called transfer learning
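One way to see why deeper, wider networks need much more data is to count the weights and biases they have to learn. The sketch below does the arithmetic for fully connected layers; the two layer layouts are arbitrary examples chosen for illustration, not the architectures used in the demos.

```python
def count_parameters(layer_sizes):
    """Weights plus biases for a fully connected network with these layer sizes."""
    total = 0
    for n_in, n_out in zip(layer_sizes, layer_sizes[1:]):
        total += n_in * n_out + n_out  # one weight per connection, one bias per neuron
    return total


# Arbitrary example layouts: input size, hidden layer sizes, output size.
shallow = [9, 5, 1]
deep = [9, 64, 64, 64, 1]

print(count_parameters(shallow))  # 56
print(count_parameters(deep))     # 9025
```

Every one of those parameters has to be estimated from training data, which is also part of the appeal of transfer learning: most of them arrive pre-trained.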
  • 17. Live Demo: Deep Neural Network ● Google Colab platform to: ○ Create a Jupyter-based notebook in the cloud ○ Look at 60 years’ worth of daily weather data for Rockford, IL (generated from https://www.ncdc.noaa.gov) ○ Upload raw file to Google Drive ○ Use Keras wrapper library on top of TensorFlow library to train a “deep” neural network to “learn” for us whether it will rain or snow on the upcoming Saturday, given that today’s weather is X
  • 18. Where you can do it online - for free! There’s a bit of a “space race” to take over the world through AI and ML, and with cloud-based computing now a ubiquitous commodity resource, typically metered by the hour, there are lots of 100% free (for now, anyway) places to learn and practice ML (generally) and Neural Networks (specifically) beyond just your own laptop ● Google Colab (Python; 20GB RAM, free GPU/TPU hardware) ● Kaggle (Python or R; 17GB RAM; acquired by Google in 2017; compete for prizes!) ● Azure Notebooks (Python or R or F#; 4GB RAM) ● Amazon SageMaker (Python, R, Scala; 4GB RAM, access to AWS ecosystem, free tier = 250 hours limit) ● IBM Watson Studio (Python, R, Scala; 4GB RAM, feature-rich options) ● Many more out there popping up every day...
  • 19. Resources and links ● “The differences between AI, machine learning & more” https://www.machinecurve.com/index.php/2017/09/30/the-differences-between-artificial-intelligence-machine-learning-more/ ● “Introduction to Data Science” https://www.saedsayad.com/data_mining_map.htm ● “Definitions of common machine learning terms” https://ml-cheatsheet.readthedocs.io/en/latest/glossary.html ● “Decision Trees and Boosting, XGBoost | Two Minute Papers #55” https://www.youtube.com/watch?v=0Xc9LIb_HTw ● “Logistic Regression - Fun and Easy Machine Learning” https://www.youtube.com/watch?v=7qJ7GksOXoA ● “3 Blue, 1 Brown: Neural networks” https://www.youtube.com/playlist?list=PLZHQObOWTQDNU6R1_67000Dx_ZCJB-3pi ● “An introduction to Machine Learning (and a little bit of Deep Learning)” https://www.slideshare.net/ThomasDaSilvaPaula/an-introduction-to-machine-learning-and-a-little-bit-of-deep-learning ● “Modern Convolutional Neural Network techniques for image segmentation” https://www.slideshare.net/GioeleCiaparrone/modern-convolutional-neural-network-techniques-for-image-segmentation ● “Neural Networks and Deep Learning” free online course https://www.coursera.org/learn/neural-networks-deep-learning ● “NUFORC geolocated and time standardized UFO reports” https://github.com/planetsig/ufo-reports