AI/ML/DL
Şaban Dalaman
Computer Engineer, MSc.
https://www.linkedin.com/in/sabandalaman/
https://twitter.com/sdaPCA
• 1989 – Graduated from Middle East Technical University
• 1989-1996 – Turkish Airlines, System Engineer in IT
• 1997-1998 – Turkish Air Force, Military Duty
• 1998-2014 – Koçbank/Yapı Kredi Bank, Application Developer/Department Manager in IT
• 2014-2019 – Freelancer, Master of Science in Data Science
Alan Mathison Turing
The universal machine
Computation Theory
Thinking Computers
If computers were practically universal, they should be able to do anything. In 1948 Turing wrote:
"The importance of the universal machine is clear. We do not need to have an
infinity of different machines doing different jobs. A single one will suffice. The
engineering problem of producing various machines for various jobs is replaced
by the office work of `programming' the universal machine to do these jobs."
Alan M. Turing, Computing Machinery and Intelligence, 1950:
"Can a machine think?"
We propose that a 2 month, 10 man study of artificial intelligence
be carried out during the summer of 1956 at Dartmouth College in
Hanover, New Hampshire. The study is to proceed on the basis of
the conjecture that every aspect of learning or any other feature of
intelligence can in principle be so precisely described that a
machine can be made to simulate it. An attempt will be made to
find how to make machines use language, form abstractions and
concepts, solve kinds of problems now reserved for humans, and
improve themselves. We think that a significant advance can be
made in one or more of these problems if a carefully selected group
of scientists work on it together for a summer.
. . .
For the present purpose the artificial intelligence problem is taken
to be that of making a machine behave in ways that would be
called intelligent if a human were so behaving.
Summer Research Project on Artificial Intelligence
John McCarthy, Marvin Minsky, Nathaniel Rochester and Claude Shannon, 1955
Artificial Intelligence is a very broad term. It is an attempt to make computers think
like human beings. Any technique, code or algorithm that enables machines to
develop, mimic, and demonstrate human cognitive abilities or behaviors falls under
this category.
Machine learning is the study of algorithms and mathematical models that computer
systems use. It is a computer's ability to learn from a set of data and adapt itself
without being explicitly programmed to do so.
Deep learning is a subset of machine learning and one of its most popular families of
algorithms. Deep learning models are built from artificial neural networks (ANNs) with many layers.
Data Science is a fairly general term for processes and methods that analyze and
manipulate data. It enables artificial intelligence to find meaning and appropriate
information from large volumes of data with greater speed and efficiency. Data
science makes it possible for us to use data to make key decisions not just in
business, but also increasingly in science, technology, and even politics.
• Symbolists
• Connectionists
• Evolutionaries
• Bayesians
• Analogizers
Five tribes of AI
The Master Algorithm (Pedro Domingos): an algorithm capable of finding knowledge and
generalizing from any kind of data. Such an algorithm would have to use paradigms and
techniques from each and every tribe.
Diagram: artificial intelligence encompasses machine learning (supervised, unsupervised,
and reinforcement learning) and overlaps with big data analytics.
What is Machine Learning (ML)?
“A computer program is said to learn from experience E with respect to
some task T and some performance measure P, if its performance on T, as
measured by P, improves with experience E."
Tom Mitchell, 1997.
Why Machine Learning?
• It is very hard to write programs that solve problems like recognizing a face.
  - We don't know what program to write because we don't know how our brain does it.
  - Even if we had a good idea about how to do it, the program might be horrendously complicated.
• Instead of writing a program by hand, we collect lots of examples that specify the correct output for a given input.
• A machine learning algorithm then takes these examples and produces a program that does the job.
  - If we do it right, the program works for new cases as well as the ones on which we trained it.
Traditional Programming vs. Machine Learning
Diagram: in traditional programming, data and a hand-written program go in, the task is
performed, and results come out. In machine learning, data and the desired results (plus
requirements) go in, a learning algorithm produces an ML model, and that model then
performs the task to produce results on new data.
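A minimal sketch of this contrast, assuming scikit-learn is available; the toy spam-filter rule, data and labels are purely illustrative.

# Traditional programming: a human writes the rule that maps data to results.
def is_spam_rule(email: str) -> bool:
    return "free money" in email.lower()      # rule written by a programmer

# Machine learning: we supply examples of inputs plus correct outputs,
# and the algorithm produces the "program" (a fitted model) for us.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

emails = ["free money now", "meeting at 10am", "claim your free prize", "project update"]
labels = [1, 0, 1, 0]                          # 1 = spam, 0 = not spam (toy data)

model = make_pipeline(CountVectorizer(), LogisticRegression())
model.fit(emails, labels)                      # learning step: data + results -> model
print(model.predict(["free prize inside"]))    # the learned model now performs the task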
Machine Learning Workflow
Problem understanding → evaluation criteria → baseline → prepare data → train model →
error analysis → production (trained model, ready to use).
Error analysis feeds back into the loop: try a simpler or a more complex model, add more
data or features, improve problem understanding, or set a better baseline.
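A minimal sketch of this workflow, assuming scikit-learn and its built-in breast-cancer dataset; the chosen metric, models and steps are illustrative only.

from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.dummy import DummyClassifier
from sklearn.preprocessing import StandardScaler
from sklearn.pipeline import make_pipeline
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score

X, y = load_breast_cancer(return_X_y=True)

# Prepare data: hold out a test set for the evaluation criterion (accuracy here).
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Baseline: a trivial model that every later model must beat.
baseline = DummyClassifier(strategy="most_frequent").fit(X_train, y_train)
print("baseline accuracy:", accuracy_score(y_test, baseline.predict(X_test)))

# Train a model; error analysis on its mistakes drives the next iteration
# (simpler/more complex model, more data or features) before production.
model = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))
model.fit(X_train, y_train)
print("model accuracy:   ", accuracy_score(y_test, model.predict(X_test)))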
Ordinary Least Squares Regression (OLSR)
Linear Regression
Logistic Regression
Stepwise Regression
Multivariate Adaptive Regression Splines (MARS)
Locally Estimated Scatterplot Smoothing (LOESS)
Regression Algorithms
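A minimal sketch of the first two algorithms in the list above (ordinary least squares / linear regression), assuming scikit-learn; the toy data is illustrative.

import numpy as np
from sklearn.linear_model import LinearRegression

X = np.array([[1.0], [2.0], [3.0], [4.0]])   # one feature
y = np.array([2.1, 4.0, 6.2, 7.9])           # roughly y = 2x

ols = LinearRegression().fit(X, y)           # ordinary least squares fit
print(ols.coef_, ols.intercept_)             # learned slope and intercept
print(ols.predict([[5.0]]))                  # prediction for a new input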
k-Nearest Neighbor (kNN)
Learning Vector Quantization (LVQ)
Self-Organizing Map (SOM)
Locally Weighted Learning (LWL)
Instance-based Algorithms
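A minimal k-Nearest Neighbor sketch, assuming scikit-learn and its built-in iris dataset; instance-based methods keep the training examples and classify new points by their nearest stored neighbors.

from sklearn.datasets import load_iris
from sklearn.neighbors import KNeighborsClassifier

X, y = load_iris(return_X_y=True)
knn = KNeighborsClassifier(n_neighbors=5).fit(X, y)   # "training" = storing the instances
print(knn.predict(X[:3]))                             # vote among the 5 nearest neighbors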
Ridge Regression
Least Absolute Shrinkage and Selection Operator
(LASSO)
Elastic Net
Least-Angle Regression (LARS)
Regularization Algorithms
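A minimal sketch of the regularization methods above, assuming scikit-learn and its diabetes dataset: Ridge (L2), LASSO (L1) and Elastic Net (both) shrink coefficients to curb overfitting.

from sklearn.datasets import load_diabetes
from sklearn.linear_model import Ridge, Lasso, ElasticNet

X, y = load_diabetes(return_X_y=True)
for model in (Ridge(alpha=1.0), Lasso(alpha=0.1), ElasticNet(alpha=0.1)):
    model.fit(X, y)
    # LASSO / Elastic Net drive some coefficients exactly to zero (feature selection).
    print(type(model).__name__, int((model.coef_ == 0).sum()), "zeroed coefficients")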
Classification and Regression Tree (CART)
Iterative Dichotomiser 3 (ID3)
C4.5 and C5.0 (different versions of a powerful approach)
Chi-squared Automatic Interaction Detection (CHAID)
Decision Stump
M5
Conditional Decision Trees
Decision Tree Algorithms
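A minimal decision-tree sketch; scikit-learn's DecisionTreeClassifier implements a CART-style tree, and the iris dataset is used only for illustration.

from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier, export_text

X, y = load_iris(return_X_y=True)
tree = DecisionTreeClassifier(max_depth=2, random_state=0).fit(X, y)
print(export_text(tree))   # the learned if/then split rules, human readable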
Naive Bayes
Gaussian Naive Bayes
Multinomial Naive Bayes
Averaged One-Dependence Estimators (AODE)
Bayesian Belief Network (BBN)
Bayesian Network (BN)
Bayesian Algorithms
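A minimal Gaussian Naive Bayes sketch, assuming scikit-learn; the model applies Bayes' rule under a conditional-independence assumption on the features.

from sklearn.datasets import load_iris
from sklearn.naive_bayes import GaussianNB

X, y = load_iris(return_X_y=True)
nb = GaussianNB().fit(X, y)       # one Gaussian per feature per class
print(nb.predict_proba(X[:1]))    # class probabilities via Bayes' rule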
k-Means
k-Medians
Expectation Maximisation (EM)
Hierarchical Clustering
Clustering Algorithms
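A minimal k-Means sketch, assuming scikit-learn; clustering is unsupervised, so no labels are supplied, and the toy points are illustrative.

import numpy as np
from sklearn.cluster import KMeans

X = np.array([[1.0, 1.0], [1.2, 0.9], [8.0, 8.0], [8.1, 7.9], [0.9, 1.1], [7.9, 8.2]])
km = KMeans(n_clusters=2, n_init=10, random_state=0).fit(X)
print(km.labels_)            # cluster assignment of each point
print(km.cluster_centers_)   # the two learned cluster centers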
Perceptron
Back-Propagation
Hopfield Network
Radial Basis Function Network (RBFN)
Artificial Neural Network Algorithms
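A minimal sketch of a single Perceptron and a small multi-layer network trained with back-propagation, using scikit-learn's implementations on the iris dataset.

from sklearn.datasets import load_iris
from sklearn.linear_model import Perceptron
from sklearn.neural_network import MLPClassifier

X, y = load_iris(return_X_y=True)
print(Perceptron(max_iter=1000).fit(X, y).score(X, y))      # single-layer perceptron
mlp = MLPClassifier(hidden_layer_sizes=(16,), max_iter=2000, random_state=0)
print(mlp.fit(X, y).score(X, y))                            # trained via back-propagation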
Deep Boltzmann Machine (DBM)
Deep Belief Networks (DBN)
Convolutional Neural Network (CNN)
Stacked Auto-Encoders
Deep Learning Algorithms
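A minimal convolutional-network sketch, assuming TensorFlow/Keras is available; the layer sizes and 28x28 grayscale input shape are illustrative, not tuned.

from tensorflow import keras
from tensorflow.keras import layers

model = keras.Sequential([
    layers.Input(shape=(28, 28, 1)),            # e.g. grayscale 28x28 images
    layers.Conv2D(16, 3, activation="relu"),    # convolutions learn local features
    layers.MaxPooling2D(),
    layers.Conv2D(32, 3, activation="relu"),
    layers.MaxPooling2D(),
    layers.Flatten(),
    layers.Dense(10, activation="softmax"),     # 10-class output
])
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
model.summary()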
Principal Component Analysis (PCA)
Principal Component Regression (PCR)
Partial Least Squares Regression (PLSR)
Linear Discriminant Analysis (LDA)
Mixture Discriminant Analysis (MDA)
Quadratic Discriminant Analysis (QDA)
Flexible Discriminant Analysis (FDA)
Dimensionality Reduction Algorithms
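A minimal PCA sketch, assuming scikit-learn: project the data onto the directions of greatest variance to reduce its dimensionality.

from sklearn.datasets import load_iris
from sklearn.decomposition import PCA

X, _ = load_iris(return_X_y=True)
pca = PCA(n_components=2).fit(X)
print(pca.explained_variance_ratio_)   # variance captured by each component
X_2d = pca.transform(X)                # 4 features -> 2 components
print(X_2d.shape)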
Boosting
Bootstrapped Aggregation (Bagging)
AdaBoost
Stacked Generalization (blending)
Gradient Boosting Machines (GBM)
Gradient Boosted Regression Trees (GBRT)
Random Forest
Ensemble Algorithms
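A minimal ensemble sketch, assuming scikit-learn: bagging (Random Forest) and boosting (Gradient Boosting) combine many trees into one stronger predictor.

from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier, GradientBoostingClassifier
from sklearn.model_selection import cross_val_score

X, y = load_breast_cancer(return_X_y=True)
for model in (RandomForestClassifier(n_estimators=100, random_state=0),
              GradientBoostingClassifier(random_state=0)):
    scores = cross_val_score(model, X, y, cv=5)       # 5-fold cross-validation
    print(type(model).__name__, round(scores.mean(), 3))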
Computational intelligence (evolutionary algorithms, etc.)
Natural Language Processing (NLP)
Recommender Systems
Reinforcement Learning
Graphical Models
Other Algorithms
Process automation: one of the most common applications of machine learning in
finance. The technology makes it possible to replace manual work, automate repetitive
tasks, and increase productivity.
Security: fraud detection, identification of suspicious account behavior, financial
monitoring, network security.
Credit scoring: processing customer profiles and producing credit scores in real-world
environments.
Algorithmic trading: better trading decisions by analyzing thousands of data sources
simultaneously.
Robo-advisory: portfolio management and recommendation of financial products.
Deep Learning
Deep Learning : Biological Inspiration
A connected network of neurons that communicate by electric and chemical signals.
~10^11 neurons, ~1,000 synapses per neuron.
Signals come in via the dendrites into the soma; the signal goes out via the axon to
other neurons through synapses.
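A minimal sketch of the artificial analogue of the neuron described above, using only NumPy; the weights, inputs and bias are illustrative numbers.

import numpy as np

def neuron(inputs, weights, bias):
    z = np.dot(weights, inputs) + bias   # weighted sum of incoming signals ("dendrites" into "soma")
    return 1.0 / (1.0 + np.exp(-z))      # sigmoid activation = strength sent down the "axon"

print(neuron(np.array([0.5, 0.1, 0.9]), np.array([0.4, -0.2, 0.7]), bias=0.1))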
Geoffrey Hinton, Yoshua Bengio, Yann LeCun
Andrew Ng, Jürgen Schmidhuber
DL Model Examples
GoogLeNet
Challenges in DL
DL Applications
Representation Learning
Transfer Learning
Deep Generative Networks
Discriminative vs. Generative Models
Discriminative models. Goal: learn a function that maps x -> y (e.g., an image x to the label "Cat").
Generative models. Goal: learn some underlying hidden structure of the training samples in
order to generate novel samples from the same data distribution.
Generative Modeling
Diagram: a family of distributions P containing the true data distribution p_data and the
model distribution p_model; training examples are drawn from p_data, model samples from p_model.
Assumptions on P:
• tractable sampling
• tractable likelihood function
(Slides adapted from Sebastian Nowozin)
Why study deep generative models?
• Go beyond associating inputs to outputs
• Understand high-dimensional, complex probability distributions
• Discover the “true” structure of the data
• Detect surprising events in the world (anomaly detection)
• Handle missing data (semi-supervised learning)
• Generate models for planning (model-based reinforcement learning)
Model Examples
A model that separates noise within a video and singles out individual voices in a crowd
using deep learning.
Video-based emotion recognition using CNN-RNN and C3D hybrid networks
GoogLeNet
Seq2seq Neural Machine Translation
Web Traffic Time Series Forecasting
Generative Adversarial Networks
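A minimal GAN sketch, assuming TensorFlow/Keras is available: a generator maps random noise to fake samples while a discriminator learns to tell real from fake; shapes and layer sizes are illustrative and no training loop is shown.

from tensorflow import keras
from tensorflow.keras import layers

latent_dim = 64

generator = keras.Sequential([
    layers.Input(shape=(latent_dim,)),
    layers.Dense(128, activation="relu"),
    layers.Dense(784, activation="sigmoid"),   # fake flattened 28x28 "image" in [0, 1]
])

discriminator = keras.Sequential([
    layers.Input(shape=(784,)),
    layers.Dense(128, activation="relu"),
    layers.Dense(1, activation="sigmoid"),     # probability the input is real
])
discriminator.compile(optimizer="adam", loss="binary_crossentropy")

# Adversarial step: freeze the discriminator and train the generator so that
# generated samples are scored as "real".
discriminator.trainable = False
gan = keras.Sequential([generator, discriminator])
gan.compile(optimizer="adam", loss="binary_crossentropy")
gan.summary()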
Natural Language Processing (NLP) is the set of techniques that uses algorithmic
approaches to help a computer understand and process human language.
Applications of NLP
• Machine Translation
• Automatic text summarization
• Sentence segmentation
• Predicting words
• Sentiment analysis
• Document classification
• Document clustering
• Tagging
• Paraphrase Detection
• Natural language generation
• ...and many more.
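A minimal sketch of one application listed above (sentiment analysis), assuming scikit-learn; the tiny bag-of-words dataset and labels are purely illustrative.

from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

texts  = ["great movie, loved it", "terrible plot and bad acting",
          "wonderful and moving", "boring, awful film"]
labels = [1, 0, 1, 0]   # 1 = positive, 0 = negative

clf = make_pipeline(TfidfVectorizer(), LogisticRegression())
clf.fit(texts, labels)                         # learn word weights from labeled examples
print(clf.predict(["what a wonderful film"]))  # classify a new review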
Thank you
