Probability for Data Scientists
Dr. Ferdin Joe John Joseph
Machine Learning
Machine Learning is an interdisciplinary field in Data Science that uses
• statistics
• probability
• algorithms
to learn from data and provide insights which can be used to build
intelligent applications.
Today We Learn
Probability in Real Life
Probability for Data Science
• Probability deals with predicting the likelihood of
future events, while statistics involves the
analysis of the frequency of past events.
Terminologies
• Event
• Random Variable
• Empirical Probability
• Theoretical Probability
• Joint Probability
• Conditional Probability
Event
• An event is a set of outcomes of an experiment to which a probability
is assigned.
• E denotes an event.
• P(E) is the probability that the event E occurs.
• A situation where E might happen (success) or might not happen
(failure) is called a trial.
• Examples: tossing a coin, rolling dice, pulling a colored ball out of a bag
Random Variable
• A variable that represents the outcome of an event is called a
random variable.
• E.g., the result (head or tail) of tossing a coin
Random variable in tossing a coin
• If we toss a fair coin, the chances of getting a head or a tail are 50-50.
• The probability of getting a head is ½ (50%), and likewise for a tail.
• A probability always lies between 0 and 1.
Empirical Probability
• Also known as practical probability
• It is the number of times the event occurs divided by the total
number of trials observed.
• If we observe 's' successes in 'n' trials, the empirical probability of
success is s/n.
• Toss a coin 4 times. The outcome is H, H, H, T
• P(Head) = 3/4 = 0.75
• P(Tail) = 1/4 = 0.25
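The s/n formula can be sketched in Python, first on the four tosses from the slide and then on a larger simulated run. The helper name `empirical_probability`, the seed and the number of simulated tosses are illustrative assumptions, not part of the original deck.

```python
import random

def empirical_probability(outcomes, event):
    """Return s/n: the fraction of observed outcomes equal to `event`."""
    return outcomes.count(event) / len(outcomes)

# The four tosses from the slide
tosses = ["H", "H", "H", "T"]
p_head = empirical_probability(tosses, "H")  # 3 successes in 4 trials
p_tail = empirical_probability(tosses, "T")  # 1 success in 4 trials
print(p_head, p_tail)

# With many simulated tosses of a fair coin the estimate moves toward 1/2
random.seed(1)  # arbitrary seed, only for repeatability
many_tosses = [random.choice("HT") for _ in range(10_000)]
p_head_many = empirical_probability(many_tosses, "H")
print(p_head_many)
```

With only four tosses the empirical value can be far from the theoretical one; the gap usually shrinks as the number of trials grows.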
Theoretical Probability
• The number of ways the particular event can occur divided by the
total number of possible outcomes.
• For a fair coin, a head can occur in one way and there are two possible
outcomes (head, tail), so the true (theoretical) probability of a head is 1/2.
Exercise 1
A die is rolled, find the probability that an even number is obtained.
Solution:
Let us first write the sample space S of the experiment.
S = {1,2,3,4,5,6}
Let E be the event "an even number is obtained" and write it down.
E = {2,4,6}
We now use the formula of the classical probability.
P(E) = n(E) / n(S) = 3 / 6 = 1 / 2
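The same solution can be checked with Python sets. This is a small sketch; the variable names mirror the S and E of the solution, and `Fraction` is used only to keep the result exact.

```python
from fractions import Fraction

S = {1, 2, 3, 4, 5, 6}            # sample space of one die roll
E = {x for x in S if x % 2 == 0}  # event: an even number is obtained

p_even = Fraction(len(E), len(S))  # P(E) = n(E) / n(S)
print(sorted(E), p_even)
```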
Exercise 2
Two coins are tossed, find the probability that two heads are obtained.
Note: Each coin has two possible outcomes H (heads) and T (Tails).
The sample space S is given by.
S = {(H,T),(H,H),(T,H),(T,T)}
Let E be the event "two heads are obtained".
E = {(H,H)}
We use the formula of the classical probability.
P(E) = n(E) / n(S) = 1 / 4
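For larger experiments the sample space can be enumerated instead of written by hand. A minimal sketch with `itertools.product`; the variable names are illustrative.

```python
from fractions import Fraction
from itertools import product

# All ordered outcomes of tossing two coins
S = list(product("HT", repeat=2))
E = [outcome for outcome in S if outcome == ("H", "H")]  # two heads

p_two_heads = Fraction(len(E), len(S))  # P(E) = n(E) / n(S)
print(S, p_two_heads)
```

Changing `repeat=2` to another value enumerates the outcomes for that many coins.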
Exercise 3
A card is drawn at random from a deck of cards. Find the probability of
getting the 3 of diamonds.
The sample space S of this experiment is the 52 cards of the deck.
Let E be the event "getting the 3 of diamonds". An examination of the
sample space shows that there is exactly one 3 of diamonds, so n(E) = 1
and n(S) = 52. Hence the probability of event E occurring is
P(E) = 1 / 52
Exercise 4
The blood groups of 200 people are distributed as follows:
50 have type A blood,
65 have B blood type,
70 have O blood type and
15 have type AB blood.
If a person from this group is selected at random, what is the
probability that this person has O blood type?
Solution:
We construct a table of frequencies for the blood groups as follows:
Group   Frequency
A       50
B       65
O       70
AB      15
Total   200
We use the empirical formula of the probability
P(E) = Frequency for O blood / Total frequencies
= 70 / 200 = 0.35
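The frequency table maps directly onto a dictionary. The sketch below computes the empirical probability of every blood group, not only type O; the variable names are illustrative.

```python
# Observed frequencies from the exercise
frequency = {"A": 50, "B": 65, "O": 70, "AB": 15}
total = sum(frequency.values())  # 200 people

# Empirical probability of each group: frequency / total
probability = {group: count / total for group, count in frequency.items()}
print(probability["O"])
```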
Classwork 1
What is the probability of throwing one die and getting a number
greater than 4?
Classwork 2
A customer wants to buy a loaf of bread and a can. The shop has 30 loaves
of bread, including 5 from the previous day, and 20 cans with unreadable
expiration dates, of which one has expired. What is the probability that
the customer buys a fresh loaf and an unexpired can?
Classwork 3
What is the probability that, if we choose a group of three from 19 boys
and 12 girls, the group contains:
a) three boys
b) three girls
c) two boys and one girl?
Joint Probability
• The joint probability of events A and B, denoted P(A and B) or P(A ∩ B),
is the probability that events A and B both occur.
• If A and B are independent, P(A ∩ B) = P(A) · P(B).
• A and B are independent when the occurrence of A does not change the
probability of B, and vice versa.
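The product rule for independent events can be checked by enumeration. The sketch below uses two dice with two events chosen purely for illustration: A = "the first die is even" and B = "the second die is greater than 4".

```python
from fractions import Fraction
from itertools import product

# 36 equally likely outcomes of rolling two dice
S = list(product(range(1, 7), repeat=2))

def prob(event):
    """Classical probability: favourable outcomes / all outcomes."""
    return Fraction(sum(1 for outcome in S if event(outcome)), len(S))

def first_is_even(outcome):        # event A (illustrative)
    return outcome[0] % 2 == 0

def second_greater_than_4(outcome):  # event B (illustrative)
    return outcome[1] > 4

p_a = prob(first_is_even)
p_b = prob(second_greater_than_4)
p_a_and_b = prob(lambda o: first_is_even(o) and second_greater_than_4(o))

print(p_a, p_b, p_a_and_b)  # the joint probability equals p_a * p_b
```

The two events concern different dice, so neither affects the other and the joint probability factorises.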
Conditional Probability
• When A and B are not independent, it is often useful to compute the
conditional probability P(A|B).
• The probability of A given that B occurred: P(A|B) = P(A ∩ B) / P(B)
• Similarly, P(B|A) = P(A ∩ B) / P(A)
• The joint probability of A and B can then be written as
P(A ∩ B) = P(A) · P(B|A)
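A worked check of these formulas, as a sketch: two cards are drawn from a 52-card deck without putting the first one back, with A = "the first card is an ace" and B = "the second card is an ace". The events and variable names are chosen only for illustration.

```python
from fractions import Fraction
from itertools import permutations

ranks = ["A", "2", "3", "4", "5", "6", "7", "8", "9", "10", "J", "Q", "K"]
suits = ["spades", "hearts", "diamonds", "clubs"]
deck = [(rank, suit) for rank in ranks for suit in suits]

# Every ordered pair of distinct cards is equally likely
pairs = list(permutations(deck, 2))

n_a = sum(1 for first, _ in pairs if first[0] == "A")
n_a_and_b = sum(1 for first, second in pairs
                if first[0] == "A" and second[0] == "A")

p_a = Fraction(n_a, len(pairs))              # P(A)
p_a_and_b = Fraction(n_a_and_b, len(pairs))  # P(A ∩ B)
p_b_given_a = p_a_and_b / p_a                # P(B|A) = P(A ∩ B) / P(A)

print(p_a, p_b_given_a, p_a_and_b)
```

After one ace is removed, 3 aces remain among 51 cards, which is the value the enumeration returns for P(B|A).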
Bayes Theorem
• Used in Naïve Bayes Classifier
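Assuming the standard statement of Bayes' theorem, P(A|B) = P(B|A) · P(A) / P(B), the calculation can be sketched as below. The spam-filter numbers are made up for illustration and do not come from the slides.

```python
def bayes(p_b_given_a, p_a, p_b):
    """Bayes' theorem: P(A|B) = P(B|A) * P(A) / P(B)."""
    return p_b_given_a * p_a / p_b

# Hypothetical numbers: A = "message is spam", B = "message contains a given word"
p_spam = 0.2
p_word_given_spam = 0.6
p_word_given_not_spam = 0.05

# Total probability of seeing the word in any message
p_word = p_word_given_spam * p_spam + p_word_given_not_spam * (1 - p_spam)

p_spam_given_word = bayes(p_word_given_spam, p_spam, p_word)
print(p_spam_given_word)
```

A Naïve Bayes classifier applies the same update to several features at once, under the assumption that the features are independent given the class.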
Types of Events
• Independent
• Mutually Exclusive
Independent Events
• Two or more events are independent when the outcome of one has no
influence on the outcomes of the others.
Mutually Exclusive Events
• Two events are mutually exclusive if they cannot both occur in the same
trial, so P(A ∩ B) = 0.
• If two events are NOT independent, then we say that they are dependent.
• Sampling may be done with replacement or without replacement.
• With replacement: If each member of a population is replaced after it is
picked, then that member has the possibility of being chosen more than
once. When sampling is done with replacement, then events are
considered to be independent, meaning the result of the first pick will not
change the probabilities for the second pick.
• Without replacement: When sampling is done without replacement, each
member of a population may be chosen only once. In this case, the
probabilities for the second pick are affected by the result of the first pick.
The events are considered to be dependent or not independent.
Sampling with replacement
• Suppose you pick three cards with replacement. The first card you
pick out of the 52 cards is the Q of spades. You put this card back,
reshuffle the cards and pick a second card from the 52-card deck. It is
the ten of clubs. You put this card back, reshuffle the cards and pick a
third card from the 52-card deck. This time, the card is the Q of spades
again. Your picks are {Q of spades, ten of clubs, Q of spades}. You have
picked the Q of spades twice because each card is picked from the full
52-card deck.
Sampling without replacement
• Suppose you pick three cards without replacement. The first card you
pick out of the 52 cards is the K of hearts. You put this card aside and
pick the second card from the 51 cards remaining in the deck. It is the
three of diamonds. You put this card aside and pick the third card from
the remaining 50 cards in the deck. The third card is the J of spades.
Your picks are {K of hearts, three of diamonds, J of spades}. Because you
have picked the cards without replacement, you cannot pick the same card
twice.
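Both sampling schemes are available in Python's standard `random` module: `random.choices` samples with replacement and `random.sample` samples without replacement. A minimal sketch; the deck construction and the seed are illustrative choices.

```python
import random

random.seed(1)  # arbitrary seed, only for repeatability

ranks = ["A", "2", "3", "4", "5", "6", "7", "8", "9", "10", "J", "Q", "K"]
suits = ["spades", "hearts", "diamonds", "clubs"]
deck = [f"{rank} of {suit}" for suit in suits for rank in ranks]  # 52 cards

# With replacement: every pick is made from the full 52-card deck,
# so the same card can come up more than once.
with_replacement = random.choices(deck, k=3)

# Without replacement: a picked card is set aside, so all picks differ.
without_replacement = random.sample(deck, k=3)

print(with_replacement)
print(without_replacement)
```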
Probability Distribution
• A probability distribution is a list of all of the possible outcomes of a
random variable along with their corresponding probability values.
Discrete Probability Distribution
• A discrete random variable takes only separate, countable values. If 1
and 2 are outcomes of rolling a six-sided die, there is no outcome in
between (e.g. we can't roll 1.5).
• The function that gives the probability of each possible value is called
a probability mass function (pmf).
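A probability mass function for a fair six-sided die can be sketched as a dictionary from outcome to probability. The helper name `p` is an illustrative choice.

```python
from fractions import Fraction

# pmf of a fair six-sided die: each face has probability 1/6
pmf = {face: Fraction(1, 6) for face in range(1, 7)}

def p(x):
    """Probability of outcome x; values that cannot occur have probability 0."""
    return pmf.get(x, Fraction(0))

print(p(2), p(1.5), sum(pmf.values()))
```

The probabilities of all possible outcomes add up to 1, and an impossible value such as 1.5 gets probability 0.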
Continuous Probability Distribution
• Sometimes we are concerned with the probabilities of random
variables that have continuous outcomes.
• E.g., the height of an adult picked at random from a population or the
amount of time that a taxi driver has to wait before their next job.
• When we use a probability function to describe a continuous
probability distribution we call it a probability density function
(commonly abbreviated as pdf).
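For a continuous variable, the pdf gives a density, and probabilities come from the area under it. The sketch below assumes, purely for illustration, that adult heights follow a normal distribution with mean 170 cm and standard deviation 10 cm; both numbers are made up. It uses `statistics.NormalDist` from the standard library (Python 3.8 or later).

```python
from statistics import NormalDist

# Hypothetical adult heights in cm: mean 170, standard deviation 10
heights = NormalDist(mu=170, sigma=10)

# Value of the pdf at 170 cm: a density, not a probability
density_at_mean = heights.pdf(170)

# Probability of a height between 160 and 180 cm:
# the area under the pdf, computed from the cumulative distribution
p_between = heights.cdf(180) - heights.cdf(160)

print(density_at_mean, p_between)
```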
Central Limit Theorem
• The central limit theorem states that if you have a population with
mean μ and standard deviation σ and take sufficiently large random
samples from the population with replacement, then the distribution
of the sample means will be approximately normal.
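The theorem can be explored with a small simulation. The sketch below uses die rolls as the population (which is far from normal), draws repeated samples with replacement and looks at the sample means. The sample size, number of samples and seed are arbitrary choices; the comparison value σ/√n is the standard result for the spread of sample means.

```python
import random
import statistics

random.seed(1)  # arbitrary seed, only for repeatability

population = [1, 2, 3, 4, 5, 6]        # outcomes of a fair die
mu = statistics.mean(population)       # population mean
sigma = statistics.pstdev(population)  # population standard deviation

n = 50              # size of each sample
num_samples = 2000  # number of samples drawn

# Each sample is drawn with replacement; keep only its mean
sample_means = [
    statistics.mean(random.choices(population, k=n))
    for _ in range(num_samples)
]

mean_of_means = statistics.mean(sample_means)
sd_of_means = statistics.stdev(sample_means)
expected_sd = sigma / n ** 0.5  # sigma / sqrt(n)

print(mean_of_means, sd_of_means, expected_sd)
```

Plotting a histogram of `sample_means` should show the familiar bell shape, even though a single die roll is uniformly distributed.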
Normal Distribution
• Arises from the Central Limit Theorem as the approximate distribution of
sample means
• Also known as the bell curve
Case Study
Genetic Algorithm
Genetic algorithm is a search heuristic that is inspired by Charles
Darwin’s theory of natural evolution.
This algorithm reflects the process of natural selection where the fittest
individuals are selected for reproduction in order to produce offspring
of the next generation.
Phases of Genetic Algorithm
Initial population
Fitness function
Selection
Crossover
Mutation
Initial Population
The process begins with a set of individuals which is called a
Population. Each individual is a solution to the problem you want to
solve.
An individual is characterized by a set of parameters (variables) known
as Genes. Genes are joined into a string to form a Chromosome
(solution).
In a genetic algorithm, the set of genes of an individual is represented
using a string, in terms of an alphabet. Usually, binary values are used
(string of 1s and 0s). We say that we encode the genes in a
chromosome.
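A minimal sketch of a binary-encoded initial population. The sizes and the helper name `random_chromosome` are hypothetical and not taken from the linked Java code.

```python
import random

random.seed(1)  # arbitrary seed, only for repeatability

NUM_GENES = 5        # hypothetical chromosome length
POPULATION_SIZE = 4  # hypothetical number of individuals

def random_chromosome(num_genes):
    """One individual: a list of randomly chosen 0/1 genes."""
    return [random.randint(0, 1) for _ in range(num_genes)]

population = [random_chromosome(NUM_GENES) for _ in range(POPULATION_SIZE)]

for individual in population:
    print(individual)
```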
Fitness Function
The fitness function determines how fit an individual is (the ability of
an individual to compete with other individuals).
It gives a fitness score to each individual.
The probability that an individual will be selected for reproduction is
based on its fitness score.
Selection
The idea of the selection phase is to select the fittest individuals and let
them pass their genes to the next generation.
Pairs of individuals (parents) are selected based on their fitness
scores. Individuals with high fitness have a higher chance of being selected
for reproduction.
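One common way to make the selection probability depend on fitness is fitness-proportionate (roulette wheel) selection. The sketch below is one possible reading of the slide, not necessarily the author's method: the fitness function (number of 1 bits) and the example population are assumptions.

```python
import random

random.seed(1)  # arbitrary seed, only for repeatability

def fitness(chromosome):
    """Hypothetical fitness score: the number of 1 bits."""
    return sum(chromosome)

# Hypothetical population of four binary chromosomes
population = [
    [1, 1, 0, 1, 0],
    [0, 0, 0, 1, 0],
    [1, 1, 1, 1, 0],
    [0, 1, 0, 0, 0],
]

scores = [fitness(individual) for individual in population]
total = sum(scores)
selection_probabilities = [score / total for score in scores]

# Draw two parents; fitter individuals are more likely to be drawn.
# random.choices samples with replacement, so the same individual can be
# picked twice; a real implementation may want to prevent that.
parents = random.choices(population, weights=scores, k=2)

print(scores, selection_probabilities)
print(parents)
```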
Crossover
Crossover is the most significant phase in a genetic algorithm. For each
pair of parents to be mated, a crossover point is chosen at random
from within the genes.
For example, consider the crossover point to be 3.
• Offspring are created by exchanging the genes of the parents with each
other until the crossover point is reached.
• The new offspring A5 and A6 are added to the population.
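Single-point crossover as described above can be sketched as follows. The function name `crossover` and the all-zeros and all-ones parents are illustrative; the genes before the crossover point are exchanged and the rest are kept.

```python
import random

def crossover(parent_a, parent_b, point=None):
    """Swap the genes before `point` between two parents of equal length."""
    if point is None:
        # Pick the crossover point at random from within the genes
        point = random.randint(1, len(parent_a) - 1)
    child_a = parent_b[:point] + parent_a[point:]
    child_b = parent_a[:point] + parent_b[point:]
    return child_a, child_b

# Hypothetical parents, chosen so the exchanged genes are easy to see
parent_a = [0, 0, 0, 0, 0, 0]
parent_b = [1, 1, 1, 1, 1, 1]

# Crossover point 3, as in the example on the slide
a5, a6 = crossover(parent_a, parent_b, point=3)
print(a5, a6)
```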
Probability in crossover
• Choosing which chromosomes undergo crossover
• Choosing how the selected chromosomes are paired
• Choosing the part of the chromosome at which crossover happens
Mutation
• In some of the new offspring, genes can be subjected to mutation with a
low random probability.
• This implies that some of the bits in the bit string can be flipped.
Probability in mutation
• Choosing which chromosomes undergo mutation
• Choosing whether to perform mutation or not
• Choosing the part of the chromosome to mutate
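Bit-flip mutation is one simple way to realise these choices: every gene is flipped independently with a small probability. The function name `mutate` and the default rate are assumptions made for this sketch.

```python
import random

def mutate(chromosome, rate=0.01):
    """Flip each bit independently with probability `rate`."""
    return [1 - gene if random.random() < rate else gene
            for gene in chromosome]

random.seed(1)  # arbitrary seed, only for repeatability
offspring = [1, 1, 1, 0, 0, 0]
mutated = mutate(offspring, rate=0.1)  # rate raised so flips are more visible
print(offspring, mutated)
```

A rate of 0 leaves the chromosome unchanged and a rate of 1 flips every bit; practical values are usually small.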
Sample Java Code
https://github.com/ferdinjoe/Genetic-Algorithm
Probability usage in programming
# generate random floating point values
from random import seed
from random import random
# seed random number generator
seed(1)
# generate random numbers in the range [0, 1)
for _ in range(10):
    value = random()
    print(value)
Probability usage in programming
# generate random integer values
from random import seed
from random import randint
# seed random number generator
seed(1)
# generate some integers between 0 and 10 (both ends included)
for _ in range(10):
    value = randint(0, 10)
    print(value)
Probability usage in programming
# choose a random element from a list
from random import seed
from random import choice
# seed random number generator
seed(1)
# prepare a sequence
sequence = list(range(20))
print(sequence)
# make choices from the sequence
for _ in range(5):
    selection = choice(sequence)
    print(selection)
Probability usage in programming
# randomly shuffle a sequence
from random import seed
from random import shuffle
# seed random number generator
seed(1)
# prepare a sequence
sequence = list(range(20))
print(sequence)
# randomly shuffle the sequence in place
shuffle(sequence)
print(sequence)
Slides available at the link below:
www.slideshare.net/ferdinjoe
More topics recommended to learn
• Queueing Theory
• Statistics
• Numerical Methods
• Discrete Mathematics
• Optimization problems in Operations Research
Call Girls in Defence Colony Delhi 💯Call Us 🔝8264348440🔝Call Girls in Defence Colony Delhi 💯Call Us 🔝8264348440🔝
Call Girls in Defence Colony Delhi 💯Call Us 🔝8264348440🔝soniya singh
 
INTERNSHIP ON PURBASHA COMPOSITE TEX LTD
INTERNSHIP ON PURBASHA COMPOSITE TEX LTDINTERNSHIP ON PURBASHA COMPOSITE TEX LTD
INTERNSHIP ON PURBASHA COMPOSITE TEX LTDRafezzaman
 
Kantar AI Summit- Under Embargo till Wednesday, 24th April 2024, 4 PM, IST.pdf
Kantar AI Summit- Under Embargo till Wednesday, 24th April 2024, 4 PM, IST.pdfKantar AI Summit- Under Embargo till Wednesday, 24th April 2024, 4 PM, IST.pdf
Kantar AI Summit- Under Embargo till Wednesday, 24th April 2024, 4 PM, IST.pdfSocial Samosa
 
Data Science Jobs and Salaries Analysis.pptx
Data Science Jobs and Salaries Analysis.pptxData Science Jobs and Salaries Analysis.pptx
Data Science Jobs and Salaries Analysis.pptxFurkanTasci3
 
ASML's Taxonomy Adventure by Daniel Canter
ASML's Taxonomy Adventure by Daniel CanterASML's Taxonomy Adventure by Daniel Canter
ASML's Taxonomy Adventure by Daniel Cantervoginip
 
Amazon TQM (2) Amazon TQM (2)Amazon TQM (2).pptx
Amazon TQM (2) Amazon TQM (2)Amazon TQM (2).pptxAmazon TQM (2) Amazon TQM (2)Amazon TQM (2).pptx
Amazon TQM (2) Amazon TQM (2)Amazon TQM (2).pptxAbdelrhman abooda
 
VIP High Class Call Girls Jamshedpur Anushka 8250192130 Independent Escort Se...
VIP High Class Call Girls Jamshedpur Anushka 8250192130 Independent Escort Se...VIP High Class Call Girls Jamshedpur Anushka 8250192130 Independent Escort Se...
VIP High Class Call Girls Jamshedpur Anushka 8250192130 Independent Escort Se...Suhani Kapoor
 
Industrialised data - the key to AI success.pdf
Industrialised data - the key to AI success.pdfIndustrialised data - the key to AI success.pdf
Industrialised data - the key to AI success.pdfLars Albertsson
 

Recently uploaded (20)

Call Girls In Dwarka 9654467111 Escorts Service
Call Girls In Dwarka 9654467111 Escorts ServiceCall Girls In Dwarka 9654467111 Escorts Service
Call Girls In Dwarka 9654467111 Escorts Service
 
RadioAdProWritingCinderellabyButleri.pdf
RadioAdProWritingCinderellabyButleri.pdfRadioAdProWritingCinderellabyButleri.pdf
RadioAdProWritingCinderellabyButleri.pdf
 
专业一比一美国俄亥俄大学毕业证成绩单pdf电子版制作修改
专业一比一美国俄亥俄大学毕业证成绩单pdf电子版制作修改专业一比一美国俄亥俄大学毕业证成绩单pdf电子版制作修改
专业一比一美国俄亥俄大学毕业证成绩单pdf电子版制作修改
 
From idea to production in a day – Leveraging Azure ML and Streamlit to build...
From idea to production in a day – Leveraging Azure ML and Streamlit to build...From idea to production in a day – Leveraging Azure ML and Streamlit to build...
From idea to production in a day – Leveraging Azure ML and Streamlit to build...
 
How we prevented account sharing with MFA
How we prevented account sharing with MFAHow we prevented account sharing with MFA
How we prevented account sharing with MFA
 
vip Sarai Rohilla Call Girls 9999965857 Call or WhatsApp Now Book
vip Sarai Rohilla Call Girls 9999965857 Call or WhatsApp Now Bookvip Sarai Rohilla Call Girls 9999965857 Call or WhatsApp Now Book
vip Sarai Rohilla Call Girls 9999965857 Call or WhatsApp Now Book
 
PKS-TGC-1084-630 - Stage 1 Proposal.pptx
PKS-TGC-1084-630 - Stage 1 Proposal.pptxPKS-TGC-1084-630 - Stage 1 Proposal.pptx
PKS-TGC-1084-630 - Stage 1 Proposal.pptx
 
20240419 - Measurecamp Amsterdam - SAM.pdf
20240419 - Measurecamp Amsterdam - SAM.pdf20240419 - Measurecamp Amsterdam - SAM.pdf
20240419 - Measurecamp Amsterdam - SAM.pdf
 
Deep Generative Learning for All - The Gen AI Hype (Spring 2024)
Deep Generative Learning for All - The Gen AI Hype (Spring 2024)Deep Generative Learning for All - The Gen AI Hype (Spring 2024)
Deep Generative Learning for All - The Gen AI Hype (Spring 2024)
 
High Class Call Girls Noida Sector 39 Aarushi 🔝8264348440🔝 Independent Escort...
High Class Call Girls Noida Sector 39 Aarushi 🔝8264348440🔝 Independent Escort...High Class Call Girls Noida Sector 39 Aarushi 🔝8264348440🔝 Independent Escort...
High Class Call Girls Noida Sector 39 Aarushi 🔝8264348440🔝 Independent Escort...
 
GA4 Without Cookies [Measure Camp AMS]
GA4 Without Cookies [Measure Camp AMS]GA4 Without Cookies [Measure Camp AMS]
GA4 Without Cookies [Measure Camp AMS]
 
Call Girls in Defence Colony Delhi 💯Call Us 🔝8264348440🔝
Call Girls in Defence Colony Delhi 💯Call Us 🔝8264348440🔝Call Girls in Defence Colony Delhi 💯Call Us 🔝8264348440🔝
Call Girls in Defence Colony Delhi 💯Call Us 🔝8264348440🔝
 
INTERNSHIP ON PURBASHA COMPOSITE TEX LTD
INTERNSHIP ON PURBASHA COMPOSITE TEX LTDINTERNSHIP ON PURBASHA COMPOSITE TEX LTD
INTERNSHIP ON PURBASHA COMPOSITE TEX LTD
 
E-Commerce Order PredictionShraddha Kamble.pptx
E-Commerce Order PredictionShraddha Kamble.pptxE-Commerce Order PredictionShraddha Kamble.pptx
E-Commerce Order PredictionShraddha Kamble.pptx
 
Kantar AI Summit- Under Embargo till Wednesday, 24th April 2024, 4 PM, IST.pdf
Kantar AI Summit- Under Embargo till Wednesday, 24th April 2024, 4 PM, IST.pdfKantar AI Summit- Under Embargo till Wednesday, 24th April 2024, 4 PM, IST.pdf
Kantar AI Summit- Under Embargo till Wednesday, 24th April 2024, 4 PM, IST.pdf
 
Data Science Jobs and Salaries Analysis.pptx
Data Science Jobs and Salaries Analysis.pptxData Science Jobs and Salaries Analysis.pptx
Data Science Jobs and Salaries Analysis.pptx
 
ASML's Taxonomy Adventure by Daniel Canter
ASML's Taxonomy Adventure by Daniel CanterASML's Taxonomy Adventure by Daniel Canter
ASML's Taxonomy Adventure by Daniel Canter
 
Amazon TQM (2) Amazon TQM (2)Amazon TQM (2).pptx
Amazon TQM (2) Amazon TQM (2)Amazon TQM (2).pptxAmazon TQM (2) Amazon TQM (2)Amazon TQM (2).pptx
Amazon TQM (2) Amazon TQM (2)Amazon TQM (2).pptx
 
VIP High Class Call Girls Jamshedpur Anushka 8250192130 Independent Escort Se...
VIP High Class Call Girls Jamshedpur Anushka 8250192130 Independent Escort Se...VIP High Class Call Girls Jamshedpur Anushka 8250192130 Independent Escort Se...
VIP High Class Call Girls Jamshedpur Anushka 8250192130 Independent Escort Se...
 
Industrialised data - the key to AI success.pdf
Industrialised data - the key to AI success.pdfIndustrialised data - the key to AI success.pdf
Industrialised data - the key to AI success.pdf
 

Probability Theory for Data Scientists

  • 1. Probability for Data Scientists Dr. Ferdin Joe John Joseph
  • 2. Machine Learning: Machine Learning is an interdisciplinary field in Data Science that uses statistics, probability and algorithms to learn from data and provide insights which can be used to build intelligent applications.
  • 8. Probability for Data Science • Probability deals with predicting the likelihood of future events, while statistics involves the analysis of the frequency of past events.
  • 9. Terminologies • Event • Random Variable • Empirical Probability • Theoretical Probability • Joint Probability • Conditional Probability
  • 10. Event • An event is a set of outcomes of an experiment to which a probability is assigned. • E represents the event. • P(E) is the probability that the event E occurs. • A situation where E might happen (success) or might not happen (failure) is called a trial.
  • 13. Event • Pulling a colored ball out of a bag
  • 14. Random Variable • The variable that represents the outcome of an event is called a random variable. • E.g. getting a head or a tail when tossing a coin
  • 15. Random variable in tossing a coin • If we toss a coin, the chances of getting a head or a tail are 50-50. • The probability of getting a head (or a tail) is ½ or 50%. • A probability always lies between 0 and 1.
  • 16. Empirical Probability • Also known as practical probability • It is the number of times the event occurs divided by the total number of trials observed. • If we observe s successes in n trials, the probability of success is s/n. • Toss a coin 4 times. The outcome is H, H, H, T • P(Head) = 3/4 = 0.75 • P(Tail) = 1/4 = 0.25
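The s/n rule for empirical probability can be sketched in a few lines of Python. The helper name `empirical_probability` and the 10,000 simulated tosses are assumptions made for illustration; only the four tosses H, H, H, T come from the slide.

```python
from random import seed, choice


def empirical_probability(outcomes, event):
    """Return s/n: the share of observed outcomes equal to `event`."""
    return outcomes.count(event) / len(outcomes)


# The four tosses from the slide: H, H, H, T
tosses = ["H", "H", "H", "T"]
p_head = empirical_probability(tosses, "H")  # 3 successes in 4 trials
p_tail = empirical_probability(tosses, "T")  # 1 success in 4 trials

# Assumption for illustration: simulate many tosses of a fair coin.
# The empirical estimate should then move toward the theoretical 1/2.
seed(1)
many_tosses = [choice("HT") for _ in range(10_000)]
p_head_large = empirical_probability(many_tosses, "H")

print(p_head, p_tail, p_head_large)
```

With only four tosses the estimate can be far from 1/2, which is why empirical and theoretical probability are treated as separate ideas.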
  • 17. Theoretical probability • The number of ways the particular event can occur divided by the total number of possible outcomes. • A head can occur in one way and there are two possible outcomes (head, tail), so the true (theoretical) probability of a head is 1/2.
  • 18. Exercise 1: A die is rolled. Find the probability that an even number is obtained.
  • 19. Exercise 1 (solution): Let us first write the sample space S of the experiment: S = {1,2,3,4,5,6}. Let E be the event "an even number is obtained": E = {2,4,6}. We now use the formula of classical probability: P(E) = n(E) / n(S) = 3 / 6 = 1 / 2
  • 20. Exercise 2: Two coins are tossed. Find the probability that two heads are obtained. Note: Each coin has two possible outcomes, H (heads) and T (tails).
  • 21. Exercise 2 (solution): The sample space S is given by S = {(H,T),(H,H),(T,H),(T,T)}. Let E be the event "two heads are obtained": E = {(H,H)}. We use the formula of classical probability: P(E) = n(E) / n(S) = 1 / 4
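Exercises 1 and 2 can be checked by listing the sample space and counting. This is a minimal sketch that assumes all outcomes are equally likely; the helper name `classical_probability` is hypothetical.

```python
from fractions import Fraction
from itertools import product


def classical_probability(event, sample_space):
    """P(E) = n(E) / n(S), assuming equally likely outcomes."""
    return Fraction(len(event), len(sample_space))


# Exercise 1: an even number on one die
die = {1, 2, 3, 4, 5, 6}
even = {face for face in die if face % 2 == 0}
p_even = classical_probability(even, die)

# Exercise 2: two heads when two coins are tossed
two_coins = set(product("HT", repeat=2))  # (H,H), (H,T), (T,H), (T,T)
two_heads = {("H", "H")}
p_two_heads = classical_probability(two_heads, two_coins)

print(p_even, p_two_heads)
```

`Fraction` keeps the results exact, so they can be compared directly with the hand-worked answers.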
  • 22. Exercise 3: A card is drawn at random from a deck of cards. Find the probability of getting the 3 of diamonds.
  • 23. Exercise 3 (solution): The sample space S of the experiment is the full deck of cards, shown as a figure on the slide.
  • 25. Exercise 3 (solution, continued): Let E be the event "getting the 3 of diamonds". An examination of the sample space shows that there is one 3 of diamonds, so n(E) = 1 and n(S) = 52. Hence the probability of event E occurring is P(E) = 1 / 52
  • 26. Exercise 4: The blood groups of 200 people are distributed as follows: 50 have type A blood, 65 have type B, 70 have type O and 15 have type AB. If a person from this group is selected at random, what is the probability that this person has type O blood?
  • 27. Exercise 4 (solution): We construct a table of frequencies for the blood groups as follows:
      Group   Frequency
      A       50
      B       65
      O       70
      AB      15
    We use the empirical formula of probability: P(E) = frequency for type O blood / total frequency = 70 / 200 = 0.35
  • 28. Classwork 1: What is the probability of rolling one die and getting a number greater than 4?
  • 29. Classwork 2: A customer wants to buy a loaf of bread and a can. There are 30 loaves of bread in the shop, including 5 from the previous day, and 20 cans with unreadable expiration dates, of which one has expired. What is the probability that the customer will buy a fresh loaf and a can that has not expired?
  • 30. Classwork 3: What is the probability that if we choose a group of three from 19 boys and 12 girls, we will have: a) three boys b) three girls c) two boys and one girl?
  • 31. Joint Probability • The probability of events A and B, denoted by P(A and B) or P(A ∩ B), is the probability that events A and B both occur. • P(A ∩ B) = P(A) · P(B) • This only applies if A and B are independent, which means that if A occurred, that doesn’t change the probability of B, and vice versa.
  • 32. Conditional Probability • When A and B are not independent, it is often useful to compute the conditional probability P(A|B). • The probability of A given that B occurred: P(A|B) = P(A ∩ B) / P(B) • Similarly, P(B|A) = P(A ∩ B) / P(A)
  • 33. The joint probability of A and B can be written as P(A ∩ B) = P(A) · P(B|A)
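The joint and conditional probability formulas can be tried on a single die roll. The two events used here (A = "even number", B = "number greater than 3") are assumptions chosen for illustration and do not appear in the slides.

```python
from fractions import Fraction

S = {1, 2, 3, 4, 5, 6}   # sample space of one die roll
A = {2, 4, 6}            # illustrative event: an even number
B = {4, 5, 6}            # illustrative event: a number greater than 3


def prob(event):
    """Classical probability of an event within the sample space S."""
    return Fraction(len(event), len(S))


p_joint = prob(A & B)             # P(A ∩ B), counted directly: {4, 6}
p_a_given_b = p_joint / prob(B)   # P(A|B) = P(A ∩ B) / P(B)
p_b_given_a = p_joint / prob(A)   # P(B|A) = P(A ∩ B) / P(A)

# A and B are dependent here, so P(A) · P(B) does not give the joint
# probability, while P(A) · P(B|A) does.
independent_product = prob(A) * prob(B)
chain_product = prob(A) * p_b_given_a

print(p_joint, p_a_given_b, independent_product, chain_product)
```

Because the two events overlap more than chance alone would suggest, the independence shortcut gives 1/4 while the true joint probability is 1/3.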
  • 35. Bayes Theorem • Used in the Naïve Bayes classifier
  • 37. Types of Events • Independent • Mutually Exclusive
  • 38. Independent Events • Two or more events are independent when none of them has any influence on the outcome of the others.
  • 39. Mutually Exclusive Events • If two events are NOT independent, then we say that they are dependent. • Sampling may be done with replacement or without replacement. • With replacement: If each member of a population is replaced after it is picked, then that member has the possibility of being chosen more than once. When sampling is done with replacement, the events are considered to be independent, meaning the result of the first pick will not change the probabilities for the second pick. • Without replacement: When sampling is done without replacement, each member of a population may be chosen only once. In this case, the probabilities for the second pick are affected by the result of the first pick. The events are considered to be dependent, or not independent.
  • 40. Sampling with replacement • Suppose you pick three cards with replacement. The first card you pick out of the 52 cards is the Q of spades. You put this card back, reshuffle the cards and pick a second card from the 52-card deck. It is the ten of clubs. You put this card back, reshuffle the cards and pick a third card from the 52-card deck. This time, the card is the Q of spades again. Your picks are {Q of spades, ten of clubs, Q of spades}. You have picked the Q of spades twice. Each card is picked from the full 52-card deck.
  • 41. Sampling without replacement • Suppose you pick three cards without replacement. The first card you pick out of the 52 cards is the K of hearts. You put this card aside and pick the second card from the 51 cards remaining in the deck. It is the three of diamonds. You put this card aside and pick the third card from the remaining 50 cards in the deck. The third card is the J of spades. Your picks are {K of hearts, three of diamonds, J of spades}. Because you have picked the cards without replacement, you cannot pick the same card twice.
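The two sampling schemes map onto two functions in Python's standard `random` module: `choices` draws with replacement and `sample` draws without. The deck construction below is an assumption for illustration, and the cards drawn depend on the seed.

```python
from random import seed, choices, sample

# Build a 52-card deck (the rank and suit labels are illustrative).
ranks = ["A", "2", "3", "4", "5", "6", "7", "8", "9", "10", "J", "Q", "K"]
suits = ["spades", "hearts", "diamonds", "clubs"]
deck = [f"{rank} of {suit}" for suit in suits for rank in ranks]

seed(1)

# With replacement: every pick is made from all 52 cards,
# so the same card can appear more than once.
with_replacement = choices(deck, k=3)

# Without replacement: a picked card is set aside,
# so the three cards are always distinct.
without_replacement = sample(deck, k=3)

print(with_replacement)
print(without_replacement)
```

Repeats under `choices` are possible but not guaranteed in any single run; `sample` can never return the same card twice.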
  • 42. Probability Distribution • A probability distribution is a list of all of the possible outcomes of a random variable along with their corresponding probability values.
  • 43. Discrete Probability Distribution • If we consider 1 and 2 as outcomes of rolling a six-sided die, then we can’t have an outcome in between them (e.g. we can’t have an outcome of 1.5). • The function that gives the probability of each discrete outcome is called a probability mass function.
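A probability mass function for a fair six-sided die can be written as a small lookup table. This is a sketch under the assumption that the die is fair; the names `pmf` and `probability` are hypothetical.

```python
from fractions import Fraction

# Assumption: a fair die, so each face has probability 1/6.
faces = [1, 2, 3, 4, 5, 6]
pmf = {face: Fraction(1, 6) for face in faces}


def probability(outcome):
    """Look up P(X = outcome); outcomes that cannot occur have probability 0."""
    return pmf.get(outcome, Fraction(0))


total = sum(pmf.values())  # the probabilities of all outcomes add up to 1

print(probability(2), probability(1.5), total)
```

The lookup for 1.5 returns 0, which mirrors the point that a discrete distribution has no outcomes between the listed values.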
  • 44. Continuous Probability Distribution • Sometimes we are concerned with the probabilities of random variables that have continuous outcomes. • E.g. the height of an adult picked at random from a population, or the amount of time that a taxi driver has to wait before their next job. • When we use a probability function to describe a continuous probability distribution, we call it a probability density function (commonly abbreviated as pdf).
  • 45. Central Limit Theorem • The central limit theorem states that if you have a population with mean μ and standard deviation σ and take sufficiently large random samples from the population with replacement, then the distribution of the sample means will be approximately normally distributed.
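The central limit theorem can be explored with a small simulation. The population (the faces of a fair die), the sample size of 50 and the 2,000 repetitions are all assumptions chosen for illustration.

```python
from random import seed, choices
from statistics import mean, stdev

seed(1)

# Illustrative population: faces of a fair die, with mean 3.5
# and standard deviation of about 1.71.
population = [1, 2, 3, 4, 5, 6]
sample_size = 50
repetitions = 2000

# Draw many samples with replacement and record each sample mean.
sample_means = [
    mean(choices(population, k=sample_size)) for _ in range(repetitions)
]

center = mean(sample_means)   # should be close to the population mean, 3.5
spread = stdev(sample_means)  # should be close to 1.71 / sqrt(50), about 0.24

print(round(center, 3), round(spread, 3))
```

Plotting a histogram of `sample_means` would show the bell shape, even though the die itself has a flat distribution.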
  • 47. Normal Distribution • Uses the Central Limit Theorem • Known as the bell curve
  • 50. Genetic Algorithm: A genetic algorithm is a search heuristic that is inspired by Charles Darwin’s theory of natural evolution. This algorithm reflects the process of natural selection, where the fittest individuals are selected for reproduction in order to produce offspring of the next generation.
  • 52. Phases of Genetic Algorithm • Initial population • Fitness function • Selection • Crossover • Mutation
  • 53. Initial Population: The process begins with a set of individuals which is called a Population. Each individual is a solution to the problem you want to solve. An individual is characterized by a set of parameters (variables) known as Genes. Genes are joined into a string to form a Chromosome (solution). In a genetic algorithm, the set of genes of an individual is represented using a string, in terms of an alphabet. Usually, binary values are used (a string of 1s and 0s). We say that we encode the genes in a chromosome.
  • 55. Fitness Function: The fitness function determines how fit an individual is (the ability of an individual to compete with other individuals). It gives a fitness score to each individual. The probability that an individual will be selected for reproduction is based on its fitness score.
  • 56. Selection: The idea of the selection phase is to select the fittest individuals and let them pass their genes to the next generation. Two pairs of individuals (parents) are selected based on their fitness scores. Individuals with high fitness have more chance of being selected for reproduction.
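One common way to make the chance of selection depend on fitness is fitness-proportionate ("roulette wheel") selection. The slides do not say which selection scheme is intended, so this is only a sketch; the fitness function (number of 1 bits) and the example population are hypothetical.

```python
from random import seed, choices


def fitness(chromosome):
    """Hypothetical fitness score: the number of 1 bits in the chromosome."""
    return chromosome.count("1")


def select_parents(population, k=2):
    """Pick k parents with probability proportional to their fitness."""
    weights = [fitness(chromosome) for chromosome in population]
    return choices(population, weights=weights, k=k)


seed(1)
population = ["110101", "000010", "111111", "100000"]  # illustrative chromosomes
parents = select_parents(population)

print(parents)
```

`choices` samples with replacement, so the same individual can be drawn twice; a fuller implementation might redraw in that case.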
  • 57. Crossover: Crossover is the most significant phase in a genetic algorithm. For each pair of parents to be mated, a crossover point is chosen at random from within the genes. For example, the crossover point might be 3.
  • 58. Crossover • Offspring are created by exchanging the genes of the parents among themselves until the crossover point is reached. • The new offspring A5 and A6 are added to the population.
  • 59. Probability in crossover • Choosing which chromosome undergoes crossover • Choosing the pair to perform crossover • Choosing the part of the chromosome to cross over
  • 60. Mutation • In some of the new offspring, some genes can be subjected to a mutation with a low random probability. • This implies that some of the bits in the bit string can be flipped.
  • 61. Probability in mutation • Choosing which chromosome undergoes mutation • Choosing whether to perform mutation or not • Choosing the part of the chromosome to mutate
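The random choices in crossover and mutation can be sketched on bit strings. The parent chromosomes, the mutation rate of 0.05 and the function names are assumptions for illustration; only the offspring names A5 and A6 come from the slides.

```python
from random import seed, randint, random


def crossover(parent_a, parent_b):
    """Single-point crossover: swap the genes that come before a random point."""
    point = randint(1, len(parent_a) - 1)  # the crossover point is chosen at random
    child_a = parent_b[:point] + parent_a[point:]
    child_b = parent_a[:point] + parent_b[point:]
    return child_a, child_b


def mutate(chromosome, rate=0.05):
    """Flip each bit independently with a low probability `rate`."""
    return "".join(
        ("1" if bit == "0" else "0") if random() < rate else bit
        for bit in chromosome
    )


seed(1)
a1, a2 = "000000", "111111"   # illustrative parents
a5, a6 = crossover(a1, a2)    # offspring, named as in the slides
a5, a6 = mutate(a5), mutate(a6)

print(a5, a6)
```

Probability enters in two places: `randint` picks where the chromosomes are cut, and `random() < rate` decides bit by bit whether a mutation happens.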
  • 63. Probability usage in programming
  • 64. Probability usage in programming
      # generate random floating point values
      from random import seed
      from random import random
      # seed random number generator
      seed(1)
      # generate random numbers between 0-1
      for _ in range(10):
          value = random()
          print(value)
  • 65. Probability usage in programming
      # generate random integer values
      from random import seed
      from random import randint
      # seed random number generator
      seed(1)
      # generate some integers
      for _ in range(10):
          value = randint(0, 10)
          print(value)
  • 66. Probability usage in programming
      # choose a random element from a list
      from random import seed
      from random import choice
      # seed random number generator
      seed(1)
      # prepare a sequence
      sequence = [i for i in range(20)]
      print(sequence)
      # make choices from the sequence
      for _ in range(5):
          selection = choice(sequence)
          print(selection)
  • 67. Probability usage in programming
      # randomly shuffle a sequence
      from random import seed
      from random import shuffle
      # seed random number generator
      seed(1)
      # prepare a sequence
      sequence = [i for i in range(20)]
      print(sequence)
      # randomly shuffle the sequence
      shuffle(sequence)
      print(sequence)
  • 68. Slides available at the link below: www.slideshare.net/ferdinjoe
  • 69. More topics recommended to learn • Queueing Theory • Statistics • Numerical Methods • Discrete Mathematics • Optimization problems in Operations Research