1. Introduction to Probability

In some areas, such as mathematics or logic, results of some process can be known with certainty (e.g., 2 + 3 = 5). Most real-life situations, however, involve variability and uncertainty. For example, it is uncertain whether it will rain tomorrow; the price of a given stock a week from today is uncertain; the number of claims that a car insurance policy holder will make over a one-year period is uncertain. Uncertainty or "randomness" (meaning variability of results) is usually due to some mixture of two factors: (1) variability in populations consisting of animate or inanimate objects (e.g., people vary in size, weight, blood type, etc.), and (2) variability in processes or phenomena (e.g., the random selection of 6 numbers from 49 in a lottery draw can lead to a very large number of different outcomes; stock or currency prices fluctuate substantially over time).

Variability and uncertainty make it more difficult to plan or to make decisions. Although they cannot usually be eliminated, it is possible to describe and to deal with variability and uncertainty by using the theory of probability. This course develops both the theory and applications of probability.

It seems logical to begin by defining probability. People have attempted to do this by giving definitions that reflect the uncertainty about whether some specified outcome or "event" will occur in a given setting. The setting is often termed an "experiment" or "process" for the sake of discussion. To take a simple "toy" example: it is uncertain whether the number 2 will turn up when a 6-sided die is rolled. It is similarly uncertain whether the Canadian dollar will be higher tomorrow, relative to the U.S. dollar, than it is today. Three approaches to defining probability are:

The classical definition: Let the sample space (denoted by S) be the set of all possible distinct outcomes to an experiment. The probability of some event is

(number of ways the event can occur) / (number of outcomes in S),

provided all points in S are equally likely. For example, when a die is rolled the probability of getting a 2 is 1/6 because one of the six faces is a 2.

The relative frequency definition: The probability of an event is the proportion (or fraction) of times the event occurs in a very long (theoretically infinite) series of repetitions of an experiment or process. For example, this definition could be used to argue that the probability of getting a 2 from a rolled die is 1/6.
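The relative frequency idea can be illustrated with a short simulation. This is only a sketch: it assumes a fair die modelled by Python's pseudo-random generator, and the function name `relative_frequency` and the fixed seed are choices made here for illustration, not part of the notes.

```python
import random

def relative_frequency(target, n_rolls, seed=0):
    """Fraction of n_rolls simulated fair-die rolls that show `target`."""
    rng = random.Random(seed)  # fixed seed (an assumption) so runs are repeatable
    hits = sum(1 for _ in range(n_rolls) if rng.randint(1, 6) == target)
    return hits / n_rolls

# The proportion of 2s should drift toward 1/6 (about 0.1667) as n grows.
for n in (100, 10_000, 100_000):
    print(n, relative_frequency(2, n))
```

A finite run only ever approximates 1/6, which is exactly the limitation of this definition discussed in the text.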
The subjective probability definition: The probability of an event is a measure of how sure the person making the statement is that the event will happen. For example, after considering all available data, a weather forecaster might say that the probability of rain today is 30% or 0.3.

Unfortunately, all three of these definitions have serious limitations.

Classical Definition:
What does "equally likely" mean? This appears to use the concept of probability while trying to define it! We could remove the phrase "provided all outcomes are equally likely", but then the definition would clearly be unusable in many settings where the outcomes in S do not tend to occur equally often.

Relative Frequency Definition:
Since we can never repeat an experiment or process indefinitely, we can never know the probability of any event from the relative frequency definition. In many cases we cannot even obtain a long series of repetitions due to time, cost, or other limitations. For example, the probability of rain today cannot really be obtained by the relative frequency definition since today cannot be repeated.
Subjective Probability:
This definition gives no rational basis for people to agree on a right answer. There is some controversy about when, if ever, to use subjective probability except for personal decision-making. It will not be used in Stat 230.

These difficulties can be overcome by treating probability as a mathematical system defined by a set of axioms. In this case we do not worry about the numerical values of probabilities until we consider a specific application. This is consistent with the way that other branches of mathematics are defined and then used in specific applications (e.g., the way calculus and real-valued functions are used to model and describe the physics of gravity and motion).

The mathematical approach that we will develop and use in the remaining chapters assumes the following:
- probabilities are numbers between 0 and 1 that apply to outcomes, termed "events";
- each event may or may not occur in a given setting.

Chapter 2 begins by specifying the mathematical framework for probability in more detail.
Exercises

1.1 Try to think of examples of probabilities you have encountered which might have been obtained by each of the three "definitions".

1.2 Which definitions do you think could be used for obtaining the following probabilities?
(a) You have a claim on your car insurance in the next year.
(b) There is a meltdown at a nuclear power plant during the next 5 years.
(c) A person's birthday is in April.

1.3 Give examples of how probability applies to each of the following areas.
(a) Lottery draws
(b) Auditing of expense items in a financial statement
(c) Disease transmission (e.g. measles, tuberculosis, STDs)
(d) Public opinion polls

2. Mathematical Probability Models

Sample Spaces and Probability

Consider some phenomenon or process which is repeatable, at least in theory, and suppose that certain events (outcomes) are defined. We will often term the phenomenon or process an "experiment" and refer to a single repetition of the experiment as a "trial". Then the probability of an event A, denoted P(A), is a number between 0 and 1.

If probability is to be a useful mathematical concept, it should possess some other properties. For example, if our "experiment" consists of tossing a coin with two sides, Head and Tail, then we might wish to consider the events A_1 = "Head turns up" and A_2 = "Tail turns up". It would clearly not be desirable to allow, say, P(A_1) = 0.6 and P(A_2) = 0.6, so that P(A_1) + P(A_2) > 1. (Think about why this is so.) To avoid this sort of thing we begin with the following definition.

Definition: A sample space S is a set of distinct outcomes for an experiment or process, with the property that in a single trial, one and only one of these outcomes occurs. The outcomes that make up the sample space are called sample points.

A sample space is part of the probability model in a given setting. It is not necessarily unique, as the following example shows.

Example: Roll a 6-sided die, and define the events
a_i = number i turns up (i = 1, 2, 3, 4, 5, 6).
Then we could take the sample space as S = {a_1, a_2, a_3, a_4, a_5, a_6}. However, we could also define events
E = even number turns up
O = odd number turns up
and take S = {E, O}. Both sample spaces satisfy the definition, and which one we use would depend on what we wanted to use the probability model for. In most cases we would use the first sample space.
Sample spaces may be either discrete or non-discrete; S is discrete if it consists of a finite or countably infinite set of simple events. The two sample spaces in the preceding example are discrete. A sample space consisting of all the positive integers is also discrete, but a sample space consisting of all positive real numbers is not. For the next few chapters we consider only discrete sample spaces. This makes it easier to define mathematical probability, as follows.

Definition: Let S = {a_1, a_2, a_3, ...} be a discrete sample space. Then probabilities P(a_i) are numbers attached to the a_i's (i = 1, 2, 3, ...) such that the following two conditions hold:
(1) 0 ≤ P(a_i) ≤ 1 for every i;
(2) the sum of P(a_i) over all i equals 1.
The set of values {P(a_i), i = 1, 2, 3, ...} is called a probability distribution on S.

Definition: An event in a discrete sample space is a subset A of S. If the event contains only one point, e.g. A_1 = {a_1}, we call it a simple event. An event A made up of two or more simple events, such as A = {a_1, a_2}, is called a compound event.

Our notation will often not distinguish between the point a_i and the simple event A_i = {a_i} which has this point as its only element, although they differ as mathematical objects. Condition (2) in the definition above reflects the idea that when the process or experiment happens, some event in S must occur (see the definition of sample space). The probability of a more general event A (not necessarily a simple event) is then defined as follows:

Definition: The probability P(A) of an event A is the sum of the probabilities for all the simple events that make up A.
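As a minimal sketch of these definitions, the following checks that a set of numbers forms a probability distribution (each value between 0 and 1, all values summing to 1) and computes the probability of an event by summing over its simple events. The fair six-sided die and the helper names `is_distribution` and `prob` are assumptions made here for illustration.

```python
from fractions import Fraction

def is_distribution(probs):
    """True if every value lies in [0, 1] and the values sum to exactly 1."""
    return all(0 <= p <= 1 for p in probs.values()) and sum(probs.values()) == 1

def prob(event, probs):
    """P(A): add the probabilities of the simple events that make up A."""
    return sum(probs[a] for a in event)

# Assumed model: a fair six-sided die, each face with probability 1/6.
die = {face: Fraction(1, 6) for face in range(1, 7)}
even = {2, 4, 6}  # a compound event

print(is_distribution(die))  # True
print(prob(even, die))       # 1/2
```

Exact fractions are used instead of floats so that the sum-to-one check is not disturbed by rounding.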
For example, the probability of the compound event A = {a_1, a_2, a_3} is P(a_1) + P(a_2) + P(a_3). The definition of probability does not say what numbers to assign to the simple events for a given setting, only what properties the numbers must possess. In an actual situation, we try to specify numerical values that make the model useful; this usually means that we try to specify numbers that are consistent with one or more of the empirical "definitions" of Chapter 1.

Example: Suppose a 6-sided die is rolled, and let the sample space be S = {1, 2, 3, 4, 5, 6}, where 1 means the number 1 occurs, and so on. If the die is an ordinary one, we would find it useful to define probabilities as
P(i) = 1/6 for i = 1, 2, 3, 4, 5, 6,
because if the die were tossed repeatedly (as in some games or gambling situations) then each number would occur close to 1/6 of the time. However, if the die were weighted in some way, these numerical values would not be so useful.

Note that if we wish to consider some compound event, the probability is easily obtained. For example, if A = "even number" then because A = {2, 4, 6} we get P(A) = P(2) + P(4) + P(6) = 1/2.

We now consider some additional examples, starting with some simple "toy" problems involving cards, coins and dice and then considering a more scientific example.

Remember that in using probability we are actually constructing mathematical models. We can approach a given problem by a series of three steps:
(1) Specify a sample space S.
(2) Assign numerical probabilities to the simple events in S.
(3) For any compound event A, find P(A) by adding the probabilities of all the simple events that make up A.

Many probability problems are stated as "Find the probability that ...". To solve the problem you should then carry out step (2) above by assigning probabilities that reflect long run relative frequencies of occurrence of the simple events in repeated trials, if possible.

Some Examples

When S has few points, one of the easiest methods for finding the probability of an event is to list all outcomes. In many problems a sample space with equally probable simple events can be used, and the first few examples are of this type.

Example: Draw 1 card from a standard well-shuffled deck (13 cards of each of 4 suits: spades, hearts, diamonds, clubs). Find the probability the card is a club.

Solution 1: Let S = {spade, heart, diamond, club}. (The points of S are generally listed between brackets {}.) Then S has 4 points, with 1 of them being "club", so P(club) = 1/4.
Solution 2: Let S = {each of the 52 cards}. Then 13 of the 52 cards are clubs, so P(club) = 13/52 = 1/4.

Note 1: A sample space is not necessarily unique, as mentioned earlier. The two solutions illustrate this. Note that in the first solution the event A = "the card is a club" is a simple event, but in the second it is a compound event.

Note 2: In solving the problem we have assumed that each simple event in S is equally probable. For example, in Solution 1 each simple event has probability 1/4. This seems to be the only sensible choice of numerical value in this setting. (Why?)

Note 3: The term "odds" is sometimes used. The odds of an event is the probability it occurs divided by the probability it does not occur. In this card example the odds in favour of clubs are 1:3; we could also say the odds against clubs are 3:1.
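The 52-card sample space and the odds calculation can be checked by direct enumeration. The sketch below assumes each of the 52 cards is equally probable, as in Note 2; the variable names and the rank labels are illustrative choices, not notation from the notes.

```python
from fractions import Fraction
from itertools import product

suits = ["spades", "hearts", "diamonds", "clubs"]
ranks = ["A"] + [str(n) for n in range(2, 11)] + ["J", "Q", "K"]

# Sample space: one point per card, assumed equally probable.
deck = list(product(ranks, suits))
clubs = [card for card in deck if card[1] == "clubs"]

p_club = Fraction(len(clubs), len(deck))
odds_in_favour = p_club / (1 - p_club)  # P(occurs) / P(does not occur)

print(len(deck))       # 52
print(p_club)          # 1/4
print(odds_in_favour)  # 1/3, i.e. odds of 1:3
```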
Example: Toss a coin twice. Find the probability of getting 1 head. (In this course, 1 head is taken to mean exactly 1 head. If we meant at least 1 head we would say so.)

Solution 1: Let S = {HH, HT, TH, TT} and assume the simple events each have probability 1/4. (If your notation is not obvious, please explain it. For example, HT means head on the 1st toss and tails on the 2nd.) Since 1 head occurs for simple events HT and TH, we get P(1 head) = 2/4 = 1/2.

Solution 2: Let S = {0 heads, 1 head, 2 heads} and assume the simple events each have probability 1/3. Then P(1 head) = 1/3.

[Figure: 9 tosses of two coins each]

Which solution is right? Both are mathematically "correct". However, we want a solution that is useful in terms of the probabilities of events reflecting their relative frequency of occurrence in repeated trials. In that sense, the points in Solution 2 are not equally likely. The outcome 1 head occurs more often than either 0 or 2 heads in actual repeated trials. You can experiment to verify this (for example, of the nine replications of the experiment in the figure "9 tosses of two coins each", 2 heads occurred 2 of the 9 times and 1 head occurred 6 of the 9 times). For more certainty you should replicate this experiment many times. You can do this without a physical coin at http://shazam.econ.ubc.ca/flip/index.html. So we say Solution 2 is incorrect for ordinary physical coins, though a better term might be "incorrect model". If we were determined to use the sample space in Solution 2, we could do it by assigning appropriate probabilities to each point. From Solution 1, we can see that 0 heads would have a probability of 1/4, 1 head 1/2, and 2 heads 1/4. However, there seems to be little point in using a sample space whose points are not equally probable when one with equally probable points is readily available.
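The relationship between the two sample spaces can be made concrete by enumerating the four ordered outcomes and collapsing them by number of heads. This is a sketch under the assumption that the four ordered outcomes are equally probable (a fair coin, independent tosses); the names `S` and `heads_dist` are chosen here for illustration.

```python
from fractions import Fraction
from itertools import product

# Solution 1 sample space: ordered outcomes of two tosses.
S = ["".join(tosses) for tosses in product("HT", repeat=2)]
p_point = Fraction(1, len(S))  # assumed: each ordered outcome has probability 1/4

# Collapse to the Solution 2 sample space: number of heads.
heads_dist = {}
for outcome in S:
    k = outcome.count("H")
    heads_dist[k] = heads_dist.get(k, 0) + p_point

print(S)              # ['HH', 'HT', 'TH', 'TT']
print(heads_dist[1])  # 1/2
print(heads_dist[0], heads_dist[2])  # 1/4 1/4
```

The three points of the second sample space inherit unequal probabilities 1/4, 1/2, 1/4, which is why assigning 1/3 to each gives a poor model.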
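Example: Roll a red die and a green die. Find the probability the total is 5.

Solution: Let (x, y) represent getting x on the red die and y on the green die. Then, with these as simple events, the sample space is

S = { (1,1) (1,2) (1,3) (1,4) (1,5) (1,6)
      (2,1) (2,2) (2,3) (2,4) (2,5) (2,6)
      (3,1) (3,2) (3,3) (3,4) (3,5) (3,6)
      (4,1) (4,2) (4,3) (4,4) (4,5) (4,6)
      (5,1) (5,2) (5,3) (5,4) (5,5) (5,6)
      (6,1) (6,2) (6,3) (6,4) (6,5) (6,6) }

The sample points giving a total of 5 are (1,4), (2,3), (3,2), and (4,1). Assuming each of the 36 points is equally probable,

P(total is 5) = 4/36.

Example: Suppose the 2 dice were now identical red dice. Find the probability the total is 5.

Solution 1: Since we can no longer distinguish between (x, y) and (y, x), the only distinguishable points in S are:

(1,1) (1,2) (1,3) (1,4) (1,5) (1,6)
      (2,2) (2,3) (2,4) (2,5) (2,6)
            (3,3) (3,4) (3,5) (3,6)
                  (4,4) (4,5) (4,6)
                        (5,5) (5,6)
                              (6,6)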
Using this sample space, we get a total of 5 from points (1,4) and (2,3) only. If we assign equal probability 1/21 to each point (simple event) then we get P(total is 5) = 2/21.

At this point you should be suspicious since 2/21 ≠ 4/36. The colour of the dice should not have any effect on what total we get, so this answer must be wrong. The problem is that the 21 points in S here are not equally likely. If this experiment is repeated, the point (1,2) occurs twice as often in the long run as the point (1,1). The only sensible way to use this sample space would be to assign probability weights 1/36 to the points (x, x) and 2/36 to the points (x, y) for x ≠ y. Of course we can compare these probabilities with experimental evidence. On the website http://www.math.duke.edu/education/postcalc/probability/dice/index.html you may throw dice up to 10,000 times and record the results. For example, on 1000 throws of two dice (see the figure "Results of 1000 throws of 2 dice"), there were 121 occasions when the sum of the values on the dice was 5, indicating the probability is around 121/1000 or 0.121. This compares with the true probability 4/36 ≈ 0.111.

[Figure: Results of 1000 throws of 2 dice]

A more straightforward solution follows.

Solution 2: Pretend the dice can be distinguished even though they cannot. (Imagine, for example, that we put a white dot on one die, or label one of them 1 and the other 2.) We then get the same 36 sample points as in the example with the red die and the green die. Hence

P(total is 5) = 4/36.
But, you argue, the dice were identical, and you cannot distinguish them! The laws determining the probabilities associated with these two dice do not, of course, know whether your eyesight is keen enough to distinguish the dice. These probabilities must be the same in either case. In many cases, when objects are indistinguishable and we are interested in calculating a probability, the calculation is made easier by pretending the objects can be distinguished.

This illustrates a common pitfall in using probability. When treating objects in an experiment as distinguishable leads to a different answer from treating them as identical, the points in the sample space for identical objects are usually not "equally likely" in terms of their long run relative frequencies. It is generally safer to pretend objects can be distinguished even when they cannot be, in order to get equally likely sample points.

While the method of finding probability by listing all the points in S can be useful, it is not practical when there are a lot of points to write out (e.g., if 3 dice were tossed there would be 216 points in S). We need more efficient ways of figuring out the number of outcomes in S or in a compound event without having to list them all. Chapter 3 considers ways to do this, and then Chapter 4 develops other ways to manipulate and calculate probabilities.

To conclude this chapter, we remark that in some settings we rely on previous repetitions of an experiment, or on scientific data, to assign numerical probabilities to events. Problems 2.6 and 2.7 below illustrate this. Although we often use "toy" problems involving things such as coins, dice and simple games for examples, probability is used to deal with a huge variety of practical problems. Problems 2.6 and 2.7, and many others to be discussed later, are of this type.
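The two-dice discussion can be verified by enumeration. The sketch below assumes the 36 ordered outcomes are equally probable, then collapses them into the 21 unordered points to show that those points carry unequal weights. All variable names are illustrative.

```python
from fractions import Fraction
from itertools import product

# 36 ordered points (x, y), assumed equally probable.
ordered = list(product(range(1, 7), repeat=2))
p_total_5 = Fraction(sum(1 for x, y in ordered if x + y == 5), len(ordered))

# Collapse to unordered points: (x, y) and (y, x) become one point.
weights = {}
for x, y in ordered:
    point = (min(x, y), max(x, y))
    weights[point] = weights.get(point, 0) + Fraction(1, 36)

# Naive (incorrect) model: treat the 21 unordered points as equally likely.
naive = Fraction(sum(1 for x, y in weights if x + y == 5), len(weights))
# Correct use of the 21-point space: sum the unequal weights.
weighted = sum(w for (x, y), w in weights.items() if x + y == 5)

print(p_total_5)                         # 1/9 (= 4/36)
print(len(weights))                      # 21
print(weights[(1, 1)], weights[(1, 2)])  # 1/36 1/18
print(naive, weighted)                   # 2/21 1/9
```

Only the weighted version agrees with the 36-point answer, matching the argument that colour cannot change the probability of the total.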
Problems on Chapter 2

2.1 Students in a particular program have the same 4 math profs. Two students in the program each independently ask one of their math profs for a letter of reference. Assume each is equally likely to ask any of the math profs.
(a) List a sample space for this "experiment".
(b) Use this sample space to find the probability both students ask the same prof.

2.2 (a) List a sample space for tossing a fair coin 3 times.
(b) What is the probability of 2 consecutive tails (but not 3)?

2.3 You wish to choose 2 different numbers from 1, 2, 3, 4, 5. List all possible pairs you could obtain and find the probability the numbers chosen differ by 1 (i.e. are consecutive).

2.4 Four letters addressed to individuals W, X, Y and Z are randomly placed in four addressed envelopes, one letter in each envelope.
(a) List a 24-point sample space for this experiment.
(b) List the sample points belonging to each of the following events:
A: "W's letter goes into the correct envelope";
B: "no letters go into the correct envelopes";
C: "exactly two letters go into the correct envelopes";
D: "exactly three letters go into the correct envelopes".
(c) Assuming that the 24 sample points are equally probable, find the probabilities of the four events in (b).

2.5 (a) Three balls are placed at random in three boxes, with no restriction on the number of balls per box; list the 27 possible outcomes of this experiment. Assuming that the outcomes are all equally probable, find the probability of each of the following events:
A: "the first box is empty";
B: "the first two boxes are empty";
C: "no box contains more than one ball".
(b) Find the probabilities of events A, B and C when three balls are placed at random in n boxes (n ≥ 3).
(c) Find the probabilities of events A, B and C when r balls are placed in n boxes (n ≥ r).

2.6 Diagnostic Tests. Suppose that in a large population some persons have a specific disease at a given point in time. A person can be tested for the disease, but inexpensive tests are often imperfect, and may give either a "false positive" result (the person does not have the disease but the test says they do) or a "false negative" result (the person has the disease but the test says they do not).

In a random sample of 1000 people, individuals with the disease were identified according to a completely accurate but expensive test, and also according to a less accurate but inexpensive test. The results for the less accurate test were that
920 persons without the disease tested negative
60 persons without the disease tested positive
18 persons with the disease tested positive
2 persons with the disease tested negative.
(a) Estimate the fraction of the population that has the disease and tests positive using the inexpensive test.
(b) Estimate the fraction of the population that has the disease.
(c) Suppose that someone randomly selected from the population tests positive using the inexpensive test. Estimate the probability that they actually have the disease.

2.7 Machine Recognition of Handwritten Digits. Suppose that you have an optical scanner and associated software for determining which of the digits 0, 1, ..., 9 an individual has written in a square box. The system may of course be wrong sometimes, depending on the legibility of the handwritten number.
(a) Describe a sample space S that includes points (x, y), where x stands for the number actually written, and y stands for the number that the machine identifies.
(b) Suppose that the machine is asked to identify very large numbers of digits, of which 0, 1, ..., 9 occur equally often, and suppose that the following probabilities apply to the points in your sample space:
[probability values missing from the source]
Give a table with probabilities for each point (x, y) in S. What fraction of numbers is correctly identified?
Probability

Outcome
Sample Space
Event
Relative Frequency
Probability
Subjective Probability
Independent Events
Mutually Exclusive Events
Addition Rule
Multiplication Rule
Conditional Probability
Law of Total Probability
Bayes' Theorem
Outcome

An outcome is the result of an experiment or other situation involving uncertainty.
The set of all possible outcomes of a probability experiment is called a sample space.

Sample Space

The sample space is an exhaustive list of all the possible outcomes of an experiment. Each possible result of such a study is represented by one and only one point in the sample space, which is usually denoted by S.

Examples
Experiment: Rolling a die once:
Sample space S = {1,2,3,4,5,6}
Experiment: Tossing a coin:
Sample space S = {Heads,Tails}
Experiment: Measuring the height (cm) of a girl on her first day at school:
Sample space S = the set of all positive real numbers

Event

An event is any collection of outcomes of an experiment.
Formally, any subset of the sample space is an event.
Any event which consists of a single outcome in the sample space is called an elementary or simple event. Events which consist of more than one outcome are called compound events.

Set theory is used to represent relationships among events. In general, if A and B are two events in the sample space S, then
A ∪ B (A union B) = either A or B occurs or both occur
A ∩ B (A intersection B) = both A and B occur
A ⊂ B (A is a subset of B) = if A occurs, so does B
A' = event A does not occur
∅ (the empty set) = an impossible event
S (the sample space) = an event that is certain to occur

Example
Experiment: rolling a die once
Sample space S = {1,2,3,4,5,6}
Events A = score < 4 = {1,2,3}
B = score is even = {2,4,6}
C = score is 7 = ∅
A ∪ B = the score is < 4 or even or both = {1,2,3,4,6}
A ∩ B = the score is < 4 and even = {2}
A' = event A does not occur = {4,5,6}

Relative Frequency

Relative frequency is another term for proportion; it is the value calculated by dividing the number of times an event occurs by the total number of times an experiment is carried out. The probability of an event can be thought of as its long-run relative frequency when the experiment is carried out many times.

If an experiment is repeated n times, and event E occurs r times, then the relative frequency of the event E is defined to be
rfn(E) = r/n

Example
Experiment: Tossing a fair coin 50 times (n = 50)
Event E = heads
Result: 30 heads, 20 tails, so r = 30
Relative frequency: rfn(E) = r/n = 30/50 = 3/5 = 0.6

If an experiment is repeated many, many times without changing the experimental conditions, the relative frequency of any particular event will settle down to some value. The probability of the event can be defined as the limiting value of the relative frequency:
P(E) = limit of rfn(E) as n tends to infinity
For example, in the above experiment, the relative frequency of the event heads will settle down to a value of approximately 0.5 if the experiment is repeated many more times.

Probability

A probability provides a quantitative description of the likely occurrence of a particular event. Probability is conventionally expressed on a scale from 0 to 1; a rare event has a probability close to 0, a very common event has a probability close to 1.

The probability of an event has been defined as its long-run relative frequency. It has also been thought of as a personal degree of belief that a particular event will occur (subjective probability).

In some experiments, all outcomes are equally likely. For example, if you were to choose one winner in a raffle from a hat, all raffle ticket holders are equally likely to win, that is, they have the same probability of their ticket being chosen. This is the equally-likely outcomes model and is defined to be:

P(E) = (number of outcomes corresponding to event E) / (total number of outcomes)

Examples
The probability of drawing a spade from a pack of 52 well-shuffled playing cards is 13/52 = 1/4 = 0.25 since
event E = a spade is drawn;
the number of outcomes corresponding to E = 13 (spades);
the total number of outcomes = 52 (cards).
When tossing a coin, we assume that the results heads or tails each have equal probabilities of 0.5.

Subjective Probability

A subjective probability describes an individual's personal judgement about how likely a particular event is to occur. It is not based on any precise computation but is often a reasonable assessment by a knowledgeable person.

Like all probabilities, a subjective probability is conventionally expressed on a scale from 0 to 1; a rare event has a subjective probability close to 0, a very common event has a subjective probability close to 1.

A person's subjective probability of an event describes his/her degree of belief in the event.

Example
A Rangers supporter might say, "I believe that Rangers have probability of 0.9 of winning the Scottish Premier Division this year since they have been playing really well."

Independent Events

Two events are independent if the occurrence of one of the events gives us no information about whether or not the other event will occur; that is, the events have no influence on each other.
In probability theory we say that two events, A and B, are independent if the probability that they both occur is equal to the product of the probabilities of the two individual events, i.e.
P(A ∩ B) = P(A).P(B)

The idea of independence can be extended to more than two events. For example, A, B and C are independent if:
(a) A and B are independent; A and C are independent and B and C are independent (pairwise independence);
(b) P(A ∩ B ∩ C) = P(A).P(B).P(C)

If two events each have non-zero probability and are independent, then they cannot be mutually exclusive (disjoint), and vice versa.

Example
Suppose that a man and a woman each have a pack of 52 playing cards. Each draws a card from his/her pack. Find the probability that they each draw the ace of clubs.
We define the events:
A = man draws ace of clubs, with P(A) = 1/52
B = woman draws ace of clubs, with P(B) = 1/52
Clearly events A and B are independent so:
P(A ∩ B) = P(A).P(B) = 1/52 × 1/52 ≈ 0.00037
That is, there is a very small chance that the man and the woman will both draw the ace of clubs.

See also conditional probability.
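The product rule for independent events can be checked on the two-pack example by enumerating every (man's card, woman's card) pair. This is a sketch assuming all 52 × 52 pairs are equally probable; labelling the cards 0 to 51 and calling card 0 the ace of clubs are arbitrary choices made here.

```python
from fractions import Fraction
from itertools import product

cards = range(52)   # label the 52 cards 0..51 (assumed labelling)
ACE_OF_CLUBS = 0    # arbitrary label for the ace of clubs

# Each sample point is (man's card, woman's card); all pairs assumed equally probable.
S = list(product(cards, repeat=2))

def prob(event):
    """Probability of the set of points for which event(point) is true."""
    return Fraction(sum(1 for point in S if event(point)), len(S))

p_a = prob(lambda pt: pt[0] == ACE_OF_CLUBS)  # man draws the ace of clubs
p_b = prob(lambda pt: pt[1] == ACE_OF_CLUBS)  # woman draws the ace of clubs
p_both = prob(lambda pt: pt[0] == ACE_OF_CLUBS and pt[1] == ACE_OF_CLUBS)

print(p_a, p_b)                 # 1/52 1/52
print(p_both == p_a * p_b)      # True, so A and B are independent
print(round(float(p_both), 5))  # 0.00037
```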
Mutually Exclusive Events

Two events are mutually exclusive (or disjoint) if it is impossible for them to occur together.
Formally, two events A and B are mutually exclusive if and only if
A ∩ B = ∅

If two events with non-zero probabilities are mutually exclusive, they cannot be independent, and vice versa.

Examples
Experiment: Rolling a die once
Sample space S = {1,2,3,4,5,6}
Events A = observe an odd number = {1,3,5}
B = observe an even number = {2,4,6}
A ∩ B = ∅ (the empty set), so A and B are mutually exclusive.

A subject in a study cannot be both male and female, nor can they be both aged 20 and aged 30. A subject could however be both male and 20, or both female and 30.

Addition Rule
The addition rule is a result used to determine the probability that event A or event B occurs or both occur.
The result is often written as follows, using set notation:
P(A ∪ B) = P(A) + P(B) - P(A ∩ B)
where:
P(A) = probability that event A occurs
P(B) = probability that event B occurs
P(A ∪ B) = probability that event A or event B occurs
P(A ∩ B) = probability that event A and event B both occur

For mutually exclusive events, that is events which cannot occur together:
P(A ∩ B) = 0
The addition rule therefore reduces to
P(A ∪ B) = P(A) + P(B)

For independent events, that is events which have no influence on each other:
P(A ∩ B) = P(A).P(B)
The addition rule therefore reduces to
P(A ∪ B) = P(A) + P(B) - P(A).P(B)

Example
Suppose we wish to find the probability of drawing either a king or a spade in a single draw from a pack of 52 playing cards.
We define the events A = draw a king and B = draw a spade.
Since there are 4 kings in the pack and 13 spades, but 1 card is both a king and a spade, we have:
P(A ∪ B) = P(A) + P(B) - P(A ∩ B) = 4/52 + 13/52 - 1/52 = 16/52
So, the probability of drawing either a king or a spade is 16/52 (= 4/13).

See also multiplication rule.

Multiplication Rule

The multiplication rule is a result used to determine the probability that two events, A and B, both occur.
The multiplication rule follows from the definition of conditional probability.
The result is often written as follows, using set notation:
P(A ∩ B) = P(A | B).P(B) = P(B | A).P(A)
where:
P(A) = probability that event A occurs
P(B) = probability that event B occurs
P(A ∩ B) = probability that event A and event B occur
P(A | B) = the conditional probability that event A occurs given that event B has occurred already
P(B | A) = the conditional probability that event B occurs given that event A has occurred already
For independent events, that is events which have no influence on one another, the rule simplifies to:
P(A ∩ B) = P(A).P(B)
That is, the probability of the joint events A and B is equal to the product of the individual probabilities for the two events.

Conditional Probability

In many situations, once more information becomes available, we are able to revise our estimates for the probability of further outcomes or events happening. For example, suppose you go out for lunch at the same place and time every Friday and you are served lunch within 15 minutes with probability 0.9. However, given that you notice that the restaurant is exceptionally busy, the probability of being served lunch within 15 minutes may reduce to 0.7. This is the conditional probability of being served lunch within 15 minutes given that the restaurant is exceptionally busy.

The usual notation for "event A occurs given that event B has occurred" is "A | B" (A given B). The symbol | is a vertical line and does not imply division. P(A | B) denotes the probability that event A will occur given that event B has occurred already.

A rule that can be used to determine a conditional probability from unconditional probabilities is:
P(A | B) = P(A ∩ B) / P(B), provided P(B) > 0
where:
P(A | B) = the (conditional) probability that event A will occur given that event B has occurred already
P(A ∩ B) = the (unconditional) probability that event A and event B both occur
P(B) = the (unconditional) probability that event B occurs
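The conditional probability rule and the multiplication rule can be tried on the die events used earlier in this glossary (A = score < 4, B = score is even). The sketch assumes a fair die with six equally likely outcomes; the helper names `prob` and `cond_prob` are illustrative.

```python
from fractions import Fraction

S = {1, 2, 3, 4, 5, 6}  # fair die, outcomes assumed equally likely
A = {1, 2, 3}           # score < 4
B = {2, 4, 6}           # score is even

def prob(event):
    """Equally-likely outcomes model: |event| / |S|."""
    return Fraction(len(event & S), len(S))

def cond_prob(a, b):
    """P(a | b) = P(a and b) / P(b); undefined when P(b) = 0."""
    if prob(b) == 0:
        raise ValueError("conditioning event has probability 0")
    return prob(a & b) / prob(b)

print(cond_prob(A, B))  # 1/3
print(prob(A))          # 1/2, so knowing B changes the probability of A
# Multiplication rule, both orderings:
print(cond_prob(A, B) * prob(B) == prob(A & B))  # True
print(cond_prob(B, A) * prob(A) == prob(A & B))  # True
```

Since P(A | B) differs from P(A), these two events are not independent.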
Law of Total Probability

The result is often written as follows, using set notation:
P(A) = P(A ∩ B) + P(A ∩ B')
where:
P(A) = probability that event A occurs
P(A ∩ B) = probability that event A and event B both occur
P(A ∩ B') = probability that event A and event B' both occur, i.e. A occurs and B does not.

Using the multiplication rule, this can be expressed as
P(A) = P(A | B).P(B) + P(A | B').P(B')

Bayes' Theorem

Bayes' Theorem is a result that allows new information to be used to update the conditional probability of an event.
Using the multiplication rule gives Bayes' Theorem in its simplest form:
P(A | B) = P(B | A).P(A) / P(B)
Using the Law of Total Probability:

P(A | B) = P(B | A).P(A) / [P(B | A).P(A) + P(B | A').P(A')]

where:
P(A) = probability that event A occurs
P(B) = probability that event B occurs
P(A') = probability that event A does not occur
P(A | B) = probability that event A occurs given that event B has occurred already
P(B | A) = probability that event B occurs given that event A has occurred already
P(B | A') = probability that event B occurs given that event A has not occurred already
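A small sketch of Bayes' Theorem with the denominator expanded by the Law of Total Probability. The function name `bayes` and its parameter names are hypothetical. The inputs reuse the fair-die events from the Event entry (A = score < 4, B = score is even), for which P(B | A) = 1/3, P(A) = 1/2 and P(B | A') = 2/3.

```python
from fractions import Fraction

def bayes(p_b_given_a, p_a, p_b_given_not_a):
    """P(A | B) via Bayes' Theorem, expanding P(B) by total probability."""
    p_not_a = 1 - p_a
    p_b = p_b_given_a * p_a + p_b_given_not_a * p_not_a  # Law of Total Probability
    return p_b_given_a * p_a / p_b

# Fair die: A = {1,2,3}, B = {2,4,6}.
# P(B | A) = 1/3 (only 2 is even among 1,2,3); P(B | A') = 2/3 (4 and 6 among 4,5,6).
posterior = bayes(Fraction(1, 3), Fraction(1, 2), Fraction(2, 3))
print(posterior)  # 1/3
```

The result agrees with computing P(A | B) directly as P(A ∩ B) / P(B) = (1/6) / (1/2).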