GENETICS
 


Usage Rights

© All Rights Reserved

    GENETICS Document Transcript

    • Figure 3.10 Examples of some simple Mendelian traits in humans that are relatively common: (a) common baldness, (b) chin fissure, (c) ear pits, (d) Darwin tubercle, (e) congenital ptosis, (f) epicanthus, (g) camptodactyly, and (h) mid-digital hair. See Table 3.1 for descriptions and modes of inheritance.
    • SIMPLE MENDELIAN INHERITANCE
      Fig. 1-4. A-C: Nail defects in the nail-patella syndrome (anonycho-osteo-dysplasia). In A they are most severe on the index fingers and (not shown) the thumbs. The little fingers appear normal. B: Dystrophy of thumbnails in an affected brother. The lunulae are abnormally large. C: Complete absence of thumbnails in a daughter of the patient in A and in her eight-year-old son. D-G: Some of the bone defects encountered in the nail-patella syndrome. D: Absence of the patella in the eight-year-old boy whose nail defects are shown in C. (The epiphyseal centers of his femur and tibia have not yet fused to the shafts.) The boy's mother has a patella (E) but suffers greater difficulties in walking because the patella, with the associated extensor tendon, is displaced laterally. In other affected individuals the patella may be in normal position but hypoplastic. F and G: Typical elbow defects. In F the head of the right radius, somewhat abnormally shaped, is displaced forward; in G the left radius, similarly abnormal, is displaced backward. In other cases the capitulum of the humerus is poorly developed, compounding the problem. Note also, in F, the exostosis of the coronoid process of the ulna. (A-D courtesy of L. S. Wildervanck. E-G from Wildervanck, 1950b; courtesy of L. S. Wildervanck and Acta Radiologica.)
    • 1-6. Anonycho-osteo-dysplasia, better known as the nail-patella syndrome (Fig. 1-4), is one of the traits listed in Table 1-3. Call it the np locus. A man and his sister both have the trait, and both marry persons who lack it. The man's son, who lacks it, marries the sister's daughter, his cousin, who does have it. (a) What is the outlook for their children? (b) If their first child is normal with respect to this trait, what is the outlook for their next child?

      1-7. Suppose you examine the widowed mother of the man and his sister of exercise 1-6 and you find that she lacks the nail-patella syndrome. What would you conclude about the phenotype and genotype of her late husband (their father)? If more than one answer is possible, which is the more likely, and why?

      Table 1-3 Progeny from testcrosses for several dominant traits (for references see Table 5-1 of Levitan and Montagu, 1977)

      Trait                                                   Number of test-cross sibships   Normal   Affected   Total
      Anonychia with ectrodactyly (Fig. 1-9)                                             39       57         66     123
      Elliptocytosis (ovalocytosis), both loci (Fig. 13-2)                               65      113         99     212
      Epidermolysis bullosa, all dominant forms (Fig. 1-3)                               50       80         62     142
      Nail-patella syndrome (Fig. 1-4)                                                  157      268        288     556
      Total                                                                             311      518        515    1033
      Ratio                                                                                  1.006 :
    • 3.29. Infantile amaurotic idiocy (Tay-Sachs disease) is a recessive hereditary abnormality causing death within the first few years of life only when homozygous (ii). The dominant condition at this locus produces a normal phenotype (I-). Abnormally shortened fingers (brachyphalangy) is thought to be due to a genotype heterozygous for a lethal gene (BB^L), the homozygote (BB) being normal, and the other homozygote (B^LB^L) being lethal. What are the phenotypic expectations among teenage children from parents who are both brachyphalangic and heterozygous for infantile amaurotic idiocy?

      3.30. In addition to the gene governing infantile amaurotic idiocy in the above problem, the recessive genotype of another locus (jj) results in death before age 18 due to a condition called "juvenile amaurotic idiocy." Only individuals of genotype I-J- will survive to adulthood. (a) What proportion of the children from parents of genotype IiJj would probably not survive to adulthood? (b) What proportion of the adult survivors in part (a) would not be carriers of either hereditary abnormality?
    • Table 1-4 Offspring in 416 marriages between various M-N blood types. Data from Wiener et al. (1963).

                                  Progeny
      Father       Mother       M     MN      N    Total offspring   Number of families
      M            M           71    1^a      0                 72                   42
      N            N            0      0     29                 29                   20
      M            N            0     43      0                 43                   23
      N            M            0     24      0                 24                   13
      All M × N                 0     67      0                 67                   36
      M            MN          67     46      0                113                   63
      MN           M           60     55      0                115                   59
      All M × MN              127    101      0                228                  122
      N            MN           0     31     44                 75                   39
      MN           N            0     40     27                 67                   35
      All N × MN                0     71     71                142                   74
      MN           MN          61    118     53                232                  122
      Totals                  259    358    153                770                  416

      ^a This apparent contradiction to the laws of heredity is believed to be owing to illegitimacy, but it may represent a new mutation, a change in the genetic material of one of the parents.
    • Table 8-1 Expected progeny in a random sample of four-children families where one parent is A/a and the other a/a, locus A being autosomal

                                                             Expected progeny
                                                        In numbers^b     As proportions of next generation
      Kind of family   Probability^a   Number of families^b    A/a     a/a        A/a       a/a
      4 A/a                     1/16                    100    400                1/16
      3 A/a, 1 a/a              4/16                    400   1200     400        3/16      1/16
      2 A/a, 2 a/a              6/16                    600   1200    1200        3/16      3/16
      1 A/a, 3 a/a              4/16                    400    400    1200        1/16      3/16
      4 a/a                     1/16                    100            400                  1/16
      Total                    16/16                   1600   3200    3200        8/16      8/16
                                                                                  = 1/2     = 1/2

      ^a Calculated in the text.
      ^b Calculated on the basis of 1600 families.

      Table 8-2 Expected progeny in a random sample of four-children families where the parents are heterozygous for a recessive gene a

                                                             Expected progeny
                                                        In numbers^b     As proportions of next generation
      Kind of family   Probability^a   Number of families^b   A/-^c    a/a       A/-^c      a/a
      4 A/-                   81/256                   2025    8100             81/256
      3 A/-, 1 a/a           108/256                   2700    8100   2700      81/256    27/256
      2 A/-, 2 a/a            54/256                   1350    2700   2700      27/256    27/256
      1 A/-, 3 a/a            12/256                    300     300    900       3/256     9/256
      4 a/a                    1/256                     25            100                 1/256
      Total                  256/256                   6400   19200   6400     192/256    64/256
                                                                                 = 3/4     = 1/4

      ^a Calculated in the text.
      ^b Calculated on the basis of 6400 families.
      ^c A/- means that the genotype is either A/A or A/a.

      Table 8-3 Expected progeny in a truncate sample of four-children families of which the parents are both heterozygous for a recessive gene a

                         Proportions of all       Proportions of the     Progeny in the observed sample
      Kind of family     A/a × A/a families^a     observed families         A/-         a/a
      3 A/-, 1 a/a                  108/256                108/175       81/175      27/175
      2 A/-, 2 a/a                   54/256                 54/175       27/175      27/175
      1 A/-, 3 a/a                   12/256                 12/175        3/175       9/175
      4 a/a                           1/256                  1/175                    1/175
      Total                         175/256                175/175      111/175      64/175
                                                                          63.4%       36.6%

      ^a See Table 8-2.
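    • The probability columns of Tables 8-1 to 8-3 are consistent with a binomial distribution over the number of a/a children in a family of four. The following is a minimal sketch under that assumption, not the text's own calculation; the function names `family_distribution` and `truncate_distribution` are hypothetical and chosen only for illustration.

```python
from fractions import Fraction
from math import comb

def family_distribution(n_children, p_recessive):
    """Binomial probability of k a/a children (k = 0..n) in a family of n_children.

    p_recessive is the assumed chance that any one child is a/a:
    1/2 for A/a x a/a, 1/4 for A/a x A/a.
    """
    p = Fraction(p_recessive)
    q = 1 - p
    return {k: comb(n_children, k) * p**k * q**(n_children - k)
            for k in range(n_children + 1)}

def truncate_distribution(dist):
    """Drop families with no a/a child and renormalise the rest.

    This mirrors a truncate sample, where a family is only noticed
    if it has at least one affected (a/a) child.
    """
    seen = {k: v for k, v in dist.items() if k > 0}
    total = sum(seen.values())
    return {k: v / total for k, v in seen.items()}

backcross = family_distribution(4, Fraction(1, 2))    # A/a x a/a, as in Table 8-1
intercross = family_distribution(4, Fraction(1, 4))   # A/a x A/a, as in Table 8-2
observed = truncate_distribution(intercross)          # as in Table 8-3

# Share of a/a children among all children in the truncate sample.
aa_share = sum(Fraction(k, 4) * prob for k, prob in observed.items())

for k in sorted(observed):
    print(f"{4 - k} A/-, {k} a/a: {observed[k]}")
print("a/a share in truncate sample:", aa_share, f"({float(aa_share):.1%})")
```

      Exact fractions are used so the results can be compared directly with the tabulated values, for example 108/175 for the families with one a/a child and 64/175 for the overall a/a share.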
    • Table 8-4 Analysis of the data in Table 8-3 by the direct sib method (Weinberg's general proband method)

                                                                   Sibs/proband      Total sibs of probands   Corrected progeny
      Affected   Expected proportion^a   "Observed" proportion   Probands   Affected:Normal   Affected:Normal          Affected:Normal
      0                             81                       0          0               0:0               0:0                      0:0
      1                            108                     108          1               0:3               0:3                    0:324
      2                             54                      54          2               1:2               2:4                  108:216
      3                             12                      12          3               2:1               6:3                    72:36
      4                              1                       1          4               3:0              12:0                     12:0
      Total                                                                                                                    192:576
      Ratio                                                                                                                        1:3

      ^a Per 256 four-children families of A/a × A/a marriages in the population.

      Table 8-5 Suggested calculation of the sib method. Illustration is for a theoretical truncate distribution of four-child families from A/a × A/a

                    Number of               Number of              Number of       Corrected progeny
      Family type   unaffected/family (U)   affected/family (A)    families (N)    Normal (UAN)   Affected (A(A - 1)N)
      3:1                               3                     1             108             324                      0
      2:2                               2                     2              54             216                    108
      1:3                               1                     3              12              36                     72
      0:4                               0                     4               1               0                     12
      Total                                                                                 576                    192

      Table 8-6 Direct sib correction in six families of spongy type polycystic kidneys of onset. (Data from Lundin and Olow, 1 )

                    Unaffected         Affected           Number of
      Family type   per family (U)     per family (A)     families (N)     UAN     A(A - 1)N
      A                          1                  1                3       3             0
      B                          2                  2                2       8             4
      C                          5                  2                1      10             2
      Totals                     8                  5                6      21             6
      Ratio                                                              0.778         0.222
      Expected ratio                                                      0.75          0.25
      Expected numbers                                                   20.25          6.75
      χ² = 0.106, P > 0.70
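    • The column headings UAN and A(A - 1)N in Table 8-5 suggest a simple recipe: weight each family type by U·A·N for corrected normal progeny and by A·(A - 1)·N for corrected affected progeny. The sketch below applies that reading to the Table 8-5 rows; the function name `direct_sib_correction` is hypothetical, and this is an illustration of the arithmetic rather than the source's own procedure.

```python
def direct_sib_correction(families):
    """Sum the corrected progeny over family types.

    families: iterable of (U, A, N) tuples, where
      U = unaffected children per family,
      A = affected children per family,
      N = number of families of that type.
    Returns (corrected_normal, corrected_affected), i.e.
    (sum of U*A*N, sum of A*(A-1)*N), following the Table 8-5 headings.
    """
    rows = list(families)
    corrected_normal = sum(u * a * n for u, a, n in rows)
    corrected_affected = sum(a * (a - 1) * n for u, a, n in rows)
    return corrected_normal, corrected_affected

# Rows of Table 8-5: theoretical truncate sample of four-child A/a x A/a families.
table_8_5 = [(3, 1, 108), (2, 2, 54), (1, 3, 12), (0, 4, 1)]

normal, affected = direct_sib_correction(table_8_5)
print("corrected normal:", normal)
print("corrected affected:", affected)
print("affected fraction:", affected / (normal + affected))
```

      Each affected child is counted once as a proband, and only that proband's sibs are tallied. This is why a family with a single affected child adds nothing to the affected total.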
    • TEXTBOOK OF HUMAN GENETICS
      8 Chance and the Distribution of Families

      In genetic experiments with plants and other animals, the most direct inferences as to method of inheritance come from a count of the progeny. For example, if one were to make a cross between two pure lines, intercross the F1, and from the intercross obtain approximately three-fourths of one parental type and one-fourth of the other in the F2, a simple Mendelian model, one locus with two alleles and dominance, would be inferred. This conclusion would be reinforced if the backcross of an F1 hybrid (to the pure-line parental type that had not appeared in the F1) produced progeny of which approximately one-half resembled the aforementioned parental types and one-half the F1 hybrid.

      In man, the number of progeny from any single family is usually so small that these conclusions would not be warranted. A mating of two heterozygotes for a recessive defect can produce 4:0, 2:2, 1:3, and 0:4 ratios, as well as the expected 3:1, in families of four children. Similarly, the mating A/a × a/a having two children can produce 2:0 and 0:2 ratios as well as the expected 1:1.

      A bit of reflection will convince the reader that this is not surprising. Consider, for example, the couple with genotypes A/a and a/a, respectively, having two children. Genetic theory expects them to have one A/a child and one a/a child because the A/a parent is expected to produce 1/2 A gametes and 1/2 a gametes. If the A/a parent is the male, this ratio will usually be realized among the sperm. Is there, however, any guarantee that these sperm will take turns, so to speak, in fertilizing the egg? Obviously not. The sperm involved in producing the two children could well be A sperm in both cases or a sperm in both.

      If the A/a parent is the female, there is not even a guarantee that the gametic ratio will be 1/2 A and 1/2 a, since each meiosis ordinarily produces only one gamete and the various meioses are independent events: what happens in one meiosis does not influence what is to happen in the next or any subsequent one. Hence, it could happen easily that the two children of this A/a × a/a mating are both A/a or both a/a.

      Clearly the prediction of genetic results is fraught with uncertainties, events over which no one has control. Such uncertainty is generally referred to as "chance." One author has recognized the large element of chance involved by entitling his book on genetics The Dice of Destiny. The reader may well ask: how can genetics pretend to be a science if it cannot really predict the results of a given mating? Or, to put it more concretely, since chance plays such a large part in genetic transmission, of what value are the stated Mendelian ratios? The question becomes especially pertinent, we shall discover below, when the family size becomes greater than two: most families then will not yield the stated ratios of offspring.

      Before answering this question it is necessary to understand that genetics in being beset with uncertainties is not singular among the sciences. Every scientist knows that in the real world absolute certainty does not exist: there are only relative degrees of uncertainty. To pick a rather absurd example, one could ask the following two questions of a meteorologist: (1) will the sun rise tomorrow? and (2) will tomorrow be a sunny day? He cannot answer either question with absolute certainty, but it is clear that he can answer the first with much more confidence than the second.

      A great stride in the progress of a science is made when it can ascertain or measure these relative degrees of uncertainty. The meteorologist feels a sense of achievement when he can improve his forecast from "it may rain" or even "it will probably rain" to "there is a 90 percent chance of rain." The measurement or quantitation of uncertainty is called probability.

      Usually the positive aspect of probability is emphasized. We are more interested in the degree we have moved toward certainty than in the amount of uncertainty left. Hence we think of probability as the likelihood of a desired event rather than as the degree to which we fall short of certainty in attaining it.

      Let us examine some of the basic properties of probability. Perhaps we should begin with a well-known example. If a coin were to be tossed and we were asked the probability of its turning up "heads," the immediate answer would be some variant of "fifty-fifty," "one to one" or "one-half." If we asked why we gave this answer the reply would be roughly: why, it stands to reason; there are two ways the coin may fall and one of these is heads, so the chance is 1 out of 2, one-half or one to one.

      If we had an evenly weighted die and wished to know the probability of its upright face showing five dots when thrown, it does not seem difficult to extend the logic of this answer: since any of the six sides of the cube could turn up and the face with five dots is only one of these, there is one chance out of six of obtaining the "five." Clearly, the probability of success, that is, the likelihood that a desired event will happen, depends on the number of alternative events that could happen and on the number of these that spell "success."

      Merely saying "the number of alternative events" can be misleading, however, because one could argue that in throwing the die on a given throw there are only two alternatives, success or failure, obtaining a five or not obtaining a five, and this might lead to the belief that the probability of success was one out of two. Obviously this is not true: for every chance of obtaining a five there are five chances of obtaining something else. We must modify our statement, therefore, to state that the probability of success depends on the number of equally probable (or equally likely, as the mathematicians say) events possible and the number of these that spell success.

      The probabilities discussed above may be stated "one chance out of two" and "one chance out of six." They could also be phrased as relative odds of success and failure. The chance of tossing a coin heads, for example, is 1:1, one chance of success to one chance of failure; similarly the odds of obtaining a five in the toss of a die are 1:5, one chance of success to five chances of failure. For arithmetic manipulation, however, the best way to state the probability is the way it is often instinctively given for the coin example: as a fraction whose denominator is the number of equally likely events possible in the situation under discussion and whose numerator is the number of these that constitute the event whose likelihood is in question.

      Example: Suppose an urn contains five balls of identical size and shape. Two are black, marked B1 and B2, and three are white, marked W1, W2, and W3, respectively. We reach in to take out one ball. (1) What is the probability of obtaining the ball marked W2; and (2) what is the probability of obtaining a white ball?

      (1) Probability of W2 = (number of W2 balls) / (total number of balls present) = 1/5
      (2) Probability of white = (number of white balls) / (total number of balls present) = 3/5

      Mathematically, these statements would be written:

      P(W2) = 1/5; P(white) = 3/5.

      P represents the phrase "probability of" the item in parentheses.

      How large or how small may probabilities be? The maximum probability would be reached if we were satisfied with obtaining any ball present, whether it was white or black. Thus,

      P(white or black) = 5/5 = 1.

      Similarly, the minimum probability would come from wishing to obtain some other color than white or black from this urn:

      P(neither white nor black) = 0/5 = 0.

      This indicates that probability is a number between 0 and 1, inclusive.

      The sophisticated reader will suggest that probabilities of 1 and 0 represent "certainty," certainty of success and certainty of failure, respectively. This is correct. Certainty is possible in a mathematical sense, since a mathematical system assumes that only the conditions postulated will occur. In a few simple real situations this degree of simplicity may be approximated, but there is never a guarantee that the postulated condition will prevail. Thus we may feel secure that there are two black and three white balls in the urn, and this usually does approximate a mathematical system, but we really cannot guarantee absolutely, for example, that one or another of the balls did not disintegrate after we placed it in the urn. In practice, the likelihood of such an unexpected event is often so small that we ignore it and we do speak of probabilities of 1 or 0 even for living things.
    • " "-"""J.-______ 8 ,,! .:. ' 0­ " - a M a Chance andthe Distribution CT of Families o " In genetic experiments with plants and other animals, the most direct inferences as to meJhQd, of inheri!anfe~come from a count of tQ~_l'-rogeJ:lY. For example;-if one were to make a cross between t"'QPureljJle.s, inten::ross tl!e. Fh and from the inter­ cross obtain approximately three-fourths ofone parental type and one-fourth ofthe other in the F2, a simple Mendelian m~del, onc;Jp~l!s with two allel~~ ;~d do~i­ nance, would be interred. This conclusion would be reinforce<rifthe backcross of an-F,hybrid (to the pure-line parental type~thai had-not appeared in ti!e F,) pro­ ducedpr()geny~f which appro~imately ol,le-6a-lf resembled ·the' a'forementioned parental types and one-half the F, hybri4. In man, the number of progeny from any single family is usually so small that these conclusions woui<f noi'be" ~arr;.~ted Amating-of t~o heterozygotes for a ~<:e.~si~e_d~fectca'!.PIo_<!uce 4:0,2:2, 1:3. and 0:4 ratio.s. as wella's the expected .3: I., in families of four children. SimilarlY. the mating A/a X a/a having two chil­ dren, 'can produce '2:0 and 0:2 ratios as well-as the' expected I : I. . ". " . - . A bit of reflection wiUoonvince the reader that this is not surprising. Consider, for example, the couple with genotypes A/a and a/a, respectively. having two chil­ dren. Genetic theory expects them to have one A/a child and one a/a child because the A/a parent is expected to produce 1/2 A gemetes and 1/2 a gametes. If the A/a parent is the male, this ratio will usually be realized among the sperm. Is there, however, any guarantee that these sperm will take turns. so to speak, in fertilizing the egg? Obviously not. The sperm involved in producing the two children could well be A sperm in both cases or a sperm in both. 
If the A/a parent is the female, there is not even a guarantee that the gametic ratio will be 1/2 A and 1/2 a, since each meiosis ordinarily produces only one gamete and the various meioses are independent events: what happens in one meiosis does not influence what is to happen the next or any subsequent one. Hence, it could happen easily that the two children ofthis A/a X a/a mating are both A/a or both a/a. Clearly the prediction of genetic results is fraught with uncertainties, events over which no one has control. Such uncertainty is generally referred to as "chance." One author has recognized the large element ofchance involved by enti­ tling his book on genetics The Dice of Destiny. The reader may well ask: how can 117 :
    • TEXTBOOK OF HUMAN GENETICS: CHANCE AND THE DISTRIBUTION OF FAMILIES
genetics pretend to be a science if it cannot really predict the results of a given mating? Or, to put it more concretely, since chance plays such a large part in genetic transmission, of what value are the stated Mendelian ratios? The question becomes especially pertinent, we shall discover below, when the family size becomes greater than two: most families then will not yield the stated ratios of offspring.

Before answering this question it is necessary to understand that genetics, in being beset with uncertainties, is not singular among the sciences. Every scientist knows that in the real world absolute certainty does not exist: there are only relative degrees of uncertainty. To pick a rather absurd example, one could ask the following two questions of a meteorologist: (1) will the sun rise tomorrow? and (2) will tomorrow be a sunny day? He cannot answer either question with absolute certainty, but it is clear that he can answer the first with much more confidence than the second.

A great stride in the progress of a science is made when it can ascertain or measure these relative degrees of uncertainty. The meteorologist feels a sense of achievement when he can improve his forecast from "it may rain" or even "it will probably rain" to "there is a 90 percent chance of rain." The measurement or quantitation of uncertainty is called probability.

Usually the positive aspect of probability is emphasized. We are more interested in the degree we have moved toward certainty than in the amount of uncertainty left. Hence we think of probability as the likelihood of a desired event rather than as the degree to which we fall short of certainty in attaining it.

Let us examine some of the basic properties of probability. Perhaps we should begin with a well-known example. If a coin were to be tossed and we were asked the probability of its turning up "heads," the immediate answer would be some variant of "fifty-fifty," "one to one" or "one-half." If we asked why we gave this answer the reply would be roughly: why, it stands to reason: there are two ways the coin may fall and one of these is heads, so the chance is 1 out of 2, one-half or one to one.

If we had an evenly weighted die and wished to know the probability of its upright face showing five dots when thrown, it does not seem difficult to extend the logic of this answer: since any of the six sides of the cube could turn up and the face with five dots is only one of these, there is one chance out of six of obtaining the "five." Clearly, the probability of success, that is, the likelihood that a desired event will happen, depends on the number of alternative events that could happen and on the number of these that spell "success."

Merely saying "the number of alternative events" can be misleading, however, because one could argue that in throwing the die on a given throw there are only two alternatives, success or failure, obtaining a five or not obtaining a five, and this might lead to the belief that the probability of success was one of two. Obviously this is not true: for every chance of obtaining a five there are five chances of obtaining something else. We must modify our statement, therefore, to state that the probability of success depends on the number of equally probable (or equally likely, as the mathematicians say) events possible and the number of these that spell success.

The probabilities discussed above may be stated "one chance out of two" and "one chance out of six." They could also be phrased as relative odds of success and failure. The chance of tossing a coin heads, for example, is 1:1, one chance of success to one chance of failure; similarly the odds of obtaining a five in the toss of a die are 1:5, one chance of success to five chances of failure. For arithmetic manipulation, however, the best way to state the probability is the way it is often instinctively given for the coin example: as a fraction whose denominator is the number of equally likely events possible in the situation under discussion and whose numerator is the number of these that constitute the event whose likelihood is in question.

Example: Suppose an urn contains five balls of identical size and shape. Two are black, marked B1 and B2, and three are white, marked W1, W2, and W3, respectively. We reach in to take out one ball. (1) What is the probability of obtaining the ball marked W2; and (2) what is the probability of obtaining a white ball?

(1) Probability of W2 = (number of W2 balls) / (total number of balls present) = 1/5

(2) Probability of white = (number of white balls) / (total number of balls present) = 3/5

Mathematically, these statements would be written:

P (W2) = 1/5; P (white) = 3/5.

P represents the phrase "probability of" the item in parentheses.

How large or how small may probabilities be? The maximum probability would be reached if we were satisfied with obtaining any ball present, whether it was white or black. Thus,

P (white or black) = 5/5 = 1

Similarly, the minimum probability would come from wishing to obtain some other color than white or black from this urn:

P (neither white nor black) = 0/5 = 0

This indicates that a probability is a number between 0 and 1, inclusive.

The sophisticated reader will suggest that probabilities of 1 and 0 represent "certainty," certainty of success and certainty of failure, respectively. This is correct. Certainty is possible in a mathematical sense, since a mathematical system assumes that only the conditions postulated will occur. In a few simple real situations this degree of simplicity may be approximated, but there is never a guarantee that the postulated condition will prevail. Thus we may feel secure that there are two black and three white balls in the urn, and this usually does approximate a mathematical system, but we really cannot guarantee absolutely, for example, that one or another of the balls did not disintegrate after we placed it in the urn. In practice, the likelihood of such an unexpected event is often so small that we ignore it, and we do speak of probabilities of 1 or 0 even for living things. Thus we have
    • seen already that we ignore the probability of new mutations when we predict that the cross:

A/A × A/A → all A/A.

In probability terms we are saying

P (A/A × A/A → A/A) = 1,

and this cross is therefore usually written

A/A × A/A → 1 A/A.

Implicit in asking the probability of obtaining a white ball from the urn is the idea that it does not matter whether we obtain W1, W2, or W3. Any of these alternative events spells success. Thus,

P (white) = (1 + 1 + 1)/5 = 1/5 + 1/5 + 1/5.

But,

P (W1) = 1/5,
P (W2) = 1/5, and
P (W3) = 1/5.

Hence,

P (white) = P (W1) + P (W2) + P (W3).

In other words, if an event has several alternative forms and attainment of any alternative is considered attainment of the desired event, the probability of success is the sum of the probabilities of these alternative forms. In commonsense terms, when we are not particular or choosy the chances for success are increased.

An equally interesting probability is that of a success which is composed of several events that must happen simultaneously or in succession. Clearly, this probability must be smaller than the likelihood of attaining any one of the events alone. How much smaller? To solve this problem let us return to the urn with the two black balls and the three white ones and set up a second urn with three black balls, B3, B4, and B5, and three white ones, W4, W5, and W6. If a ball is to be removed from each urn, how likely is it, for example, that both will be black? To answer this we may return to our basic concept of probability and enumerate how many equally probable events, each consisting of one ball from urn A and one ball from urn B, can happen and note how many of these consist of a black ball from each urn. The equally probable events are shown in Fig. 8-1.

[Fig. 8-1 The 30 combinations possible when a ball is drawn from each of two urns if the first urn contains 2 black balls and 3 white balls, the second contains 3 black ones and 3 white ones, and the balls are equal in size but distinguishable by number.]

There are 30 equally probable events and 6 of these represent pulls of two black balls. Thus,

P (black, black) = 6/30 = 1/5.

Note that the 30 in the denominator is the product of the number of balls in urn A and the number of balls in urn B; likewise, the 6 in the numerator is the product of the number of black balls in urn A and the number of blacks in urn B. Thus,

P (black, black) = (2 × 3)/(5 × 6) = 2/5 × 3/6.

But,

2/5 = P (black from urn A)

and

3/6 = P (black from urn B).

Thus, the probability of obtaining black balls from both urns is the product of the probabilities of obtaining a black ball from each urn.

We may generalize this into a rule: The probability of obtaining certain simultaneous (or successive) events is the product of the probabilities of the events involved. To check this generalization, we may note that

P (white, white) = 3/5 × 3/6 = 9/30

and this corresponds to the enumerated proportion.

This rule, too, fits our commonsense expectations. The likelihood that both will happen must be smaller than the likelihood of either one alone, for each could often happen without the other. Since probabilities are fractions less than one, multiplying them has the effect of producing a still smaller fraction, that is, decreasing the probability.
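The two rules just stated can be checked by brute-force counting. The following is only a minimal sketch in Python, not part of the original text: the list and function names (`urn_a`, `urn_b`, `prob`) are our own illustrative choices, and the balls are labeled as in the urn example. It counts equally likely outcomes directly and compares the counts with the addition and multiplication rules.

```python
from fractions import Fraction
from itertools import product

# Urn contents as labeled in the text; the variable names are illustrative.
urn_a = ["B1", "B2", "W1", "W2", "W3"]          # 2 black, 3 white
urn_b = ["B3", "B4", "B5", "W4", "W5", "W6"]    # 3 black, 3 white


def prob(urn, wanted):
    """Probability as (equally likely balls that spell success) / (all balls)."""
    hits = sum(1 for ball in urn if wanted(ball))
    return Fraction(hits, len(urn))


def is_black(ball):
    return ball.startswith("B")


def is_white(ball):
    return ball.startswith("W")


# Addition rule: P(white) equals P(W1) + P(W2) + P(W3).
p_white = prob(urn_a, is_white)
p_white_by_addition = sum(
    prob(urn_a, lambda ball, tag=tag: ball == tag) for tag in ("W1", "W2", "W3")
)

# Multiplication rule: enumerate every (urn A ball, urn B ball) pair.
pairs = list(product(urn_a, urn_b))
p_black_black_enumerated = Fraction(
    sum(1 for a, b in pairs if is_black(a) and is_black(b)), len(pairs)
)
p_black_black_by_rule = prob(urn_a, is_black) * prob(urn_b, is_black)
p_white_white_by_rule = prob(urn_a, is_white) * prob(urn_b, is_white)

print(len(pairs), p_white, p_black_black_enumerated, p_white_white_by_rule)
```

Exact fractions are used instead of floating-point numbers so that the results can be compared directly with the fractions in the text (6/30 and 9/30 appear in lowest terms).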
    • This "multiplication rule" is valid whether the desired events occur simultaneously or in a particular order. Furthermore, it is applicable whether the probabilities of successive events are independent or dependent. Drawing balls from the two urns, described above, is an example of independent probabilities, where the probability of one event does not influence the probability of another in the series. Likewise, if only one urn is used and the ball replaced between pulls, the probabilities of successive choices are independent. To illustrate dependent probabilities we may consider only the first urn and calculate, for example, the probability of picking a black ball twice in succession, if the first one removed is not seen and not replaced before pulling out the second.

For the first choice, as before,

P (B1 or B2) = 2/5.

If the first choice is successful, only four balls remain in the urn and only one of these is black. Therefore, for the second choice

P (black) = 1/4.

The probability that both events happen is, then,

2/5 × 1/4 = 2/20 = 1/10.

The student is often disturbed by dependent probabilities because it seems more necessary than for independent probabilities to suppose that the first choice is successful before calculating the probability of the second. He should keep uppermost in mind that we are seeking the likelihood that both events do happen. We wish to calculate the probability of success, not the probability of failure. Failure, not obtaining the desired result, includes all the other results that could occur: the first choice is "right" but the second "wrong," the first wrong even if the second is right, or both choices are wrong. Calculating the probability of failure on the same basis used for the probability of success verifies our result:

P (black, white) = 2/5 × 3/4 = 6/20
P (white, black) = 3/5 × 2/4 = 6/20
P (white, white) = 3/5 × 2/4 = 6/20

Since any of these alternatives spells failure,

P (Failure) = 6/20 + 6/20 + 6/20 = 18/20.

Since

P (Success) + P (Failure) = 1*

P (Success) = 1 - P (Failure)
= 1 - 18/20 = 2/20 = 1/10 = the same result obtained directly.

*This is a useful relation to keep in mind. Unlike the case above, it is often easier to calculate the probability of not obtaining the desired result, and then subtract from unity, than to calculate directly the probability of the desired result.

WHAT THE GENETIC RATIOS MEAN

We may now consider the meaning of the Mendelian ratios.

Meaning 1: A genetic ratio states the probability for a single birth. When we state

A/a × a/a → 1/2 A/a, 1/2 a/a

we are saying that at any given birth from these parents the odds are 1:1 that the child will be a/a. On the other hand, for the mating

A/a × A/a → 1/4 A/A, 1/2 A/a, 1/4 a/a,

the stated ratio means that at any given birth there is twice as much chance that the baby will be A/a as that it will be A/A. Likewise the odds are 2:1 for A/a versus a/a and 1:1 for A/A versus a/a. If A is dominant over a, the probability is 3/4 that a child from the last mating will show the A phenotype, 1/4 that it will show the a phenotype; in other words, there is three times as much chance for gene A to be present at least once as for its complete absence, a/a.

Meaning 2: In any sibship, the result suggested by the genetic ratio has the modal probability. This probability is usually not the probability in the Mendelian ratio, but must be calculated for each case.

The genetic ratio leads us to expect that the distribution of children in a family will conform to this ratio, but we know it usually will not do so. For example, if the parents in the mating

A/a × a/a → 1/2 A/a, 1/2 a/a

decide to have two children, the ratio "predicts" that they will have one A/a child and one a/a child. But our biological knowledge, and observation of actual families, tell us that such families may consist of two A/a children or two a/a children as well as one A/a and one a/a. Using our rules of probability and the first meaning for the genetic ratio, we can calculate the probability or expected relative frequency of these sibships.

P (2 A/a) = P (A/a, A/a) = 1/2 × 1/2 = 1/4.

Likewise,

P (2 a/a) = P (a/a, a/a) = 1/2 × 1/2 = 1/4.

For one A/a and one a/a, we should note that this may happen in two ways, the A/a birth first or the a/a first. Therefore,

P (A/a, a/a) = 1/2 × 1/2 = 1/4,
P (a/a, A/a) = 1/2 × 1/2 = 1/4, and
P (1 A/a and 1 a/a) = 1/4 + 1/4 = 1/2.

The previous line is unique because the probability of obtaining the "expected" ratio is the same as a fraction in the ratio.
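Both calculations above, the dependent draw from a single urn and the two-child sibships, can be verified by listing the equally likely outcomes. The sketch below is an illustration only, assuming Python; the helper names `p_draw` and `p_sibship` are hypothetical and introduced here for the example. It enumerates ordered draws without replacement and ordered pairs of births.

```python
from fractions import Fraction
from itertools import permutations, product

# Dependent probabilities: two draws from the first urn without replacement.
urn = ["B1", "B2", "W1", "W2", "W3"]
draws = list(permutations(urn, 2))   # 20 equally likely ordered pairs


def p_draw(first_color, second_color):
    """Probability that the first and second balls have the given colors."""
    hits = sum(
        1 for x, y in draws if x[0] == first_color and y[0] == second_color
    )
    return Fraction(hits, len(draws))


p_success = p_draw("B", "B")
p_failure = p_draw("B", "W") + p_draw("W", "B") + p_draw("W", "W")

# Two-child sibships from A/a x a/a: each birth is A/a or a/a with equal chance.
births = list(product(["A/a", "a/a"], repeat=2))   # 4 equally likely birth orders


def p_sibship(n_heterozygous):
    """Probability that exactly n_heterozygous of the two children are A/a."""
    hits = sum(1 for family in births if family.count("A/a") == n_heterozygous)
    return Fraction(hits, len(births))


print(p_success, p_failure, p_success + p_failure)
print(p_sibship(2), p_sibship(1), p_sibship(0))
```

Counting the two birth orders of the mixed sibship separately is what makes its probability twice that of either uniform sibship.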
    • Consider the parents with four children. All four births may result in A/a progeny. The likelihood of this is

1/2 × 1/2 × 1/2 × 1/2 = (1/2)^4 = 1/16.

Having three A/a and one a/a children, however, may occur in four different ways since the a/a child could be the first-born, the second-, the third-, or the fourth-born. These different families are:

Birth order | Probability
a/a, A/a, A/a, A/a | 1/16
A/a, a/a, A/a, A/a | 1/16
A/a, A/a, a/a, A/a | 1/16
A/a, A/a, A/a, a/a | 1/16
(3 A/a, 1 a/a) | 4/16 = 1/4

Each birth order has a probability of 1/16. Since we are interested only in the probability of obtaining 3 A/a and 1 a/a, irrespective of the birth order, the probability is the sum of the probabilities of the alternative orders, or 4/16.

Families of 2 A/a and 2 a/a children may be achieved by any of 6 birth orders:

A/a, A/a, a/a, a/a
A/a, a/a, A/a, a/a
A/a, a/a, a/a, A/a
a/a, A/a, A/a, a/a
a/a, A/a, a/a, A/a
a/a, a/a, A/a, A/a

Again each birth order has a probability of 1/16, so the probability of A/a × a/a parents having two A/a and two a/a children is 6/16. By an argument similar to the case of three A/a and one a/a, having one A/a and three a/a has a probability of 4/16, and the probability of four a/a is identical to the probability of four A/a: 1/16.

The most remarkable result of this analysis is that the family predicted by the genetic ratio, two A/a and two a/a, has less probability of being achieved (6/16) than of not being achieved (10/16). This will be true for any family size greater than two. Of the five kinds of families with four children possible, however, this family does have the highest single probability. Thus, the genetic ratio does not tell us the likelihood of obtaining such a ratio in a sibship of given size. It does tell us, though, that the sibship indicated by the genetic ratio is the one we are most likely to encounter. Therefore, when we say that the cross A/a × a/a is "expected" to produce a family of half A/a and half a/a, we mean this family is the "most often expected" one.*

*If the theory leads to an unreal expectation, say 3.5:3.5 in a family of 7, the most expected families are the real ones which are closest (3:4 and 4:3 in the example).

Meaning 3: If the progeny of a large number of couples with the same genotypes are counted, the totals are expected to equal the genetic ratio, provided that all the various sibships can be ascertained in their expected proportions. The proviso simply means the genetic ratio is expected only if the sampling procedures are unbiased. Examples of the most likely forms of biased sampling will be described later.

Consider again the parents A/a and a/a having four children each. The results expected in a random sample of the five kinds of families are given in Table 8-1. The progeny are calculated in two ways: (1) as actual numbers expected in a typical sample of 1600 families and (2) as fractional proportions of the total sample. Whichever way the calculations are made, the table shows the expected result is the same as the genetic ratio: half A/a and half a/a.

A similar conclusion is demonstrated by Table 8-2 concerning the four-children families expected from a cross of two normal heterozygotes for a recessive trait, for example, alkaptonuria. Again there are five kinds of families, but these have different probabilities than in the previous cross.

Here the classical expected ratio is 3/4 normal (A/-) : 1/4 alkaptonurics (a/a). By our first meaning of the ratio, this means that at each birth there is a probability of three-fourths that the child will be normal, one-fourth that he will be an alkaptonuric. Hence the probability that all four children will turn out to be normal is (3/4)^4 or 81/256, a very substantial likelihood; that all four will be abnormal is (1/4)^4 or only 1/256.

Having three normal and one alkaptonuric can occur in four ways:

normal, normal, normal, alkaptonuric
normal, normal, alkaptonuric, normal
normal, alkaptonuric, normal, normal
alkaptonuric, normal, normal, normal

Each birth order has a probability of (3/4)^3 × (1/4)^1, or 27/256, so the probability that any one of the four alternatives will be achieved is 4(27/256) or 108/256. Obtaining one normal and three alkaptonurics can also occur in four alternate birth orders, but each has a probability of (3/4)^1 × (1/4)^3, or 3/256, for a total of 12/256.

Table 8-1 Expected progeny in a random sample of four-children families where one parent is A/a and the other a/a, locus A being autosomal

Kind of Family | Probability(a) | Number of Families(b) | Progeny in Numbers(b): A/a | Progeny in Numbers(b): a/a | As Proportion of Next Generation: A/a | As Proportion of Next Generation: a/a
4 A/a | 1/16 | 100 | 400 | - | 1/16 | -
3 A/a, 1 a/a | 4/16 | 400 | 1200 | 400 | 3/16 | 1/16
2 A/a, 2 a/a | 6/16 | 600 | 1200 | 1200 | 3/16 | 3/16
1 A/a, 3 a/a | 4/16 | 400 | 400 | 1200 | 1/16 | 3/16
4 a/a | 1/16 | 100 | - | 400 | - | 1/16
Total | 16/16 | 1600 | 3200 | 3200 | 8/16 = 1/2 | 8/16 = 1/2

(a) Calculated in the text.
(b) Calculated on the basis of 1600 families.
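The family-by-family arithmetic above follows one pattern: the number of birth orders times the probability of any single order. The following Python sketch is offered only as an illustration under that assumption; the function name `sibship_distribution` and its parameters are hypothetical, not from the text. It reproduces the probability columns of Tables 8-1 and 8-2 and checks Meanings 2 and 3 numerically.

```python
from fractions import Fraction
from math import comb


def sibship_distribution(n_children, p_first_type):
    """Map k -> probability that exactly k of n_children are of the first type.

    comb(n, k) counts the alternative birth orders (addition rule); each order
    has probability p**k * q**(n - k) (multiplication rule).
    """
    q = 1 - p_first_type
    return {
        k: comb(n_children, k) * p_first_type**k * q ** (n_children - k)
        for k in range(n_children + 1)
    }


def pooled_proportion(distribution, n_children):
    """Expected share of first-type children when all sibships are pooled."""
    return sum(k * p for k, p in distribution.items()) / n_children


# A/a x a/a: each child is A/a with probability 1/2 (Table 8-1).
backcross = sibship_distribution(4, Fraction(1, 2))
# A/a x A/a: each child is normal (A/-) with probability 3/4 (Table 8-2).
intercross = sibship_distribution(4, Fraction(3, 4))

# Meaning 2: the sibship matching the ratio is the single most probable one.
modal_backcross = max(backcross, key=backcross.get)     # number of A/a children
modal_intercross = max(intercross, key=intercross.get)  # number of normal children

# Meaning 3: pooled over all sibships, the genetic ratio is recovered.
print(backcross[2], 1 - backcross[2], modal_backcross, modal_intercross)
print(pooled_proportion(backcross, 4), pooled_proportion(intercross, 4))
```

Multiplying any of these probabilities by the sample size (1600 or 6400 families) gives the expected number of families of that kind in the tables.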
    • The remaining type of family, half normal and half alkaptonuric, may take any of six forms (like the six birth orders listed above for the A/a × a/a mating), each with a probability of (3/4)^2 × (1/4)^2, or 9/256, so the total is 54/256.

Note that our "expected" family, three normal:one alkaptonuric, is again the family with the highest single probability, in conformity with Meaning 2.

Table 8-2 Expected progeny in a random sample of four-children families where the parents are heterozygous for a recessive gene a

Kind of Family | Probability(a) | Number of Families(b) | Progeny in Numbers(b): A/-(c) | Progeny in Numbers(b): a/a | As Proportion of Next Generation: A/- | As Proportion of Next Generation: a/a
4 A/- | 81/256 | 2025 | 8100 | - | 81/256 | -
3 A/-, 1 a/a | 108/256 | 2700 | 8100 | 2700 | 81/256 | 27/256
2 A/-, 2 a/a | 54/256 | 1350 | 2700 | 2700 | 27/256 | 27/256
1 A/-, 3 a/a | 12/256 | 300 | 300 | 900 | 3/256 | 9/256
4 a/a | 1/256 | 25 | - | 100 | - | 1/256
Total | 256/256 | 6400 | 19200 | 6400 | 192/256 = 3/4 | 64/256 = 1/4

(a) Calculated in the text.
(b) Calculated on the basis of 6400 families.
(c) A/- means that the genotype is either A/A or A/a.

Again we calculate the progeny in Table 8-2 in two ways: by counting a specified number of families and by proportions. In either case the expected progeny from the random sample conform to the genetic ratio: 3/4 normal and 1/4 alkaptonuric.

In actual practice the expected results will rarely be attained exactly. The likelihood of finding precisely 25 families out of 6400 with all four members alkaptonurics, for example, is very small. Had there been 24 or 26 such families, the exact "expected" results would not have been attained. Similarly, if the number of families counted were not a multiple of 256, these exact results could not be expected. This type of deviation from the expected result is referred to as "sampling error." It is considered due to chance, since it results from the accidents of sampling and not from any change in our biological model or any purposeful deviation from random sampling procedures.

Several statistical methods are available to calculate how large a deviation must be before it may be considered significant, that is, before we suspect that perhaps our expectation is based on the wrong biological model or that random sampling procedures were not followed in obtaining the data.

It suffices at this point to note (1) that random sampling errors tend to cancel one another and (2) that the sampling error tends to vary inversely with the sample size: as the sample size becomes larger the chance deviations from expected tend to become less important.

METHODS OF ASCERTAINMENT

The method of collecting data in Tables 8-1 and 8-2 is generally referred to as complete ascertainment (or complete selection). In complete ascertainment the sibships to be included (selected) are ascertained through the parents or, for X-linked traits, the grandparents. It is "complete" because an attempt is made to study all the members of the sample. When all the genotypes can be differentiated easily, almost all collections of families constitute completely ascertained samples. Thus the data in Table 1-3 for relatively rare dominants are pools of completely ascertained families, and, as we noted there, fit the Mendelian ratios very well.

A problem arises in attempting complete ascertainment for a recessive trait: we are unable to recognize A/a × A/a matings when they happen to have all normal offspring. The technical term for this error is truncate selection, since what we are really doing in such a case is cutting off a portion of the sample by selecting only those families with at least one affected child for our data. The older term "complete selection," that is, of affected individuals, was obviously unsatisfactory, but a combination of the two, complete truncate selection, may be best. Note, however, that the word "selection" is used here in a very different sense from the Darwinian "selection" of Chapter 11.

One has to be very careful in reading the literature to note whether truncate selection was the method of collecting the data even though another criterion may be suggested by the title of the table or the article. For example, an ophthalmologist was testing the hypothesis that in one series of families retinoblastoma was inherited as a dominant. Many affected persons now survive through early recognition of the disease and surgical removal of the affected eye. In the study in question, data were collected concerning the proportion of retinoblastoma among the children of such survivors. Ostensibly this should involve no problem of ascertainment since the retinoblastoma parent in each case was heterozygous and one parent was normal, leading to an expected ratio of 1 normal:1 retinoblastoma among the progeny just as for the similar dominants in Table 1-3. In the data presented, the ratio was 20 normal:63 retinoblastoma, an apparent 3:1 ratio in favor of the dominant

Table 8-3 Expected progeny in truncate sample of four-children families of which the parents are both heterozygous for a recessive gene a

Kind of Family | Proportions of All A/a × A/a Families* | Proportions of Observed Families | Progeny in the Observed Families: A/- | Progeny in the Observed Families: a/a
3 A/-, 1 a/a | 108/256 | 108/175 | 81/175 | 27/175
2 A/-, 2 a/a | 54/256 | 54/175 | 27/175 | 27/175
1 A/-, 3 a/a | 12/256 | 12/175 | 3/175 | 9/175
4 a/a | 1/256 | 1/175 | - | 1/175
Total | 175/256 | 175/175 | 111/175 (63.4%) | 64/175 (36.6%)

*See Table 8-2.
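The effect of truncate selection in Table 8-3 can be reproduced by discarding the all-normal sibships and rescaling what remains. This is a hedged sketch in Python, assuming four-child sibships and a 1/4 chance of an affected child at each birth as in the text; the variable names (`all_families`, `observed`, and so on) are illustrative and not the book's.

```python
from fractions import Fraction
from math import comb

n_children = 4
p_affected = Fraction(1, 4)    # a/a child from an A/a x A/a mating
p_normal = 1 - p_affected

# Probability of exactly k affected children among all A/a x A/a families.
all_families = {
    k: comb(n_children, k) * p_affected**k * p_normal ** (n_children - k)
    for k in range(n_children + 1)
}

# Truncate selection: families with no affected child are never recognized.
p_observed = 1 - all_families[0]
observed = {k: p / p_observed for k, p in all_families.items() if k >= 1}

# Share of affected and normal children among the observed families only.
affected_share = sum(k * p for k, p in observed.items()) / n_children
normal_share = 1 - affected_share

print(p_observed, observed[1], affected_share, normal_share)
print(round(float(affected_share) * 100, 1), round(float(normal_share) * 100, 1))
```

The affected share rises from the Mendelian 1/4 to 64/175 solely because the 81/256 of families with four normal children drop out of the sample, not because the biological model has changed.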
    • 186  TEXTBOOK OF HUMAN GENETICS / CHANCE AND THE DISTRIBUTION OF FAMILIES  187

Table 8-2  Expected progeny in a random sample of four-children families where the parents are heterozygous for a recessive gene a

                                                 Expected Progeny
Kind of          Probability(a)  Number of       In Numbers           As Proportions of Next Generation
Family                           Families(b)     A/-(c)     a/a       A/-         a/a
4 A/-             81/256         2025             8100        0        81/256      0
3 A/-, 1 a/a     108/256         2700             8100     2700        81/256     27/256
2 A/-, 2 a/a      54/256         1350             2700     2700        27/256     27/256
1 A/-, 3 a/a      12/256          300              300      900         3/256      9/256
4 a/a              1/256           25                0      100         0          1/256
Total            256/256         6400           19,200     6400       192/256     64/256
                                                                        3/4        1/4
(a) Calculated in the text.
(b) Calculated on the basis of 6400 families.
(c) A/- means that the genotype is either A/A or A/a.

The remaining type of family, two normal and two alkaptonuric, may take any of six forms (as on page 184), each with probability of (3/4)^2 (1/4)^2, or 9/256, so the total is 54/256.

Note that our "expected" family, three normal:one alkaptonuric, is again the one with the highest single probability, in conformity with Meaning 2.

Again we calculate the progeny in Table 8-2 in two ways: by counting a specified number of families and by proportions. In either case the expected progeny from the random sample conform to the genetic ratio: 3/4 normal and 1/4 alkaptonuric.

In actual practice, the expected results will rarely be attained exactly. The likelihood of finding precisely 25 families out of 6400 with all four members alkaptonuric, for example, is very small. Had there been 24 or 26 such families, the exact "expected" results would not have been attained. Similarly, if the number of families counted were not a multiple of 256, these exact results could not be expected. This type of deviation from the expected result is referred to as "sampling error." It is considered due to chance, since it results from the accidents of sampling and not from any change in our biological model or any purposeful deviation from random sampling procedures.

Several statistical methods are available to calculate how large a deviation must be before it may be considered significant, that is, before we suspect that perhaps our expectation is based on the wrong biological model or that random sampling procedures were not followed in obtaining the data.

It suffices at this point to note (1) that random sampling errors tend to cancel each other; and (2) that their relative magnitude tends to vary inversely with the sample size: as the sample size becomes larger, the chance deviations from expected tend to become less important.

METHODS OF ASCERTAINMENT

The method of collecting data in Tables 8-1 and 8-2 is generally referred to as complete ascertainment (or complete selection). In complete ascertainment the sibships to be included (selected) are ascertained through the parents or, for X-linked traits, the grandparents. It is complete because an attempt is made to study all members of the sample. When all the genotypes can be differentiated easily, almost all collections of families constitute completely ascertained samples. Thus the data in Table 1-3 for relatively rare dominants are pools of completely ascertained families and, as we noted there, fit the Mendelian ratios very well.

A problem arises in attempting complete ascertainment for a recessive trait: we are unable to recognize A/a X A/a matings when they have only normal offspring. The technical term for this error is truncate selection, since what we are really doing in such a case is cutting off a portion of the sample by selecting only those families with at least one affected child for our data. The older term, "selection," that is, of affected individuals, was obviously unsatisfactory, but a combination of the two, complete truncate selection, may be used. Note, however, that the word "selection" is used here in a very different sense from the Darwinian "selection" of Chapter 11.

One has to be very careful in reading the literature to note whether truncate selection was the method of collecting the data even though another criterion may be suggested by the title of the table or the article. For example, an ophthalmologist was testing the hypothesis that in one series of families retinoblastoma was inherited as a dominant. Many affected persons now survive through early recognition of the disease and surgical removal of the affected eye. In the study in question, data were collected concerning the proportion of retinoblastoma among the children of such survivors. Ostensibly this should involve no problem of ascertainment, since the retinoblastoma parent in each case was heterozygous and one parent was normal, leading to an expected ratio of 1 normal:1 retinoblastoma among the progeny, just as for the similar dominants in Table 1-3. In the data presented, the ratio was 20 normal:63 retinoblastoma, an apparent 3:1 ratio in favor of the dominant

Table 8-3  Expected progeny in truncate sample of four-children families of which the parents are both heterozygous for a recessive gene a

Kind of          Proportions of all     Proportions of the    Progeny in the Observed Families
Family           A/a X A/a Families*    Observed Families     A/-          a/a
3 A/-, 1 a/a     108/256                108/175                81/175      27/175
2 A/-, 2 a/a      54/256                 54/175                27/175      27/175
1 A/-, 3 a/a      12/256                 12/175                 3/175       9/175
4 a/a              1/256                  1/175                 0           1/175
Total            175/256                175/175               111/175      64/175
                                                               63.4%       36.6%
*See Table 8-2.
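The arithmetic behind Table 8-3 can be checked by dropping the families with no affected child and rescaling what is left. The following is a minimal sketch under the text's assumptions (independent births, every family with at least one affected child equally likely to be counted); the function name `truncate_sample` is illustrative and not from the text.

```python
from fractions import Fraction
from math import comb


def truncate_sample(n, q):
    """Expected make-up of sibships of size n that contain at least one affected child.

    q is the a priori probability that a child is affected (pass a Fraction).
    Returns (observed family proportions keyed by number affected,
             proportion of affected among the progeny of those families).
    """
    p = 1 - q
    # Probability of each family class in the whole population (k = affected children).
    all_families = {k: comb(n, k) * q**k * p**(n - k) for k in range(n + 1)}
    # Families with no affected child are never recognized, so they are cut off.
    ascertained = 1 - all_families[0]
    observed = {k: all_families[k] / ascertained for k in range(1, n + 1)}
    # Each observed family class contributes k/n affected progeny.
    affected = sum(prob * Fraction(k, n) for k, prob in observed.items())
    return observed, affected


observed, affected = truncate_sample(4, Fraction(1, 4))
print(observed[1], affected, float(affected))
```

With n = 4 and q = 1/4 the observed proportions should match the third column of Table 8-3, and the affected fraction should match its Total row.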
    • 184  TEXTBOOK OF HUMAN GENETICS / CHANCE AND THE DISTRIBUTION OF FAMILIES  185

Consider the parents with four children. All four births may result in A/a progeny. The likelihood of this is

1/2 X 1/2 X 1/2 X 1/2 = (1/2)^4 = 1/16.

Having three A/a and one a/a children in a family of four may occur in four different ways, since the a/a child could be the first-born, the second-, the third-, or the fourth-born. These different families are:

                          Probability
a/a, A/a, A/a, A/a        1/16
A/a, a/a, A/a, A/a        1/16
A/a, A/a, a/a, A/a        1/16
A/a, A/a, A/a, a/a        1/16
(3 A/a, 1 a/a)            4/16 = 1/4

Each birth order has a probability of 1/16. Since we are interested only in the probability of obtaining 3 A/a and 1 a/a, irrespective of the birth order, the probability is the sum of the probabilities of the alternative orders, or 4/16.

Families of 2 A/a and 2 a/a children may be achieved by any of six birth orders:

A/a  A/a  a/a  a/a        1/16
A/a  a/a  A/a  a/a        1/16
A/a  a/a  a/a  A/a        1/16
a/a  A/a  A/a  a/a        1/16
a/a  A/a  a/a  A/a        1/16
a/a  a/a  A/a  A/a        1/16

Again each birth order has a probability of 1/16, so the probability of A/a X a/a parents having two A/a and two a/a children is 6/16. By argument similar to the case of three A/a and one a/a, having one A/a and three a/a has a probability of 4/16, and the probability of four a/a is identical to the probability of four A/a, 1/16.

The most remarkable result of this analysis is that the family predicted by the genetic ratio, two A/a and two a/a, has less probability of being achieved (6/16) than of not being achieved (10/16). This will be true for any family size greater than two. Of the five kinds of families with four children possible, however, this family does have the highest single probability. Thus, the genetic ratio does not tell us the likelihood of attaining such a ratio in a sibship of given size. It does tell us, though, that the sibship indicated by the genetic ratio is the one we are most likely to encounter. Therefore, when we say that the cross A/a X a/a is "expected" to produce a family of half A/a and half a/a, we mean this family is the "most expected" one.*

Meaning 3: If a large number of couples with the same genotypes are counted, the totals are expected to equal the genetic ratio, provided that all the various sibships can be ascertained in their expected proportions. The proviso simply means the genetic ratio is expected only if the sampling procedures are unbiased. Examples of the most likely forms of biased sampling will be described later.

Consider again the parents A/a and a/a having four children each. The results expected in a random sample of the five kinds of families are given in Table 8-1. The progeny are calculated in two ways: (1) actual numbers expected in a typical sample of 1600 families and (2) fractional proportions of the total sample. Either way the calculations are made, the table shows the expected result is the same as the genetic ratio: half A/a and half a/a.

A similar conclusion is demonstrated by Table 8-2 concerning the four-children families expected from a cross of two normal heterozygotes for a recessive trait, for example, alkaptonuria. Again there are five kinds of families, but these have different probabilities than in the previous cross.

Here the classical expected ratio is 3/4 normal (A/-) : 1/4 alkaptonuric (a/a). By our first meaning of the ratio, this means that at each birth there is a probability of three-fourths that the child will be normal, one-fourth that he will be an alkaptonuric. Hence the probability that all four children will turn out to be normal is (3/4)^4, or 81/256, a very substantial likelihood; that all four will be abnormal, (1/4)^4, or only 1/256.

Having three normal and one alkaptonuric can occur in four ways:

n n n a        (3/4)^3 (1/4) = 27/256
n n a n        27/256
n a n n        27/256
a n n n        27/256

Each birth order has probability of (3/4)^3 (1/4), or 27/256, so the probability that any one of the four alternatives will be achieved is 4(27/256), or 108/256. Obtaining one normal and three alkaptonurics can also occur in four alternate birth orders, but each has a probability of (3/4)(1/4)^3, or 3/256, for a total of 12/256.

*If the theory leads to an expectation, say 1 1/2 : 1 1/2 in a family of 3, the most expected families are the real ones which are closest (2:1 and 1:2 in the example).

Table 8-1  Expected progeny in a random sample of four-children families where one parent is A/a and the other a/a, locus A being autosomal

                                                 Expected Progeny
Kind of          Probability(a)  Number of       In Numbers           As Proportions of Next Generation
Family                           Families(b)     A/a        a/a       A/a         a/a
4 A/a             1/16            100             400         0        1/16        0
3 A/a, 1 a/a      4/16            400            1200       400        3/16        1/16
2 A/a, 2 a/a      6/16            600            1200      1200        3/16        3/16
1 A/a, 3 a/a      4/16            400             400      1200        1/16        3/16
4 a/a             1/16            100               0       400        0           1/16
Total            16/16           1600            3200      3200        8/16        8/16
                                                                        1/2         1/2
(a) Calculated in the text.
(b) Calculated on the basis of 1600 families.
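Counting birth orders as above is the binomial distribution written out by hand. As a minimal sketch (the helper name `family_distribution` is an illustrative choice, and independent births are assumed, as in the text), the probabilities of the five kinds of four-child families can be generated for both crosses:

```python
from fractions import Fraction
from math import comb


def family_distribution(n, q):
    """Probability of each count k of recessive (a/a) children in a sibship of n.

    q is the chance that any one child is a/a; births are assumed independent.
    """
    p = 1 - q
    # comb(n, k) counts the birth orders that give exactly k a/a children.
    return {k: comb(n, k) * q**k * p**(n - k) for k in range(n + 1)}


backcross = family_distribution(4, Fraction(1, 2))   # A/a X a/a, as in Table 8-1
intercross = family_distribution(4, Fraction(1, 4))  # A/a X A/a, as in Table 8-2

for k in range(5):
    print(f"{k} a/a: backcross {backcross[k]}, intercross {intercross[k]}")
```

`Fraction` prints values in lowest terms, so 6/16 appears as 3/8 and 108/256 as 27/64; the values are the same as those tabulated.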
    • 188  TEXTBOOK OF HUMAN GENETICS / CHANCE AND THE DISTRIBUTION OF FAMILIES  189

trait. The difference from expected is highly significant. It turned out that the sibships were almost always identified through a patient with retinoblastoma and then finding that he had had a parent who had survived the condition. Thus the ascertainment was by way of the affected offspring rather than through the surviving parent. Sibships with no affected individuals were not being counted at all, a clear case of truncate selection for a dominant trait. To set the record straight, a few of the families of Table 1-3 were originally ascertained through an affected child. For this table, however, sibships from these families were collected on the basis of an affected parent, so that sibships having no affected children were as likely to be counted as sibships containing one or more affected.

For the effect of truncate selection on a recessive trait, let us reexamine the mating of two heterozygotes, A/a, studied in Table 8-2. Since both parents are normal, how do we know they are heterozygotes so that their families should be counted in our sample? Usually our only clue has come from the presence of at least one affected child among their progeny. In other words, the families with all four children normal, being indistinguishable from the rest of the normal population, are not counted at all!

Table 8-3 shows the effect this has on the observed ratio even if the ascertained families are obtained without bias, that is, families with four affected children are no more likely to be counted than families with only one affected child. Clearly, the 3:1 ratio should not be expected when data are collected in this way.

Very soon after Mendel's laws were rediscovered in 1900 many geneticists realized that truncate selection posed a major obstacle to attempts to determine whether human traits followed the Mendelian ratios. If a trait was suspected to be dominant, sampling bias could be minimized by increasing the size of the sample and taking as great pains to find the families with only one or two affected members as the families with many affected. However, even the most assiduous and painstaking effort could not avoid truncate selection for a suspected recessive trait. Often there was simply no way to identify the heterozygotes who had no affected children.

CORRECTIONS FOR TRUNCATE COMPLETE SELECTION

Fortunately for the history of human genetics, geneticists and mathematicians quickly developed methods to correct for this vexing problem. Three methods in particular are widely used. The first was suggested by a German physician, W. Weinberg, in 1912, only 12 years after the rediscovery of Mendel's work. He noted that the complete ascertainment sample was truncate because of a bias in favor of the affected progeny. These were the individuals that called attention to the sibship, that is, they were the probands or propositi (singular, propositus). Another synonym is "index case." The normal progeny, on the other hand, could not identify the family as belonging in the data; hence, they were deficient in the final totals. To correct this imbalance he suggested that the probands be removed, and the count among the siblings of the probands should show the correct ratio. This correction is referred to as Weinberg's proband method; unfortunately, the latter term has also been used for other methods, so that the consensus nowadays is to call it the "direct sib" method.

An illustration of how it works is given in Table 8-4 for families of Table 8-3.

It should be noted that the method involves eliminating the proband, but that each of the affected progeny must be considered a potential proband. Thus, each of the four affected (probands) in the last row of Table 8-4 has three affected siblings, or a total of 12, that must be counted. Similarly, each of the three probands in the families with three affected has one normal and two affected sibs, or a total of three normal and six affected per family; since there are 12 such families per 256 in the population, the progeny counted are 72 affected:36 normal, etc.

Calculation of the direct sib correction is simplified by handling the data as in Table 8-5. Usually the data contain a mixture of families of different sizes. For example, Lundin and Olow studied an early onset form of polycystic kidneys. The disease is present at birth and the infant rarely survives more than a few hours. They found nine cases in six families. Although the ratio of normal:affected children in these families was 12:9, Lundin and Olow concluded from the direct sib correction that the data suggest a 3:1 ratio (Table 8-6).

The weakness of the direct sib method lies in the difficulty of obtaining data with which to use it. Generally, the literature does not describe the numbers of affected:normal per sibship as was provided by Lundin and Olow; rather, the total data in a collection of sibships of given size is given. Breaking down the data into expected random numbers of different types is not only laborious but also susceptible to error. Furthermore, the method wastes considerable collected data if the bulk of it derives from sibships of three children or less. All one-sib families must be thrown out. Similarly, in a random sample of A/a X A/a marriages, only 8 items (2 affected:6 normal) can be salvaged from the 14 progeny of each 7 two-sibling families, and 96 from the 111 progeny of 37 three-sibling families. Note that the 6 children in the 3 two-child families in Table 8-6 contribute only 3 items of data. These considerations hamper calculations of statistical parameters to test the random sampling error of the data and the validity of pooling data from various sources.

The direct sib method does have the virtue of pointing out its own unbiased ratio of affected:normal in the progeny. This is not true in the second method of correcting complete ascertainment data: the so-called "direct a priori" method.

Table 8-4  Analysis of the data in Table 8-3 by the direct sib method (Weinberg's general proband method)

Affected     Expected        "Observed"    Probands     Sibs per Proband   Total Sibs of Probands   Corrected Progeny
per Family   Proportion(a)   Proportion    per Family   Affected:Normal    Affected:Normal          Affected:Normal
0             81               0           0            0:0                 0:0                       0:0
1            108             108           1            0:3                 0:3                       0:324
2             54              54           2            1:2                 2:4                     108:216
3             12              12           3            2:1                 6:3                      72:36
4              1               1           4            3:0                12:0                      12:0
Total                                                                                               192:576
Ratio                                                                                                 1:3
(a) Per 256 four-children families of A/a X A/a marriages in the population.
    • 190  TEXTBOOK OF HUMAN GENETICS / CHANCE AND THE DISTRIBUTION OF FAMILIES  191

Table 8-5  Suggested calculation of the direct sib method. Illustration is for a theoretical truncate distribution of four-child families from A/a X A/a

Family   Number of           Number of         Number of      Corrected Progeny
Type     Unaffected/Family   Affected/Family   Families       Normal      Affected
         (U)                 (A)               (N)            UAN         A(A-1)N
1        3                   1                 108            324           0
2        2                   2                  54            216         108
3        1                   3                  12             36          72
4        0                   4                   1              0          12
Total                                                         576         192
Ratio                                                           3           1

The a priori method derives from the observation that a certain amount of data is missing (truncated) from the sample. How much is missing depends on the assumed ratio that the collected data were expected to fit. For example, in four-children sibships, the data missing consist of the families with all four normal. If the expected ratio is 3:1, this class has a probability of (3/4)^4, or 81/256 of the families; if, however, the expected ratio were 1:1, the missing group would be only (1/2)^4, or 1/16 of the families. If the actual total observed is increased by the missing fraction, then the number of affected divided by the "true" total should equal the correct ratio.

In general, if

p = probability of normal
and q = probability of affected,
and n = the number in the sibship,

the proportion of data missing = p^n, and the proportion of data present = 1 - p^n. This is the proportion of a true total, T. If the actual total data = T_a, an estimate of T, T_e, may be obtained by equating

T_a = (1 - p^n) T_e, or T_e = T_a / (1 - p^n).

If this were the correct total, the proportion of affected in the total data would be:

q = number affected / T_e

This may be compared with the theoretical probability of affected, q.

This method may be illustrated by an actual count of nine children with Friedreich's ataxia among eight sibships of two each from normal parents. If Friedreich's ataxia is inherited as a single locus recessive,

p = 3/4
q = 1/4
p^n = (3/4)^2 = 9/16
1 - p^n = 1 - 9/16 = 7/16
T_e = (8)(2) / (7/16) = 256/7 = 36.6
q = 9/36.6 = 0.246

Since the a priori q = 0.250, this is a very close fit.

To extend the method for a series of families of different size, it is necessary merely to calculate a T_e for each size. Adding these gives an overall T_e. The total affected divided by this sum of the T_e constitutes the calculated q to be compared with the a priori q from the hypothesis.

The data of Table 8-6 may be used to illustrate the method (Table 8-7). Again q is very close to the theoretical q and thus strengthens the hypothesis that this form of polycystic kidneys is determined by the recessive allele of a single locus. Calculating the divisor, 1 - p^n, each time is laborious. Table 10-6 gives these

Table 8-6  Direct sib correction in six families of spongy type polycystic kidneys of early onset. (Data from Lundin and Olow, 1961)

Family   Unaffected      Affected        Number of
Type     Per Family      Per Family      Families
         (U)             (A)             (N)            UAN        A(A-1)N
1        1               1               3                3         0
2        2               2               2                8         4
3        5               2               1               10         2
Totals                                   6               21         6
Ratio                                                     0.778     0.222
Expected ratio                                            0.75      0.25
Expected numbers                                         20.25      6.75
chi-square = 0.106; P > 0.70

Table 8-7  The direct a priori method used to correct the truncate data of Table 8-6

Size of     Number of     Divisor
Sibship     Sibships      (1 - p^n)        T_e                        Expected
(A)         (B)           (C)              (A x B / C)    Affected    T_e/4       chi-square
2           3             7/16             13.7           3           3.4         0.05
4           2             175/256          11.7           4           2.9         0.73
7           1             14197/16384       8.1           2           2.02        0.00
Total                                      33.5           9           8.3(a)      0.78(b)

q = 9 / (sum of T_e) = 9/33.5 = 0.268          Expected q = 0.25

(a) chi-square for difference between 8.3 and 9 = 0.05 with 1 d.f.; P > 0.80.
(b) chi-square with 3 degrees of freedom; P > 0.80.
    • values to three decimal places when p = 1/2 or p = 3/4 and family size is 12 or less.

      Before leaving the direct a priori method, it might be well to note a slightly different manner of checking whether a truncate set of data fits the hypothesis. Table 8-3 states that the proportion of affected (a/a) in a random but truncate sample of four-child sibships is

      $q_{tr} = \frac{64}{175} = 0.3657.$

      We can refer to this as the expected truncate q, q_tr. This means that if there has been complete ascertainment but we know the 4-child families in the sample do not include the ones with 0 affected, we expect 0.3657 affected and not one-fourth. On average the families that are in the sample should contain 4(0.3657) = 1.463 affected instead of 1, which is 4(0.25). Instead of calculating the expected truncate q as in Table 8-3, it can be calculated directly by noting that

      $q_{tr} = \frac{\text{a priori } q}{1 - p^{n}}$

      where n is the size of the sibship. Thus when the a priori q = 1/4 and sibship size is 3,

      $q_{tr} = \frac{1/4}{1 - (3/4)^{3}} = \frac{1/4}{37/64} = \frac{16}{37} = 0.4324,$

      so that the average expected (a/a) = 3(0.4324) = 1.297/family.

      When n becomes larger than 4, the arithmetic becomes laborious, so the divisors in Table 8-8 can be used in the denominator. The expected truncate q and the average number of affected are shown in Table 8-9 for sibships with 1 to 16 members.

      Table 8-8 Divisors, 1 - p^n, for the direct a priori correction when q = 1/2 or q = 1/4

      Sibship          q = 1/2                q = 1/4
      Size (n)     p^n       1 - p^n      p^n       1 - p^n
          1       0.500      0.500       0.750      0.250
          2       0.250      0.750       0.5625     0.4375
          3       0.125      0.875       0.422      0.578
          4       0.0625     0.9375      0.316      0.684
          5       0.031      0.969       0.237      0.763
          6       0.016      0.984       0.178      0.822
          7       0.008      0.992       0.133      0.867
          8       0.004      0.996       0.100      0.900
          9       0.002      0.998       0.075      0.925
         10       0.001      0.999       0.056      0.944
         11      <0.001     ~1.000       0.042      0.958
         12                              0.032      0.968
         13                              0.024      0.976
         14                              0.018      0.982
         15                              0.013      0.987
         16                              0.010      0.990

      Note that q_tr approaches the a priori q as the family size increases. This is exactly what is expected from the previously noted fact that the divisor, 1 - p^n, becomes almost indistinguishable from 1 as n grows larger. Hence, for families larger than 10, q_tr ≈ a priori q for the calculations. This would not be permissible if the a priori q were one-fourth and the data included a substantial number of large families. However, even if the data contained as many as five sibships of 10, the error would not be very great. Using q = 1/4 leads to 12.5 expected, whereas the correct expected number is 5(2.649) = 13.2, a difference of only 0.7.

      This form of the a priori method may be illustrated by data for the Australia lipoprotein antigen (Table 8-10). Note the similarity of observed and expected positives (homozygous recessives), as well as the excellent fit of q to the theoretical 25 percent.

      The example should caution us that such a good fit does not necessarily mean the hypothesis is correct. Subsequent research by Blumberg and others has demonstrated that the Australia antigen reflects infection by hepatitis virus. These data may therefore indicate an inherited differential susceptibility to such infection, but this has not been established with certainty.

      It is a bit disturbing that the two methods of correcting truncate complete ascertained samples in Tables 8-6 and 8-7 do not arrive at the same estimate of q from the data. This shows that they may be consistent in pointing to the underlying

      Table 8-9 Expected truncate q (q_tr) under complete ascertainment and average number of affected per sibship when a priori q = 1/2 or 1/4

      Sibship         A priori q = 1/2                    A priori q = 1/4
      Size (n)    Expected    Expected Number of      Expected    Expected Number of
                  q_tr        Affected/Sibship        q_tr        Affected/Sibship
          1       1.0000          1.000               1.0000          1.000
          2       0.6667          1.333               0.5714          1.143
          3       0.5714          1.714               0.4324          1.297
          4       0.5333          2.133               0.3657          1.463
          5       0.5161          2.581               0.3278          1.639
          6       0.5071          3.043               0.3041          1.825
          7       0.5039          3.527               0.2885          2.020
          8       0.5020          4.016               0.2778          2.222
          9       0.5010          4.509               0.2703          2.433
         10       0.5005          5.005               0.2649          2.649
         11       0.5002          5.502               0.2610          2.871
         12       0.5001          6.001               0.2582          3.098
         13       0.50006         6.5008              0.2561          3.329
         14                                           0.2545          3.563
         15                                           0.2539          3.808
         16                                           0.2525          4.040
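The truncate correction tabulated in Table 8-9 can be reproduced with a few lines of Python. This is only a minimal sketch of the formula q_tr = q / (1 - p^n) given in the text; the function names `truncate_q` and `expected_affected` are illustrative choices and do not come from the book.

```python
from fractions import Fraction


def truncate_q(q, n):
    """Expected proportion of affected in sibships of size n,
    given that sibships with 0 affected are missing from the sample."""
    p = 1 - q                  # probability that a child is unaffected
    return q / (1 - p ** n)    # divide by the chance of at least one affected


def expected_affected(q, n):
    """Average number of affected per ascertained sibship of size n."""
    return n * truncate_q(q, n)


if __name__ == "__main__":
    q = Fraction(1, 4)         # a priori q for two heterozygous parents
    for n in (3, 4, 10):
        q_tr = truncate_q(q, n)
        print(n, round(float(q_tr), 4), round(float(n * q_tr), 3))
```

For a priori q = 1/4 the sketch gives 16/37 for three-child sibships and 64/175 for four-child sibships, the same fractions worked out in the text. Exact fractions are used so that rounding happens only when the values are printed.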
    • Table 8-10 Correction of truncate data by Bernstein's form of the a priori method: data from Blumberg et al. (1966) on 24 sibships in which both parents were negative for the Australia lipoprotein antigen but had at least one positive child

      Size of    Number of    Observed Number of    Expected
      Sibship    Sibships     Positive Children     Number(a)
         1          8                8                8.000
         2          3                4                3.429
         3          3                3                3.891
         4          3                5                4.389
         5          2                3                3.278
         6          3                5                5.475
         7          1                1                2.020
         8          1                4                2.222
      Total                         33               32.704(b)

      χ², 1 d.f. = 0.004
      P > 0.90

      (a) Number of sibships multiplied by appropriate number from column 5 of Table 8-9.
      (b) It may be noted also that T, as per column 6, Table 8-7, is 4(32.704), so that q = 33/130.816.

      ratio but are not very efficient from the statistical point of view. Several methods that have been developed to obtain more efficient estimates of q use maximum likelihood mathematics that are beyond the scope of this book. Li and Mantel, however, have described a relatively simple method that seems to be as efficient as the maximum likelihood methods. It harks back to Weinberg's direct sib method. Instead of removing all probands, however, it concentrates on the sibships reporting but a single affected child, and it removes the number of these affected children (referred to as "singletons") from both the total affected and the total progeny in the reported families. It is not suitable for small amounts of data such as those of Table 8-6, but works well with large amounts. For example, one of the most extensive studies on albinism from normal parents found 864 albinos among 2,435 children in 411 sibships of two or more children. 171 sibships contained only singleton albinos. Hence,

      $\text{Est. } q = \frac{864 - 171}{2435 - 171} = \frac{693}{2264} = 0.306.$

      This compares very well with the maximum likelihood estimate for the same data, 0.308. The standard error by both methods comes out to be 0.011. The deviations from 0.25 are highly significant, presumably because of the concentration on "interesting" families with many albinos in the older literature.

      SINGLE SELECTION

      Even when truncate selection is recognized, proper correction of the error by the methods described above assumes that the portion of the total sibships that have been ascertained have been identified in a random manner, so that their relative proportion will correspond well with the expected frequencies (108:54:12:1, for example, in the four-children families of Table 8-2). In actuality, families with many affected are more likely to come to the attention of the physician or geneticist than families with only one affected. This is apt to be especially true if the data are collected from the cases reported in the literature. Similarly, a family with many affected may be counted more than once, particularly if the data are accumulated by encountering an affected person and asking him how many normal and affected siblings he has. If there is only one affected in a family, this family would be counted only once; if there are three affected, however, each of the three may be encountered independently by the investigators, in which case the family might be counted three times.

      To avoid some of these possible errors, but especially to avoid the onerous choice of examining a whole population and locating every sibship of the desired type, some geneticists have used an incomplete ascertainment method often called "single selection of affected individuals," or simply "single selection." The pertinent sibships are identified by studying all persons of a certain age, for example, or all children in a certain grade in school. No family could be counted more than once. (The rare exception, the family with one or more affected twins, could be identified and the data corrected accordingly.) Many collections of data based on affected first seen in hospital clinics fall into this category.

      While considerably more economical, single selection also requires to be corrected. In the single selection method, the relative number of families with different numbers of affected will depend not only on the relative proportions of these families in the population but also on the relative number of affected children in the sibship. To illustrate: if we try to ascertain by single selection the number of albinos in four-children sibships born to heterozygous parents, we might survey all the fifth-grade children in a given town to determine if any are albinos having three siblings and normal parents. According to Table 8-2, the expectation is that a sibship with three normal and one albino would have 108 times as much chance of being represented by a child in the fifth grade as a family with all four children albinos. However, the family with four albinos has four times as much chance that their representative in the fifth grade will be an albino child as any family with three normal and only one albino. Hence, only 27 families with 3 normal:1 albino will be counted for every family with 0 normal:4 albinos counted. The appropriate calculations are illustrated in Table 8-11.

      Single selection is automatically truncate, since the family with all normal offspring has no chance of being counted. This would be true even for situations that would not be truncate under complete selection. (For some conditions, single selection might be truncate at both ends. Thus, for a condition in which the affected child is not available, ascertainment might be made by asking each normal person in the sample whether he had at least one affected sibling.) Notwithstanding these
    • Table 8-11 Relative proportion of sibships counted under single selection, illustrated by sibships of four for a recessive condition with both parents heterozygous

      Number      True          Relative      Relative    Relative      Apparent Count
      Affected    Proportion    Frequency     Count       Proportion    Affected:Normal
                  (A)           (B)           A·B         in Sample
         0           81            0             0            0             0:0
         1          108            1           108           27            27:81
         2           54            2           108           27            54:54
         3           12            3            36            9            27:9
         4            1            4             4            1             4:0
      Totals        256                        256           64           112:144
      Ratio                                                             0.4375:0.5625

      apparent deficiencies, single selection has proved to be a very useful tool in genetic analyses. As with complete ascertainment, one need only be aware of the theoretical basis for expecting distortions of the true ratio in single selection studies to be able to understand the methods that have been developed to correct them.

      CORRECTION FOR TRUNCATE COMPLETE SELECTION

      Proband removal will also reveal the correct ratio in a sample collected by single selection. Since it is single selection, however, only one affected need be eliminated from each family.

      What we are doing in effect is converting our sample of four-children families, which is truncate, to a "complete ascertained" sample of three-children families. Justification for this procedure may be readily understood by inspection of Table 8-4. Note that the distribution of counted sibships is 27:27:9:1. These are exactly the coefficients of the expansion $\left(\tfrac{3}{4}\,A/{-} + \tfrac{1}{4}\,a/a\right)^{3}$, which is:

      $\tfrac{27}{64}(A/{-})^{3} + \tfrac{27}{64}(A/{-})^{2}(a/a) + \tfrac{9}{64}(A/{-})(a/a)^{2} + \tfrac{1}{64}(a/a)^{3}.$

      The conversion of the singly selected random sample of four-child families to a completely ascertained sample of three-child families by removing the single proband from each family looks like this:

                                           Before Correction    After Correction
      Number      Relative Frequency       Progeny Count        Progeny Count       Total
      of          in Singly Selected       per Sibship
      Affected    Sample                   (C)                  (D)                 (B·D)
      (A)         (B)                      Normal:Affected      Normal:Affected     Normal:Affected
         0            0                       4:0                  4:0
         1           27                       3:1                  3:0                81:0
         2           27                       2:2                  2:1                54:27
         3            9                       1:3                  1:2                 9:18
         4            1                       0:4                  0:3                 0:3
      Total                                                                          144:48
      Ratio                                                                            3:1

      Rather than removing the proband from each sibship separately, it may be done from the total data in one step, as illustrated below for a theoretical composite of three- and four-child families:

      Sibship   Number     True Proportion   Relative      Relative Frequency   Number of      Count
      Size      Affected   of Each Sibship   Proportion    under Single         Sibships     Normal   Affected
                           in Its Size       in 256        Selection            in Sample
         3         1           27/64            108              1                 108         216      108
         3         2            9/64             36              2                  72          72      144
         3         3            1/64              4              3                  12           0       36
         4         1          108/256           108              1                 108         324      108
         4         2           54/256            54              2                 108         216      216
         4         3           12/256            12              3                  36          36      108
         4         4            1/256             1              4                   4           0       16
      Total                                                                        448         864      736

      For single selection, number of probands = number of sibships. Hence,

      $q = \frac{736 - 448}{1600 - 448} = \frac{288}{1152} = \frac{1}{4}.$

      EMPIRIC PROBABILITY

      Our basic definition of probability assumed that once we have formulated a hypothesis about the system, mathematical or biological, under study, we knew the number of alternative events and the number that spell success or failure. Unfortunately, we are not always able to formulate so precise a hypothesis. To take a very important example from the field of insurance: we might ask what are the odds that a given person presently aged 20 will live till the age of 25? Clearly, we have no theoretical hypothesis to rely on in answering this question, which is so important in deciding how much premium he should pay for insurance. We feel certain the insured will die, but cannot be certain how long he will pay these premiums. Similarly, in genetics we are often faced with situations where inheritance seems definitely to play a part, but we cannot understand the exact mechanism responsible. In such cases we are unable to make probability statements based on the Mendelian ratios.

      Ignorance of the underlying mechanism need not be a complete bar to making some estimate of probability, however. The viability and robustness of the insurance industry all over the world sufficiently testify to that point. We must recognize, though, that the kinds of probability statement possible in these situations are not the same as the kind we have thus far been making in this chapter. Instead of being based on a theoretical scientific formulation, our probability statements rest
    • pragmatically on past experience as a guide. The assumption is that a given event will happen as often in the future as it did in the past. If, for example, in the past ten years 5 percent of men aged 20 died before the age of 25, we would say that the probability that any particular 20-year-old person will be alive on his twenty-fifth birthday is 95/100, etc.

      Probability based on past experience rather than theory is referred to as empiric or inductive probability. Some workers refer to such statements as "empiric risk figures" (Chapter 18) rather than probability, but they manipulate them in the same way as probabilities based on knowledge of the underlying mechanism. To continue the previous example, if we wished to calculate the probability that two persons aged 20 would both be alive at age 25, the answer would be $(95/100)^{2}$, or 90.25 percent. This type of calculation would be important for group insurance.

      Empiric probabilities have the disadvantage that they are only as good as the methods of collecting the data responsible for them. Obviously, they are subject to change whenever we collect more data. Despite this serious drawback we shall find them extremely useful in several areas of genetics. Even the application of statistical tests based on theoretical probability distributions we will find depends a great deal on empiric practical factors.

      POSTERIOR PROBABILITY

      Probability based on an analysis of equally likely events and what proportion of these would spell success has sometimes been called a priori or prior probability. On the other hand, probability whose approximate value is obtained from actual statistical data from repeated trials was called a posteriori or posterior probability. In recent years the term posterior probability has been expanded to include also situations where the prior probability should be modified by events subsequent to its formulation. This is the application of what statisticians refer to as Bayes' theorem.

      Rather than burden the student with another formula let us illustrate the thinking and arithmetic involved with an example. A woman has a brother affected by a recessive disorder, say, galactosemia. A priori, we say the probability is 2/3* that she is heterozygous for this trait.

      *She cannot be homozygous recessive, so the odds are 2:1 that she is heterozygous.

      If she were to marry a man with a similar history, the a priori probability that they would have an affected child is:

      $\frac{2}{3} \times \frac{2}{3} \times \frac{1}{4} = \frac{1}{9}.$

      If, however, they have three children, all free of galactosemia, what is the probability for their fourth child?

      To calculate this we must take into consideration that the three normal children make it less likely that they are both heterozygous. The probability that the two are heterozygotes and would produce three normal children and no affected child is:

      $\frac{2}{3} \times \frac{2}{3} \times \frac{3}{4} \times \frac{3}{4} \times \frac{3}{4} = \frac{3}{16}.$

      This is sometimes called a joint probability, because it is a combination of the prior probability that both parents are carriers and a conditional probability that if they are carriers they would have normal children. The corresponding joint probability based on the assumption that at least one of the parents is not a heterozygote is:

      $\left(1 - \frac{4}{9}\right) \times 1 = \frac{5}{9}.$

      Since these are now the only two possibilities, their sum becomes the denominator; that is, the posterior probability that both parents are heterozygous is:

      $\frac{3/16}{3/16 + 5/9} = 0.252.$

      Therefore, the posterior probability that the fourth child will be affected is only:

      $0.252 \times \frac{1}{4} = 0.063,$

      that is, about 1/16, as compared to 1/9 prior to the birth of the three normal children. The concept is clearly useful in genetic counseling.

      The reader should consider this chapter, as in the case of many others, merely an introduction to a very complex subject. It does not consider, for example, how to correct for data that have been collected by a mixture of the two methods referred to as "multiple" or "incomplete" selection. Here the method of choice is the maximum likelihood method of Bailey (1951) or the "segregation analysis" method of Morton (1958a, 1959, 1962). Those who are mathematically adept will find Morton's formulas to constitute a particularly comprehensive attack on the problems of ascertainment. They are, however, too complex for this book.

      APPENDIX

      TESTS FOR SIGNIFICANCE

      Over the years scientists have found that events having less than 5 percent probability of occurring by chance ought to be looked at with some suspicion. We should wonder in such an event whether chance was really responsible for the observed deviation from expected frequency. Perhaps our assumption as to the basic mechanism is wrong. Or perhaps the hypothesis was correct, but some bias factors were at work between the fundamental biological activity and our observations. Whatever the real cause of our aberrant result, we should not accept it as merely the vagary of chance, though that is all that it might be.
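The 5 percent rule of thumb can be illustrated with the albinism estimate quoted earlier in the chapter (0.306 with a standard error of 0.011, against a Mendelian expectation of 0.25). The sketch below assumes the estimate is approximately normally distributed, which is an assumption made here for illustration and not a method stated in the text; the function name `two_sided_p` is likewise hypothetical.

```python
import math


def two_sided_p(estimate, hypothesis, std_error):
    """Return (z, p) for a two-sided test under a normal approximation.

    z is the deviation measured in standard errors; p is the chance of a
    deviation at least that large in either direction if the hypothesis holds.
    """
    z = (estimate - hypothesis) / std_error
    p = math.erfc(abs(z) / math.sqrt(2))   # two-sided tail area of the normal curve
    return z, p


if __name__ == "__main__":
    z, p = two_sided_p(0.306, 0.25, 0.011)
    print(f"z = {z:.2f}, p = {p:.1e}")
    print("below the 5 percent level" if p < 0.05 else "not below the 5 percent level")
```

Under this approximation the albinism estimate lies about five standard errors from 0.25, far beyond the 5 percent level, which is consistent with the text calling the deviation highly significant.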
    • TEXTBOOK OF HUMAN GENETICS: CHANCE AND THE DISTRIBUTION OF FAMILIES

Table 8-11 Relative proportion of sibships counted under single selection, illustrated for sibships of four for a recessive condition with both parents heterozygous

Number      True Relative    Relative     Relative      Apparent Count
Affected    Proportion       Frequency    Proportion    Affected:Normal
(A)         (B)              A x B        in Sample
0            81                0            0             0:0
1           108              108           27            27:81
2            54              108           27            54:54
3            12               36            9            27:9
4             1                4            1             4:0
Totals      256                                         112:144
Ratio                                                   0.4375:0.5625

apparent deficiencies, single selection has proved to be a very useful tool in genetic analyses. As with complete ascertainment, one need only be aware of the theoretical basis for expecting distortions of the true ratio in single selection studies to be able to understand the methods that have been developed to correct them.

CORRECTION FOR TRUNCATE COMPLETE SELECTION

Proband removal will also reveal the correct ratio in a sample collected by single selection. Since it is single selection, however, only one affected need be eliminated from each family.

What we are doing in effect is converting our sample of four-children families, which is truncate, to a "completely ascertained" sample of three-children families. Justification for this procedure may be readily understood by inspection of Table 8-4. Note that the distribution of counted sibships is 27:27:9:1. These are exactly the coefficients of the expansion $(3/4\,A/- + 1/4\,a/a)^3$, which is:

\[ \tfrac{27}{64}(A/-)^3 + \tfrac{27}{64}(A/-)^2(a/a) + \tfrac{9}{64}(A/-)(a/a)^2 + \tfrac{1}{64}(a/a)^3. \]

The conversion of the singly selected random sample of four-child families to a completely ascertained sample of three-child families by removing the single proband from each family looks like this:

            Relative Frequency    Before Correction    After Correction
Number      in Singly Selected    Progeny Count        Progeny Count      Total
Affected    Sample                per Sibship          per Sibship        (C x D)
(A)         (C)                   Normal:Affected      (D)                Normal:Affected
                                                       Normal:Affected
0            0                    4:0                  4:0                  0:0
1           27                    3:1                  3:0                 81:0
2           27                    2:2                  2:1                 54:27
3            9                    1:3                  1:2                  9:18
4            1                    0:4                  0:3                  0:3
Total                                                                     144:48
Ratio                                                                       3:1

Rather than removing the proband from each sibship separately, it may be done from the total data in one step, as illustrated below for a theoretical composite of three- and four-child families:

                      True         Relative     Relative      Number of
Sibship   Number      Proportion   Frequency    Proportion    Sibships     Count     Count
Size      Affected    of Each      in 256       under Single  in Sample    Normal    Affected
                      Size                      Selection
3         1           27/64        108          1             108          216       108
3         2            9/64         36          2              72           72       144
3         3            1/64          4          3              12            0        36
4         1           108/256      108          1             108          324       108
4         2            54/256       54          2             108          216       216
4         3            12/256       12          3              36           36       108
4         4             1/256        1          4               4            0        16
Totals                                                        448          864       736

\[ q = \frac{\text{affected} - \text{probands}}{\text{total} - \text{probands}} \]

For single selection, number of probands = number of sibships. Hence,

\[ q = \frac{736 - 448}{1600 - 448} = \frac{288}{1152} = \frac{1}{4}. \]

EMPIRIC PROBABILITY

Our basic definition of probability assumed that once we have formulated a hypothesis about the system, mathematical or biological, under study, we knew the number of alternative events and the number that spell success or failure. Unfortunately, we are not always able to formulate so precise a hypothesis. To take a very important example from the field of insurance: we might ask what are the odds that a given person presently aged 20 will live till the age of 25? Clearly, we have no theoretical hypothesis to rely on in answering this question, which is so important in deciding how much premium he should pay for insurance. We feel certain the insured will die, but cannot be certain how long he will pay these premiums.

Similarly, in genetics we are often faced with situations where inheritance seems definitely to play a part, but we cannot understand the exact mechanism responsible. In such cases we are unable to make probability statements based on the Mendelian ratios.

Ignorance of the underlying mechanism need not be a complete bar to making some estimate of probability, however. The viability and robustness of the insurance industry all over the world sufficiently testify to that point. We must recognize, though, that the kinds of probability statement possible in these situations are not the same as the kind we have thus far been making in this chapter. Instead of being based on a theoretical scientific formulation, our probability statements
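The one-step proband-removal correction for single selection, described earlier on this page, can be checked with a short sketch. It is an illustration under stated assumptions rather than the authors' procedure: sibships are assumed to enter the sample in proportion to the number of affected children they contain, 256 families of each size are assumed (as in the composite table), and the function names are hypothetical.

```python
from fractions import Fraction
from math import comb


def singly_selected_sibships(size, q=Fraction(1, 4), families=256):
    """Return (number_affected, sibships_in_sample) pairs for one sibship size.

    Assumption: under single selection a sibship is sampled in proportion
    to the number of affected children it contains.
    """
    rows = []
    for k in range(1, size + 1):
        # Expected number of families of this size with exactly k affected children.
        true_count = families * comb(size, k) * q**k * (1 - q) ** (size - k)
        rows.append((k, true_count * k))
    return rows


def proband_removal_estimate(sizes):
    """Pool the sample over sibship sizes and remove one proband per sibship."""
    sibships = affected = total = 0
    for size in sizes:
        for k, n in singly_selected_sibships(size):
            sibships += n
            affected += n * k
            total += n * size
    probands = sibships  # single selection: one proband per sibship
    q_hat = (affected - probands) / (total - probands)
    return probands, affected, total, q_hat


probands, affected, total, q_hat = proband_removal_estimate([3, 4])
print(probands, affected, total, q_hat)  # prints: 448 736 1600 1/4
```

The pooled totals agree with the composite table (448 sibships, 736 affected among 1600 children), and removing one proband per sibship recovers the true proportion of 1/4.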
    • [Handwritten worked notes, largely illegible in this copy: questions on the probability that an individual is heterozygous, set out as prior probability, conditional probability, joint probability (ratio of products), and posterior probability.]
    • [Handwritten worked notes, largely illegible in this copy: a tabulation of families, affected individuals, probands, and sibs of probands, followed by an estimate of the proportion affected with its standard error and 95% confidence limits.]
    • Appendix A

Some Elements of Probability and Statistics

When a physicist makes measurements such as temperature, the observations are limited in accuracy only by the instruments used. If better instruments were available, more accurate observations could be made. Temperature is a measure of the mean kinetic energy of a population of molecules. Very few of the molecules actually possess energy equal to the mean, however, for the majority move faster or slower than the mean value. If one were to select ten molecules at random, they would very likely not include a molecule with the mean energy of the parent population. Furthermore, the mean of the ten molecules would almost certainly vary slightly from the population mean.

Physicists commonly deal with 10^[exponent illegible] molecules at a time rather than ten. The chance deviations that occur with ten molecules do not happen when observations are made on such immense numbers, and sampling error can therefore be safely ignored. Biologists, however, commonly make observations on small numbers where chance deviation can occur. They must recognize the possibility of chance deviation and estimate the reliability of their observations. Geneticists in particular have been faced with this problem. Mendel recognized the chance nature of gene transmission and the possibility of deviation from ideal ratios, although the statistical procedures for evaluating the significance of deviations were not available to him.

This appendix will be concerned with some of the basic laws of probability and how they are used to solve some genetic problems. Many of the statistical techniques commonly in use require more background than can be assumed or provided in this treatment. The material given here should provide an appreciation of the nature of statistical inference as well as means of solving many of the simpler problems encountered in genetics.

SOME BASIC CONCEPTS OF PROBABILITY

The Meaning of Probability Statements

A statement of probability is the likelihood of a "favorable" event among all possible events. Favorable as used in this context merely denotes the event or events of interest, whether or not they might be rated desirable by other criteria. Probability can vary from 0
    • to 1, 0 indicating no possibility of the favorable event and 1 indicating no alternate possibility, that is, the event is the only one possible. If P = 1/2, then half of the time a favorable event will occur.

Probability can be expressed symbolically as

\[ P(e_f) = \frac{\sum e_f}{\sum e_t}, \]

where $P(e_f)$ is the probability of a favorable event ($e_f$), $\sum e_f$ is the sum of all events that are favorable, and $\sum e_t$ is the sum of all events that are possible. The values $e_f$ and $e_t$ may be in any units that express the relative frequency of each event. Since the units appear both in numerator and denominator, they cancel out.

Consider a die with six sides, each equally likely to appear when the die is thrown. If we ask what the probability is that a three will be thrown, the answer is P(3) = 1/6. The relative frequency of any specific side is 1, and there are six sides. The total number of choices is expressed by 6 x 1 = 6. Therefore,

\[ P(3) = \frac{1}{6 \times 1} = \frac{1}{6}. \]

To calculate the probability of obtaining an even number, one must sum the favorable outcomes, in this case two, four, and six. Then,

\[ P(\text{even}) = \frac{3 \times 1}{6 \times 1} = \frac{1}{2}. \]

The various possibilities need not be equally likely. Suppose we place 100 marbles into a jar. Ten are yellow, 30 are red, 15 are white, 20 are black, and 25 are green. The probability of drawing a red marble is

\[ P(\text{red}) = \frac{30}{10 + 30 + 15 + 20 + 25} = 0.30. \]

If the black marbles are removed from the jar, the probability of drawing a red marble becomes

\[ P(\text{red}) = \frac{30}{10 + 30 + 15 + 25} = 0.375. \]

Thus, reducing the number of possible unfavorable events increases the likelihood of a favorable event.

Suppose that in the original series of 100 marbles we wish to know the probability of drawing either a white or black marble. In this case, two possible outcomes are favorable, and

\[ P(\text{black or white}) = \frac{15 + 20}{10 + 30 + 15 + 20 + 25} = 0.35. \]

The same answer is obtained by calculating the separate probabilities for black and white, followed by addition; for example, P(white) = 0.15, P(black) = 0.20, therefore P(black or white) = 0.35.

The last example illustrates the important principle that if there are two or more favorable and mutually exclusive events, the total probability can be obtained by summing the probabilities of the separate events.

Another important principle is the independence of chance events from previous trials. When a coin is flipped, the probability of a head or tail is 1/2 for each. If a given trial yields a head, the next flip still has equal chances for a head or tail. Even if a sequence of five heads is thrown, the sixth still has a 1/2 chance of being a head.

This fact permits the computation of probabilities for complex sequences and combinations of events. The probability of a sequence of events is simply the combined product of each event taken separately. The probability of throwing three heads is

\[ P(3H) = P(H) \cdot P(H) \cdot P(H) = \tfrac{1}{2} \cdot \tfrac{1}{2} \cdot \tfrac{1}{2} = \tfrac{1}{8}. \]

This procedure is the same whether one coin is tossed three times or three coins are tossed together. In either case, each component event contributes its probability.

The Binomial Distribution

Statements that the toss of a coin has two equally probable outcomes or that there are five colors of marbles, each with a certain likelihood of being drawn, are simple forms of probability distributions. It is useful to express probability distributions in mathematical form, both to aid in thinking more clearly and because certain operations can be applied to mathematical statements that would not be possible otherwise.

A type of problem frequently arises in genetics that can be readily handled by a binomial distribution of the type $(x + y)^n$. In the preceding example, the probability of three heads was shown to be 1/8. This is also the probability for three tails or for any specific sequence, such as a head followed by two tails. Often the interest is in the combination (in this case, one head and two tails) rather than in the order by which it was obtained. The same totals could have been obtained by two other sequences: tail, head, tail; and tail, tail, head. Each sequence has the probability 1/8, and since each is a favorable outcome, the total probability of two tails and a head can be obtained by adding them to get 3/8.

Such enumeration is simple in the example but may become complex for larger numbers. Fortunately, the binomial expansion permits calculation of the terms appropriate for any combination of events. The two probabilities associated with two alternate events (for example, heads versus tails) are designated p and q. The binomial is then expressed as

\[ (p + q)^n, \]

where n is the number of events in the combination. In the above example, n was 3, corresponding to three coins or three tosses. The expansion of $(p + q)^3$ is

\[ p^3 + 3p^2q + 3pq^2 + q^3. \]

For p = q = 1/2, the expansion becomes

\[ \tfrac{1}{8} + \tfrac{3}{8} + \tfrac{3}{8} + \tfrac{1}{8} = 1. \]
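The summation and product rules used on this page can be checked by direct counting. The following is a minimal sketch, assuming the marble counts given in the text and a fair coin; the helper name `prob` and the variable names are illustrative only.

```python
from fractions import Fraction
from itertools import product

# Marble counts from the jar example in the text.
marbles = {"yellow": 10, "red": 30, "white": 15, "black": 20, "green": 25}


def prob(favorable, counts):
    """Sum of the favorable counts divided by the sum of all counts."""
    total = sum(counts.values())
    return Fraction(sum(counts[color] for color in favorable), total)


p_red = prob(["red"], marbles)                          # 30/100
no_black = {c: n for c, n in marbles.items() if c != "black"}
p_red_no_black = prob(["red"], no_black)                # 30/80
p_black_or_white = prob(["black", "white"], marbles)    # (15 + 20)/100

# Product rule: every sequence of three fair tosses has probability (1/2)^3.
sequences = list(product("HT", repeat=3))
p_sequence = Fraction(1, 2) ** 3
# Summation rule: add the probabilities of the sequences that give each combination.
p_three_heads = p_sequence * sum(1 for s in sequences if s.count("H") == 3)
p_one_head_two_tails = p_sequence * sum(1 for s in sequences if s.count("H") == 1)

print(p_red, p_red_no_black, p_black_or_white)  # prints: 3/10 3/8 7/20
print(p_three_heads, p_one_head_two_tails)      # prints: 1/8 3/8
```

Counting the eight sequences reproduces the coefficients 1, 3, 3, 1 of the expansion of $(p + q)^3$, which is why the binomial expansion can replace enumeration for larger n.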
    • If p is the probability of a head, then the four terms are the probabilities of three heads, two heads and a tail, one head and two tails, and three tails.

The binomial expansion for any value of n can always be obtained by multiplying the quantity (p + q) by itself n times. This is not necessary, however. Each term in the expansion consists of two parts: the algebraic terms p and q with their exponents, and the coefficient. The exponents correspond to the events of probability p and q, respectively. For example, a combination of 10 heads and 20 tails would give $p^{10}q^{20}$. This expression, however, is the probability of a specific sequence of 10 heads and 20 tails. There are many possible sequences, the exact number being given by the coefficient.

There are two convenient ways to arrive at coefficients for individual terms in the expansion. For small values of n, and particularly if the entire expansion is desired, the Pascal triangle is helpful. Consider the expansion for small values of n.

\[
\begin{array}{ll}
n = 0 & 1 \\
n = 1 & 1p + 1q \\
n = 2 & 1p^2 + 2pq + 1q^2 \\
n = 3 & 1p^3 + 3p^2q + 3pq^2 + 1q^3 \\
n = 4 & 1p^4 + 4p^3q + 6p^2q^2 + 4pq^3 + 1q^4
\end{array}
\]

Regularities in the exponents of p and q are obvious and familiar. The coefficients also fit into a regular pattern somewhat less obviously. Each is the sum of the two nearest coefficients in the line above. With this information, one can construct a Pascal triangle that yields any desired coefficient:

n = 0                      1
    1                    1   1
    2                  1   2   1
    3                1   3   3   1
    4              1   4   6   4   1
    5            1   5  10  10   5   1
    6          1   6  15  20  15   6   1
    etc.

The Pascal triangle is very useful for small values of n. But n may be very large, and there are (n + 1) terms in the expansion. Furthermore, interest may be limited to only a few terms. The general formula for any term in the expansion is

\[ \frac{n!}{x!\,(n - x)!}\, p^x q^{n-x}, \]

also written

\[ \binom{n}{x} p^x q^{n-x}, \]

where x is the exponent appropriate for p. In a complete expansion, x varies from n to 0.

Suppose we wish to calculate the probability of tossing four heads and two tails. In this case, x = 4 and n = 6. The appropriate term in the expansion of $(p + q)^6$ is

\[ P(4H, 2T) = \frac{6!}{4!\,2!}\, p^4 q^2 = \frac{1 \cdot 2 \cdot 3 \cdot 4 \cdot 5 \cdot 6}{1 \cdot 2 \cdot 3 \cdot 4 \cdot 1 \cdot 2}\, p^4 q^2 = 15 p^4 q^2, \]

which is the same answer given in the Pascal triangle for n = 6. Substituting for p and q gives

\[ P(4H, 2T) = 15 \left(\tfrac{1}{2}\right)^4 \left(\tfrac{1}{2}\right)^2 = \tfrac{15}{64}. \]

Similarly, the probability of throwing any combination of 50 heads and 50 tails is

\[ P(50H, 50T) = \frac{100!}{50!\,50!}\, p^{50} q^{50} = 0.08. \]

This is a small number, which can be calculated readily even though the calculations involve very large numbers.

The Multinomial Distribution

Using the same reasoning as for the binomial distribution, one can set up a multinomial distribution with any number of terms, p, q, r, s, ..., where the sum of the terms is equal to 1. The general formula for the distribution is

\[ (p + q + r + \cdots)^n = \sum \frac{n!}{x!\,y!\,z! \cdots}\, p^x q^y r^z \cdots, \]

where x, y, z, ... are the exponents associated with p, q, r, ... and add up to n.

For example, for a three-allele system, the distribution of genotypes in the population would be

\[ (p + q + r)^2 = p^2 + q^2 + r^2 + 2pq + 2pr + 2qr. \]

The distribution of genotypes for a three-allele locus on chromosome 21 in a population with Down syndrome (assuming independent origin of the three chromosomes) would be

\[ (p + q + r)^3 = p^3 + q^3 + r^3 + 3p^2q + 3pq^2 + 3p^2r + 3pr^2 + 3q^2r + 3qr^2 + 6pqr. \]

The Normal Distribution

For small values of n, one can graph the probability for obtaining any combination of choices without great effort. For example, consider ten tosses of a coin (or ten children from a backcross mating). The likelihood of heads or tails (or dominant versus recessive
    • -] Wlldi.{ A SO"" Clfntn1/< ol Pro;,,,;,t/,II "lid SI"IIlI" , . 65. .,. '" 210~ a . 2521­ R r .205 j .: ·30 ~'~l u c '& ~ ~ :; ., 0. 20 0 u :s ".. g > ~ -··50%· . u c 120 .117 :; g 890~ :> ~" 0 10 '¥ .~ '" ::a 95%' 1: '" .0 0 . _ . j ..•. l'i . ·~- .. 99%-1· - ... i. ;:--..--­ 45, L-..........., -1.044 -3 -I I 0 I -I "2 -3 -2.58 -0.68 +0.68 + 1.96 -2.58 .978 I~~ ~~.OOI Figure A·l Nnrmal ilislribuliun plone.-d against standarddcviahon {., 1 The l()t~1 ~re. under 0 r=;=1 2 I 3 I 4 I .998 5 I 6 I 8 I 9 I 10 the curve approaches I a~ the curve i~ cxtended on oOlh sides. althuui!h [ho: hCI,;hl "llhe cune approache!> leru. The area under a particular portiun of Ihe.- curve" Ihe l'mhahtlll~ Ihal a deviation frum the mean will fall within that p"rtiun. fur eumpk, Ihe ar~a hounded t>~ number of tails -0.611 sand +0.68.1 = 0.50. Therefore. 50ll: of Ihe lime. the mean alu.: IIf;l 'ample I, III FiKure A-I I'mhabilily dislribuliun for a loss Ilf len cuins. There arc (~IIO = 1024 p""ihlc fall within :!:0.611 s "f Ihe lrue populatiun mean. The hnundar"" r,'r IJ~'. ar,d IJIJ'!, pf thc .red ... quem:e!>, givint! II ,umbinallllns in Ihe relalivc frcljuencies shown. The len Ilrdinalc i, Ihe' - arc als" iniJi..aleil . J. . numhcr uf !Cljucnce!> ~llm:sp..nding III a tliven cumbinaliun Ill' heads and laib. The fli!ht ordi· nale i!> Ihe same divided by 11124 III ClInverttu fraeli"n!>. The lii!ures in!>ide the di,trihul,,,n arc ft - .um. of areas a. indicated. These values represcnl Ihe h'lal probahiltly ur any c"mhinalilln 3 '" _,(x, - 'i)& wllhm Ihe arca. n - ,­ where s is the standard deViation . .t, are Ihe observalions. j i~ the m.:an lIf the Ilb~q.I' rlJ.l:!1l!1tpe~1 is equaltn ~. The likelihuod uf all possible outcome, i~ shown in Figure A-I. t!lli:ls. and n is the number or observations. If the ohservations arc prnpmtilns. SUch a~ The di~tributiun is centered amund 5, the most likely outcome. but thh nul;ome is ex­ heads versus tails. rather than measured quantities. 
the ~tandard deviatl'lO i, (Jk.Jlate,1 1-» pc.ted tlRly tor the time. However. 119% or the lime the outcome will be no more dcviant than 3:7 or 7:3. I = v;;;;t;;. As one extends the number or trials. the (;urve bewmes smolltl1!r untJ.Ut .an be ap­ ....here p and q arc the probabilities of the two alternate events. proximated mathematically by a continuous curve that assumes an intinile number of Distributions may differ either in their means or in their variation at-out the mean trial~. Such a curve is shown in Figure A-2. 11 is the so-called normal distribution or (or bothl. Figure A-3 illustrates both of these possibilities ..... here (he .Ih~ds-a 1 pl""ed ~aus.'iiian distribution that describes the chance deviation about some mean value thai in terms or some I'nit.; of measurement. The normal curve. as shown in Fij!u£(: A·.:!. i, .haral:teril.es many biological and other systems and is the limit of the binomial distri· plolled against of. This ha, the effect of removing the scale dimenSIOn,. In operation that i, hution as n appruachcs inlinity. The distribution can he used in very mUl:h the same way userul for many statistical operations. 'I~ th.: disl:llOlinullus distribution or Figure A·I. If one selects any two points along the horiwntal axis. the arca under the curve detincd by these boundaries relative tn the entire The likelihood or an observed deviation rrom an expected mean. In thc .:; ... s,~ ,,' pro. portions. is given by z. the normal devialr: area under the curve is the probability that a parlicular evenl will rail within the btlunda­ rie.~. This a~sumes that the normal distribution is an appropriate model and has the mean : p".- P ., and dcviation frum the mean that arc characteristic of Ihe biological ~iluation. Varial'lility of il ~et of observations aboultbc meal] is urdinarily expres~ed as the ~­ <I, dard deviation. This is cakulah:d by the formula where Po is an observed proportion of even!~. ,uch as hr"d~. :1m! pthe hypc1ttett(Q. .
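The binomial figures quoted in this appendix can be checked numerically. The following is a minimal sketch in Python using only the standard library; it is not part of the original text, and the names `binomial_term`, `pascal_row`, and `sd_of_proportion` are assumptions introduced here for illustration.

```python
from math import comb, sqrt


def binomial_term(n, x, p):
    """Probability of exactly x events of probability p in n trials:
    n! / (x! (n - x)!) * p**x * q**(n - x), with q = 1 - p."""
    q = 1.0 - p
    return comb(n, x) * p**x * q**(n - x)


def pascal_row(n):
    """Coefficients of (p + q)**n, i.e. row n of the Pascal triangle."""
    return [comb(n, x) for x in range(n + 1)]


def sd_of_proportion(p, n):
    """Standard deviation of a proportion, sqrt(p * q / n)."""
    return sqrt(p * (1.0 - p) / n)


# Four heads and two tails in six tosses: coefficient 15, probability 15/64.
p_4h_2t = binomial_term(6, 4, 0.5)

# Fifty heads and fifty tails in one hundred tosses.
p_50h_50t = binomial_term(100, 50, 0.5)

# Ten coins: number of sequences for each count of tails (0 through 10),
# and the share of outcomes no more deviant than 3:7 or 7:3.
ten_coin_counts = pascal_row(10)
within_3_to_7 = sum(ten_coin_counts[3:8]) / 2**10

print(p_4h_2t, p_50h_50t, ten_coin_counts, within_3_to_7)
```

Using `math.comb` keeps the coefficient as an exact integer, so the very large factorials in the 100-toss case do not have to be evaluated by hand.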
    • Figure A-3. (a) Two normal distributions with the same areas and the same variations about the mean but with slightly different means ($M_1$ and $M_2$). (b) Two normal distributions with the same means and same areas but with different variation about the mean.

Evaluation of $z$ can be illustrated by the following. A coin is tossed 100 times. Heads turns up 40 times and tails 60. Can we conclude that the chances of heads versus tails are not equal? Let us set up the hypothesis that the chances are equal; that is, $p = 0.5$. The observed $p_o = 0.4$. The observed deviation, expressed as units of standard deviation, is then

$$z = \frac{0.4 - 0.5}{\sqrt{(0.5)(0.5)/100}} = -2.00.$$

Referring to Figure A-2, we see that a deviation of 2.00 along the abscissa includes just over 95% of the area of the curve. Therefore, a 40:60 deviation will occur less than 5% of the time by chance when the true value of $p = 0.5$. We may therefore reject the hypothesis that $p = 0.5$, with only a 5% risk of making an error if $p = 0.5$ in fact. Had the deviation been large enough to yield $z > 2.58$, the risk would have been only 1%.

TESTS OF SIGNIFICANCE

In genetics, as in many other sciences, the question frequently arises whether a set of observations conforms to a certain hypothesis. This often takes the form of asking whether the classification of persons as affected or nonaffected yields a ratio compatible with simple Mendelian inheritance. This problem is the same as asking whether the distribution of heads and tails in a number of coin tosses is compatible with the hypothesis of equal chances for each to turn up.

In order to answer the question, one must formulate a null hypothesis; that is, there is no difference between the theoretical value of $p$ and the value in the population studied. In terms of a coin, we would say the ratio of heads and tails is not different from the expected if heads and tails are equally likely. If we can prove the hypothesis unlikely, then we may wish to conclude that heads and tails are not equally likely and that the coin lands on one side more often than on the other.

For example, if a coin were tossed ten times, a 10:0 distribution would occur only 0.002 of the time when heads and tails are in fact equally likely. If this distribution were actually tossed, one could conclude that the coin results were biased and expect to be wrong only 0.2% of the time if in fact the coin is not biased. A 9:1 distribution would also lead one to discard the null hypothesis, but in this case, an error would be made 2.2% of the time. An 8:2 distribution would lead to error 11% of the time if the hypothesis were rejected. For most purposes this error is too large, and the hypothesis would be allowed to stand pending more data.

At what point should the null hypothesis be rejected? The answer depends on the likelihood of alternate explanations and the consequences of a wrong decision. For most investigations, a result that will happen less than 5% of the time by chance is considered adequate to reject a hypothesis. If the result could happen only 1% of the time by chance, the hypothesis can be rejected with greater confidence. On occasion, erroneous rejection of a hypothesis may have serious consequences. A more conservative level, such as 0.1%, might be used.

There are many tests for determining whether or not a particular set of observations deviates from an expected distribution (and therefore from a particular hypothesis or model), the various tests being appropriate for particular kinds of observations and models. In the above discussion, exact probabilities were calculated for binomial distributions for small values of $n$. In the case of normally distributed variables, the normal deviate $z$ is sometimes appropriate, and indeed most statistical tests ultimately are based on the normal deviate. Of the various tests, the following are most often used in problems encountered in elementary genetics.

Chi-squared Analysis

The $\chi^2$ (chi-squared) analysis is related in theory to the normal deviate analysis outlined above. However, it has some special applications that justify separate consideration.
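The normal deviate for the 40:60 coin example and the exact probabilities for the ten-toss examples under Tests of Significance can be reproduced with a few lines of code. The sketch below is illustrative Python, not taken from the original text; the function names are assumptions, and the two-tailed normal probability is computed from the standard library's complementary error function rather than read from Figure A-2.

```python
from math import comb, erfc, sqrt


def normal_deviate(p_obs, p_hyp, n):
    """z = (p_o - p) / sqrt(p * q / n) for an observed proportion."""
    s = sqrt(p_hyp * (1.0 - p_hyp) / n)
    return (p_obs - p_hyp) / s


def two_tailed_normal(z):
    """Chance that a normal deviate is at least as far from 0 as z."""
    return erfc(abs(z) / sqrt(2.0))


def two_tailed_exact(n, k):
    """Chance, when p = 1/2, of a split at least as lopsided as k : n - k,
    counting both directions (for example 9:1 and 1:9)."""
    k = max(k, n - k)
    tail = sum(comb(n, x) for x in range(k, n + 1))
    return min(1.0, 2 * tail / 2**n)


# 40 heads in 100 tosses, hypothesis p = 0.5.
z = normal_deviate(0.4, 0.5, 100)
p_from_z = two_tailed_normal(z)

# Ten tosses: 10:0, 9:1, and 8:2 splits (2/1024, 22/1024, 112/1024).
for heads in (10, 9, 8):
    print(heads, 10 - heads, two_tailed_exact(10, heads))

print(z, p_from_z)
```

The exact values come out at roughly 0.2%, 2%, and 11%, in line with the risks of error quoted for the 10:0, 9:1, and 8:2 distributions.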
    • Goodness of fit. A measure of deviation from theory of a set of observations can be measured by $\chi^2$, given by the formula

$$\chi^2 = \sum \frac{(E - O)^2}{E},$$

where $E$ = expected value, $O$ = observed value, and $\sum$ indicates summation for all categories. In the coin-toss problem of the previous section, the expected values (on the hypothesis of $p = 0.5$) would be 50 heads and 50 tails. The observed values were 40 and 60. Therefore,

$$\chi^2 = \frac{(50 - 40)^2}{50} + \frac{(50 - 60)^2}{50} = 4.00.$$

This value of 4.00 is the square of 2.00 obtained by the earlier method, since, for the special case of two alternatives (heads or tails), $\chi^2 = z^2$.

In order to decide whether the value of $\chi^2$ is too large to represent chance occurrence, it is necessary to consult a table of $\chi^2$ values, such as is given in Table A-1. To use the table, one must know the degrees of freedom. The degrees of freedom (df) is the number of categories that can be varied independently without changing the total number of observations. In the above problem, there is one degree of freedom, since only one category can be varied: either heads or tails. If the proportion of heads is arbitrarily specified to be 0.45, then the proportion of tails must be 0.55. If one were studying the distribution of die scores, there would be five degrees of freedom, since a die has six sides.

For one degree of freedom, a $\chi^2$ of 3.84 is larger than 95% of the values obtained by chance. The $\chi^2$ of 4.00 exceeds this value, indicating that a 40:60 distribution would occur less than 5% of the time by chance, the same conclusion reached by the earlier method.

Table A-1. Distribution of $\chi^2$. The values of P are the probability that $\chi^2$ will exceed by chance the value given in the table.*

Degrees of                              P
Freedom      0.99      0.95     0.50    0.10    0.05    0.02    0.01
    1        0.00016   0.0039   0.45    2.71    3.84    5.41    6.64
    2        0.0201    0.103    1.39    4.61    5.99    7.82    9.21
    3        0.115     0.35     2.37    6.25    7.82    9.84   11.35
    4        0.297     0.71     3.36    7.78    9.49   11.67   13.28
    5        0.554     1.15     4.35    9.24   11.07   13.39   15.09
   10        2.56      3.94     9.34   15.99   18.31   21.16   23.21

* Abridged from Table III in R. A. Fisher, 1958, Statistical Methods for Research Workers, 13th ed., Oliver and Boyd, Edinburgh, by permission of the author and publishers.

Chi-squared analysis can readily be applied to data consisting of more than one category. In the cross AaBb x aabb, four types of offspring are expected: AaBb, aaBb, Aabb, and aabb. If the two loci are segregating independently, the four types of offspring should be in equal proportions. Actual observations gave 45 AaBb, 40 aaBb, 60 Aabb, and 55 aabb. Are these values compatible with a 1:1:1:1 ratio? The expected value for each category is 50; hence,

$$\chi^2 = \frac{(50 - 45)^2}{50} + \frac{(50 - 40)^2}{50} + \frac{(50 - 60)^2}{50} + \frac{(50 - 55)^2}{50} = 5.00, \quad \text{df} = 3.$$

For 3 df, $\chi^2$ must exceed 7.82 in order to invalidate the hypothesis.

Contingency. Another important use of $\chi^2$ is in testing for contingency or association. The example just cited is a test of association between the A locus and the B locus. Since a specific cross was tested and simple Mendelian inheritance was involved, it was possible to arrive at the expected values on theoretical grounds and independently of the data to be tested. In other cases, it is desirable to test for association between two traits in the absence of external hypotheses from which to derive expected values.

An example would be the association between hair color and eye color. Do blonds tend to have blue eyes more often than brown eyes? Or perhaps they tend to have brown eyes more often than blue. Observation of 100 persons gave 35 with blue eyes and blond hair, 10 with brown eyes and blond hair, 15 with blue eyes and brown hair, and 40 with brown eyes and brown hair. A contingency table is set up as follows:

Observed      Blue eyes   Brown eyes   Totals
Blond hair       35          10          45
Brown hair       15          40          55
Totals           50          50         100

In order to calculate $\chi^2$, it is necessary to compute a table of expected values. From the marginal totals above, 50/100 of the persons have blue eyes and 45/100 are blonds. We wish to test the null hypothesis of no association. Absence of association would lead us to expect that 50/100 of the 45 blonds would have blue eyes. Similarly, 50/100 of the 55 persons with brown hair should have blue eyes, and so on. A table of expected values can be constructed by these procedures as follows:

Expected      Blue eyes               Brown eyes   Totals
Blond hair    (50/100)(45) = 22.5       22.5         45
Brown hair    (50/100)(55) = 27.5       27.5         55
Totals           50                     50          100
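The goodness-of-fit sums and the table of expected values derived from marginal totals lend themselves to a short numerical check. What follows is a minimal Python sketch, not part of the original text; `chi_squared` and `expected_table` are names assumed here for illustration, and the data are the coin-toss, test-cross, and hair and eye color counts given in the text.

```python
def chi_squared(observed, expected):
    """Sum of (E - O)**2 / E over all categories."""
    return sum((e - o) ** 2 / e for o, e in zip(observed, expected))


def expected_table(observed):
    """Expected counts under no association: for each cell,
    row total * column total / grand total."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    grand_total = sum(row_totals)
    return [[r * c / grand_total for c in col_totals] for r in row_totals]


# Coin toss: 40 heads and 60 tails against an expectation of 50 and 50.
chi_coin = chi_squared([40, 60], [50, 50])

# Test cross AaBb x aabb: 45, 40, 60, 55 against 50 in each class (3 df).
chi_cross = chi_squared([45, 40, 60, 55], [50, 50, 50, 50])

# Rows: blond hair, brown hair. Columns: blue eyes, brown eyes.
hair_eye_observed = [[35, 10], [15, 40]]
hair_eye_expected = expected_table(hair_eye_observed)

print(chi_coin, chi_cross, hair_eye_expected)
```

The resulting values are then compared with the entry in Table A-1 for the appropriate degrees of freedom, for example 3.84 for 1 df or 7.82 for 3 df at P = 0.05.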
    • Note that the marginal totals above must agree with those in the table of observed data. It is necessary to calculate only one of the entries by multiplication. The remaining three can be obtained by subtraction from the marginal totals. For this reason, there is only 1 df in a 2 x 2 contingency table.

Calculation of the $\chi^2$ is as before:

$$\chi^2 = \frac{(22.5 - 35)^2}{22.5} + \frac{(22.5 - 10)^2}{22.5} + \frac{(27.5 - 15)^2}{27.5} + \frac{(27.5 - 40)^2}{27.5} = 25.25.$$

This value clearly exceeds that required for the 1% level of significance for 1 df. We must conclude that the null hypothesis is incorrect and that there is indeed association between hair color and eye color, there being an excess of blue-eyed blonds and brown-eyed brunettes, and a deficiency of blue-eyed brunettes and brown-eyed blonds.

Contingency tables can be constructed larger than the 2 x 2 that is illustrated. The degrees of freedom in an $m \times n$ table is $(m - 1)(n - 1)$.

Yates correction. The $\chi^2$ table is derived from a continuous curve, although the distribution of $\chi^2$ itself is discontinuous. This approximation is of little significance unless the numbers of observations are small, in which case $\chi^2$ is likely to be erroneously large and a hypothesis may be incorrectly rejected. F. Yates has suggested a means of correction for comparisons in which one or more of the expected values is small. The correction consists in reducing the $(E - O)$ values by 1/2. Thus in a goodness of fit test, the observed values might be 4, 7, 6, 3, with expected values of 5, 5, 5, 5. The $\chi^2$ would be calculated

$$\chi^2 = \frac{(4.5 - 5)^2}{5} + \frac{(6.5 - 5)^2}{5} + \frac{(5.5 - 5)^2}{5} + \frac{(3.5 - 5)^2}{5} = 1.00, \quad 3 \text{ df}.$$

Without the correction, $\chi^2 = 2.00$.

There is no fast rule to decide when to use the Yates correction. If the expected values in each cell are under 10, then the correction may make a difference.

CORRELATION

It is often useful to have a means of expressing quantitatively the resemblance among relatives. In other instances, it may be desirable to express the relationship between two measures made on a series of individuals. This can be done by the correlation coefficient, $r$. We will be concerned only with the simplest case of linear correlation, that is, when the relationship between two variables $X$ and $Y$ is such that for any change in $X$ there is a proportional change in $Y$. If $Y$ is plotted against $X$, a straight-line relationship is observed.

For the situation in which there are two series of measurements to be compared (for example, IQs of wives and husbands or height and weight of a series of individuals), $r$ can be calculated by the formula

$$r = \frac{\sum (X - \bar{X})(Y - \bar{Y})}{\sqrt{\sum (X - \bar{X})^2 \, \sum (Y - \bar{Y})^2}},$$

where $X$ is an observed value of one trait, $\bar{X}$ is the mean value, $Y$ is a corresponding observation on the second trait, and $\bar{Y}$ is the mean value. (The derivation of $r$ and its relationship to other statistical parameters are given in textbooks of statistics.)

The coefficient $r$ can assume any value from $-1.0$ to $+1.0$. The correlation coefficient may best be thought of as a measure of the variation shared in common by two variables. If none of the variation is shared, then $r$ is zero and a plot of one variable against the other shows a random scatter. If the correlation is significantly positive or negative, the two variables are somewhat related, and knowledge of one is useful in predicting the other. There is still unshared variation unless $r = +1.0$ or $-1.0$.

The reliability of a correlation coefficient is dependent on the number of observations on which it is based. Tests of significance are available for deciding whether a correlation coefficient is significantly different from zero or from another calculated correlation coefficient.
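The contingency calculation for hair and eye color and the Yates-corrected goodness-of-fit example can both be verified with one small function. The Python below is a sketch under stated assumptions rather than anything from the original text: the `yates` flag and the function names are introduced here, and clamping the reduced deviation at zero is an added safeguard that the text does not discuss.

```python
def chi_squared(observed, expected, yates=False):
    """Sum of (E - O)**2 / E; with yates=True each |E - O| is first
    reduced by 1/2 (and, as an assumption, never taken below zero)."""
    total = 0.0
    for o, e in zip(observed, expected):
        deviation = abs(e - o)
        if yates:
            deviation = max(deviation - 0.5, 0.0)
        total += deviation ** 2 / e
    return total


def contingency_df(rows, cols):
    """Degrees of freedom in an m x n contingency table: (m - 1)(n - 1)."""
    return (rows - 1) * (cols - 1)


# Hair and eye color, cells listed in the same order in both lists:
# blond/blue, blond/brown, brown-haired/blue, brown-haired/brown.
chi_hair_eye = chi_squared([35, 10, 15, 40], [22.5, 22.5, 27.5, 27.5])

# Small-sample goodness of fit: observed 4, 7, 6, 3 against 5, 5, 5, 5.
chi_uncorrected = chi_squared([4, 7, 6, 3], [5, 5, 5, 5])
chi_corrected = chi_squared([4, 7, 6, 3], [5, 5, 5, 5], yates=True)

print(chi_hair_eye, contingency_df(2, 2), chi_uncorrected, chi_corrected)
```

Working with the absolute deviation makes the correction independent of whether the observed count lies above or below the expected one, which matches the worked example (4 becomes 4.5, 7 becomes 6.5, and so on).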
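The correlation coefficient formula translates directly into code. The sketch below is illustrative Python, not from the original text; the function name `correlation` and the height and weight figures are hypothetical values made up for the example, not data from the book.

```python
from math import sqrt


def correlation(xs, ys):
    """Linear correlation coefficient r:
    sum((X - Xbar)(Y - Ybar)) / sqrt(sum((X - Xbar)**2) * sum((Y - Ybar)**2))."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    shared = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sum((x - mean_x) ** 2 for x in xs)
    spread_y = sum((y - mean_y) ** 2 for y in ys)
    return shared / sqrt(spread_x * spread_y)


# Hypothetical heights (cm) and weights (kg) for five individuals.
heights = [150, 160, 170, 180, 190]
weights = [52, 58, 69, 74, 85]
r_height_weight = correlation(heights, weights)

# Perfectly proportional series give the limiting values +1.0 and -1.0.
r_positive = correlation([1, 2, 3, 4], [2, 4, 6, 8])
r_negative = correlation([1, 2, 3, 4], [8, 6, 4, 2])

print(r_height_weight, r_positive, r_negative)
```

The function assumes both series have the same length and that neither is constant; a constant series would make the denominator zero.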
    • [Handwritten notes in the original scan: a worked calculation of the probability that a pair of twins is monozygotic rather than dizygotic, ending with values of about 0.93 and 0.07. The intermediate steps are not legible in this transcript.]