SlideShare a Scribd company logo
1 of 5
Download to read offline
6.867 Homework 1
Fall 2005
Due: Wednesday, September 28, in class. 10% off per day late.
Don’t spend time doing hairy integrals by hand. If that’s the answer to a question, you can
just leave it un-integrated.
1 A Tale of Two Estimators
In this problem, we’re going to explore the bias-variance trade-off in a very simple setting. We
have a set of unidimensional data, X1, . . . , Xn, drawn from the positive reals. We will consider two
different models for its distribution:
• Model 1: The data are drawn from a uniform distribution on the interval [0, b]. This model
has a single positive real parameter, 0 < b.
• Model 2: The data are drawn from a uniform distribution on the interval [a, b]. This model
has two positive real parameters, a and b, such that 0 < a < b.
We are interested in comparing estimates of the mean of the distribution, derived from each of
these two models.
Question 0: What’s the mean of each of the distributions?
1.1 The Right Stuff
Let’s start by considering the situation in which the data were, in fact, drawn from an instance of
the model under consideration: either a uniform distribution on [0, b∗] (for model 1) or a uniform
distribution on [a∗, b∗] (for model 2). If you are, like Commander Toad1, “brave and bold, bold and
brave,” then avert your eyes from the rest of this section and compute the bias, variance, and mean
squared error of ML estimator of the mean of each model, on data drawn from this distribution, as
a function of n. Otherwise, let’s take it step by step.
As we saw in recitation, the ML estimator for b, ˆb = maxi Xi. By a similar argument, the ML
estimator for a, ˆa = mini Xi. To understand the properties of these estimators we have to start by
deriving their PDFs. The minimum and maximum of a data set are also known as their first and
nth order statistics, and sometimes written X(1) and X(n).
1
Commander Toad In Space, Jane Yolen, Putnam, 1996
1
1.1.1 Using Model 1
In model 1, we just need to consider the distribution of ˆb. Generally speaking, the pdf of the
maximum of a set of data drawn from pdf f, with cdf F, is
fˆb(x) = nF(x)n−1
f(x) .
The idea is that, if x is the maximum, then n − 1 of the other data values will have to be less than
x, and the probability of that is F(x)n−1, and then one value will have to equal x, the probability
of which is f(x). We multiply by n because there are n different ways to choose the data value
that could be the maximum.
Question 1: What is fˆb in the particular case where the data are drawn uniformly from 0 to
b?
Question 2: Write an expression for the expected value of ˆµ, as an integral.
In fact, there’s a nice closed form expression, which you can use in the following questions:
E[ˆµ] =
b∗
2
n
(n + 1)
.
Question 3: What is the squared bias of ˆµ? Is this estimator unbiased? Is it asymptotically
unbiased? (Reminder: bias2
(ˆθ) = (ED[ˆθ] − θ)2.)
Question 4: Write an expression for the variance of ˆµ, as an integral.
The closed form for the variance is
V [ˆµ] =
b∗2
4
n
(n + 1)2(n + 2)
.
Question 5: What is the mean squared error of ˆµ? (Reminder: MSE(ˆθ) = bias2
(ˆθ) + var(ˆθ).)
1.1.2 Another estimator for Model 1
We might consider something other than the MLE for Model 1. Consider the estimator
ˆµ =
x(n)(n + 1)
2n
.
Question 6: Write an expression for the expected value of this version of ˆµ as an integral.
Then solve the integral.
Question 7: What is the squared bias of this estimator for ˆµ? Is this estimator unbiased? Is
it asymptotically unbiased?
Question 8: Write an expression for the variance of ˆµ as an integral.
The closed form for the variance is
V [ˆµ] =
b∗2
4n(n + 2)
.
Question 9: What is the mean squared error of this version of ˆµ?
Question 10: What are the relative advantages of the estimator from the previous section and
this one?
2
1.1.3 Using Model 2
Now, let’s do the same thing, but for the MLE for model 2. We have to start by thinking about
the joint distribution of MLE’s ˆa and ˆb. Generally speaking, the joint pdf of the minimum and the
maximum of a set of data drawn from pdf f, with cdf F, is
fˆa,ˆb(x, y) = n(n − 1)(F(y) − F(x))n−2
f(x)f(y) .
Question 11: Explain in words why this makes sense.
Question 12: What is fˆa,ˆb in the particular case where the data are drawn uniformly from a
to b?
Question 13: Write an expression for the expected value of ˆµ in terms of an integral.
Here’s what it should integrate to:
E[ˆµ] =
a∗ + b∗
2
.
Question 14: What is the squared bias of ˆµ? Is this estimator unbiased? Is it asymptotically
unbiased?
Question 15: Write an expression for the variance of ˆµ in terms of an integral.
The closed form for the variance is
V [ˆµ] =
(b∗ − a∗)2
2(n + 1)(n + 2)
.
Question 16: What is the mean squared error of ˆµ?
1.1.4 Comparing Models
What if we have data that is actually drawn from the interval [0, 1]? Both models seem like
reasonable choices.
Question 17: Show plots that compare the bias, variance, and mse of each of the estimators
we’ve considered on that data, as a function of n. (Use the formulas above; don’t do it by actually
generating data). Write a paragraph in English explaining your results. What estimator would you
use?
Now, what if we have data that is actually drawn from the interval [.1, 1]? It seems like model
2 is the only reasonable choice. But is it?
We already know the bias, variance, and mse for model 2 in this case. But what about the
MLE and unbiased estimators for model 1? Let’s characterize the general behavior when we use
the estimator ˆµ = x(n)(n + 1)/(2n) on data drawn from an interval [a∗, b∗].
Question 18: Write an expression for the expected value of ˆµ in terms of an integral.
The closed form expression is
E[ˆµ] =
a∗ + b∗n
2n
.
Question 19: Explain in English why this answer makes sense.
Question 20: What is the squared bias of this ˆµ? Explain in English why your answer makes
sense. Consider how it behaves as a increases, and how it behaves as n increases.
Question 21: Write an expression for the variance of this ˆµ in terms of an integral.
3
The closed form for the variance is
V [ˆµ] =
(b∗ − a∗)2
4n(n + 2)
.
To save you some tedious algebra, we’ll tell you that the mean squared error of this ˆµ is
(apologies for the ugliness; let us know if you find a beautiful rewrite)
b∗2
n − 2a∗b∗n + a∗2
(2 − 2n + n3)
4n2(n + 2)
.
Question 22: Show plots that compare the bias, variance, and mse of this estimator with the
regular model 2 estimator on data drawn from [0.1, 1], as a function of n. Are there circumstances
in which it would be better to use this estimator? If so, what are they and why? If not, why not?
Question 23: Show plots of mse of both estimators, as a function of n on data drawn from
[.01, 1] and on data drawn from [.2, 1]. How do things change? Explain why this makes sense.
2 Bayesian Estimation
2.1 Dirichlet Prior
Exercise borrowed from Stat180 at UCLA.
The Dirichlet distribution is a multivariate version of the Beta distribution. When we have a
coin with two outcomes, we really only need a single parameter θ to model the probability of heads.
But now let’s consider a “thick” coin that has three possible outcomes: heads, tails, and edge. Now
we need two parameters: θh is the probability of heads, θt is the probability of tails, and then the
probability of an edge is 1 − θh − θt.
The random variables (V, W) ∈ [0, 1] and such that V + W ≤ 1 have a Dirichlet distribution
with parameters α1, α2, α3 if their joint density is
f(v, w) = vα1−1
wα2−1
(1 − v − w)α3−1 Γ(α1 + α2 + α3)
Γ(α1)Γ(α2)Γ(α3)
.
This is a direct generalization of the Beta distribution.
Question 24: If (θh, θt) have a Dirichlet distribution as above, what is the marginal distribution
of θh?
Question 25: Suppose you are playing with a thick coin, and get results X1 . . . Xn, resulting
in H heads and T tails out of n throws. Given θh and θt the random variables H and T have a
multinomial distribution:
Pr(h, t) =
n!
h!t!(n − h − t)!
θh
hθt
t(1 − θh − θt)n−h−t
.
Assume a uniform prior on the simplex of possible values of θh and θt. What is the posterior
distribution for θh and θt?
Question 26: In this same setting, what is the predictive distribution for getting another head?
That is, what’s Pr(Xn+1 = heads|X1 . . . Xn)?
Question 27: Now assume a Dirichlet prior for θh and θt with parameters α1, α2, α3. What is
the posterior in this case?
Question 28: What is the predictive distribution?
Question 29: If you assume a squared-error loss, what is your estimate of θh and θt?
Question 30: What happens as n → ∞?
4
3 Gaussian distributions
Question 31: (DeGroot) Consider a normal distribution for which the value of the mean µ is
unknown and the variance is 4, and suppose that the prior distribution of µ is a normal distribution
whose variance is 9. How large a random sample must be taken from the given normal distribution
in order to be able to specify an interval having a length of 1 unit such that the probability that µ
lies in this interval is at least 0.95?
4 Presidential debate
(Gelman) On September 25, 1988, the evening of a presidential campaign debate, ABC News
conducted a survey of registered voters in the United States; 639 persons were polled before the
debate, and 639 different persons were polled after. The results are displayed below:
Survey Bush Dukakis No opinion/other Total
pre-debate 294 307 38 639
post-debate 288 332 19 639
Assume the surveys are independent simple random samples from the population of registered
voters. We will ignore the “no opinion” cases and model the data with two different binomial
distributions. For j = 1, 2, let αj be the proportion of voters who preferred Bush, out of those who
had a preference for either Bush or Dukakis at the time of survey j.
Question 32: What is the posterior probability that there was a shift toward Bush?
5 Decision theory
Question 33: (DeGroot) Suppose that a person is going to sell Fizzy Cola at a football game and
must decide in advance how much to order. Suppose that he makes a gain of m cents on each quart
that he sells at the game but suffers a loss of c cents on each quart that he has ordered but does not
sell. Assume that the demand for Fizzy Cola at the game, as measured in quarts, is a continuous
random variable X with PDF f and CDF F, show that his expected profit will be maximized if he
orders an amount α such that F(α) = m/(m + c).
(Duda, Hart, Stork) Suppose we use a randomized decision rule, which chooses action ai with
probability Pr(ai|x).
Question 34: Show that the resulting risk is given by
[
m
i=1
R(ai|x) Pr(ai|x)] Pr(x)dx .
Question 35: What choice of Pr(ai|x) minimizes R in general?
In many problems, one has the option either to assign the pattern to one of c classes, or to
reject it as being unrecognizable. If the cost for rejects is not too high, rejection may be a desirable
action. Let the cost for a correct answer be 0, for an incorrect answer be λs, and for a reject be λr.
Question 36: Under what conditions on Pr(ci|x), λs and λr should we reject?
5

More Related Content

What's hot

An Introduction to Mis-Specification (M-S) Testing
An Introduction to Mis-Specification (M-S) TestingAn Introduction to Mis-Specification (M-S) Testing
An Introduction to Mis-Specification (M-S) Testingjemille6
 
Probabilistic Reasoning
Probabilistic ReasoningProbabilistic Reasoning
Probabilistic ReasoningJunya Tanaka
 
Conference on theoretical and applied computer science
Conference on theoretical and applied computer scienceConference on theoretical and applied computer science
Conference on theoretical and applied computer scienceSandeep Katta
 
Optimization of Fuzzy Matrix Games of Order 4 X 3
Optimization of Fuzzy Matrix Games of Order 4 X 3Optimization of Fuzzy Matrix Games of Order 4 X 3
Optimization of Fuzzy Matrix Games of Order 4 X 3IJERA Editor
 
Chap05 continuous random variables and probability distributions
Chap05 continuous random variables and probability distributionsChap05 continuous random variables and probability distributions
Chap05 continuous random variables and probability distributionsJudianto Nugroho
 
A. Spanos Probability/Statistics Lecture Notes 5: Post-data severity evaluation
A. Spanos Probability/Statistics Lecture Notes 5: Post-data severity evaluationA. Spanos Probability/Statistics Lecture Notes 5: Post-data severity evaluation
A. Spanos Probability/Statistics Lecture Notes 5: Post-data severity evaluationjemille6
 
6334 Day 3 slides: Spanos-lecture-2
6334 Day 3 slides: Spanos-lecture-26334 Day 3 slides: Spanos-lecture-2
6334 Day 3 slides: Spanos-lecture-2jemille6
 
Stability criterion of periodic oscillations in a (9)
Stability criterion of periodic oscillations in a (9)Stability criterion of periodic oscillations in a (9)
Stability criterion of periodic oscillations in a (9)Alexander Decker
 

What's hot (18)

Lecture 2
Lecture 2Lecture 2
Lecture 2
 
adalah
adalahadalah
adalah
 
Ch06 se
Ch06 seCh06 se
Ch06 se
 
Unit 2.7
Unit 2.7Unit 2.7
Unit 2.7
 
An Introduction to Mis-Specification (M-S) Testing
An Introduction to Mis-Specification (M-S) TestingAn Introduction to Mis-Specification (M-S) Testing
An Introduction to Mis-Specification (M-S) Testing
 
1561 maths
1561 maths1561 maths
1561 maths
 
Probabilistic Reasoning
Probabilistic ReasoningProbabilistic Reasoning
Probabilistic Reasoning
 
Zahedi
ZahediZahedi
Zahedi
 
Conference on theoretical and applied computer science
Conference on theoretical and applied computer scienceConference on theoretical and applied computer science
Conference on theoretical and applied computer science
 
Probability distributionv1
Probability distributionv1Probability distributionv1
Probability distributionv1
 
Unit 7.5
Unit 7.5Unit 7.5
Unit 7.5
 
Numerical approximation
Numerical approximationNumerical approximation
Numerical approximation
 
Optimization of Fuzzy Matrix Games of Order 4 X 3
Optimization of Fuzzy Matrix Games of Order 4 X 3Optimization of Fuzzy Matrix Games of Order 4 X 3
Optimization of Fuzzy Matrix Games of Order 4 X 3
 
Chap05 continuous random variables and probability distributions
Chap05 continuous random variables and probability distributionsChap05 continuous random variables and probability distributions
Chap05 continuous random variables and probability distributions
 
A. Spanos Probability/Statistics Lecture Notes 5: Post-data severity evaluation
A. Spanos Probability/Statistics Lecture Notes 5: Post-data severity evaluationA. Spanos Probability/Statistics Lecture Notes 5: Post-data severity evaluation
A. Spanos Probability/Statistics Lecture Notes 5: Post-data severity evaluation
 
6334 Day 3 slides: Spanos-lecture-2
6334 Day 3 slides: Spanos-lecture-26334 Day 3 slides: Spanos-lecture-2
6334 Day 3 slides: Spanos-lecture-2
 
Stability criterion of periodic oscillations in a (9)
Stability criterion of periodic oscillations in a (9)Stability criterion of periodic oscillations in a (9)
Stability criterion of periodic oscillations in a (9)
 
Unit 7.1
Unit 7.1Unit 7.1
Unit 7.1
 

Similar to HW1 MIT Fall 2005

The Use of Fuzzy Optimization Methods for Radiation.ppt
The Use of Fuzzy Optimization Methods for Radiation.pptThe Use of Fuzzy Optimization Methods for Radiation.ppt
The Use of Fuzzy Optimization Methods for Radiation.ppteslameslam18
 
Math 235 - Summer 2015Homework 2Due Monday June 8 in cla.docx
Math 235 - Summer 2015Homework 2Due Monday June 8 in cla.docxMath 235 - Summer 2015Homework 2Due Monday June 8 in cla.docx
Math 235 - Summer 2015Homework 2Due Monday June 8 in cla.docxandreecapon
 
Machine Learning
Machine LearningMachine Learning
Machine LearningAshwin P N
 
Introduction to Evidential Neural Networks
Introduction to Evidential Neural NetworksIntroduction to Evidential Neural Networks
Introduction to Evidential Neural NetworksFederico Cerutti
 
Precalculus 1 chapter 1
Precalculus 1 chapter 1Precalculus 1 chapter 1
Precalculus 1 chapter 1oreves
 
A. spanos slides ch14-2013 (4)
A. spanos slides ch14-2013 (4)A. spanos slides ch14-2013 (4)
A. spanos slides ch14-2013 (4)jemille6
 

Similar to HW1 MIT Fall 2005 (20)

Statistical Method In Economics
Statistical Method In EconomicsStatistical Method In Economics
Statistical Method In Economics
 
Math Algebra
Math AlgebraMath Algebra
Math Algebra
 
The Use of Fuzzy Optimization Methods for Radiation.ppt
The Use of Fuzzy Optimization Methods for Radiation.pptThe Use of Fuzzy Optimization Methods for Radiation.ppt
The Use of Fuzzy Optimization Methods for Radiation.ppt
 
Math 235 - Summer 2015Homework 2Due Monday June 8 in cla.docx
Math 235 - Summer 2015Homework 2Due Monday June 8 in cla.docxMath 235 - Summer 2015Homework 2Due Monday June 8 in cla.docx
Math 235 - Summer 2015Homework 2Due Monday June 8 in cla.docx
 
Probabilistic systems exam help
Probabilistic systems exam helpProbabilistic systems exam help
Probabilistic systems exam help
 
Probabilistic systems assignment help
Probabilistic systems assignment helpProbabilistic systems assignment help
Probabilistic systems assignment help
 
statistics assignment help
statistics assignment helpstatistics assignment help
statistics assignment help
 
Statistics Homework Help
Statistics Homework HelpStatistics Homework Help
Statistics Homework Help
 
Mathematical Statistics Homework Help
Mathematical Statistics Homework HelpMathematical Statistics Homework Help
Mathematical Statistics Homework Help
 
Machine Learning
Machine LearningMachine Learning
Machine Learning
 
Statistics Homework Help
Statistics Homework HelpStatistics Homework Help
Statistics Homework Help
 
2 simple regression
2   simple regression2   simple regression
2 simple regression
 
Introduction to Evidential Neural Networks
Introduction to Evidential Neural NetworksIntroduction to Evidential Neural Networks
Introduction to Evidential Neural Networks
 
Data Analysis Assignment Help
Data Analysis Assignment Help Data Analysis Assignment Help
Data Analysis Assignment Help
 
Signals and Systems Homework Help.pptx
Signals and Systems Homework Help.pptxSignals and Systems Homework Help.pptx
Signals and Systems Homework Help.pptx
 
Excel Homework Help
Excel Homework HelpExcel Homework Help
Excel Homework Help
 
Precalculus 1 chapter 1
Precalculus 1 chapter 1Precalculus 1 chapter 1
Precalculus 1 chapter 1
 
A. spanos slides ch14-2013 (4)
A. spanos slides ch14-2013 (4)A. spanos slides ch14-2013 (4)
A. spanos slides ch14-2013 (4)
 
Probability Homework Help
Probability Homework Help Probability Homework Help
Probability Homework Help
 
Stochastic Process Exam Help
Stochastic Process Exam HelpStochastic Process Exam Help
Stochastic Process Exam Help
 

Recently uploaded

ECONOMIC CONTEXT - LONG FORM TV DRAMA - PPT
ECONOMIC CONTEXT - LONG FORM TV DRAMA - PPTECONOMIC CONTEXT - LONG FORM TV DRAMA - PPT
ECONOMIC CONTEXT - LONG FORM TV DRAMA - PPTiammrhaywood
 
Framing an Appropriate Research Question 6b9b26d93da94caf993c038d9efcdedb.pdf
Framing an Appropriate Research Question 6b9b26d93da94caf993c038d9efcdedb.pdfFraming an Appropriate Research Question 6b9b26d93da94caf993c038d9efcdedb.pdf
Framing an Appropriate Research Question 6b9b26d93da94caf993c038d9efcdedb.pdfUjwalaBharambe
 
Earth Day Presentation wow hello nice great
Earth Day Presentation wow hello nice greatEarth Day Presentation wow hello nice great
Earth Day Presentation wow hello nice greatYousafMalik24
 
Hierarchy of management that covers different levels of management
Hierarchy of management that covers different levels of managementHierarchy of management that covers different levels of management
Hierarchy of management that covers different levels of managementmkooblal
 
Influencing policy (training slides from Fast Track Impact)
Influencing policy (training slides from Fast Track Impact)Influencing policy (training slides from Fast Track Impact)
Influencing policy (training slides from Fast Track Impact)Mark Reed
 
How to do quick user assign in kanban in Odoo 17 ERP
How to do quick user assign in kanban in Odoo 17 ERPHow to do quick user assign in kanban in Odoo 17 ERP
How to do quick user assign in kanban in Odoo 17 ERPCeline George
 
Judging the Relevance and worth of ideas part 2.pptx
Judging the Relevance  and worth of ideas part 2.pptxJudging the Relevance  and worth of ideas part 2.pptx
Judging the Relevance and worth of ideas part 2.pptxSherlyMaeNeri
 
Employee wellbeing at the workplace.pptx
Employee wellbeing at the workplace.pptxEmployee wellbeing at the workplace.pptx
Employee wellbeing at the workplace.pptxNirmalaLoungPoorunde1
 
Difference Between Search & Browse Methods in Odoo 17
Difference Between Search & Browse Methods in Odoo 17Difference Between Search & Browse Methods in Odoo 17
Difference Between Search & Browse Methods in Odoo 17Celine George
 
Alper Gobel In Media Res Media Component
Alper Gobel In Media Res Media ComponentAlper Gobel In Media Res Media Component
Alper Gobel In Media Res Media ComponentInMediaRes1
 
ENGLISH 7_Q4_LESSON 2_ Employing a Variety of Strategies for Effective Interp...
ENGLISH 7_Q4_LESSON 2_ Employing a Variety of Strategies for Effective Interp...ENGLISH 7_Q4_LESSON 2_ Employing a Variety of Strategies for Effective Interp...
ENGLISH 7_Q4_LESSON 2_ Employing a Variety of Strategies for Effective Interp...JhezDiaz1
 
Planning a health career 4th Quarter.pptx
Planning a health career 4th Quarter.pptxPlanning a health career 4th Quarter.pptx
Planning a health career 4th Quarter.pptxLigayaBacuel1
 
Solving Puzzles Benefits Everyone (English).pptx
Solving Puzzles Benefits Everyone (English).pptxSolving Puzzles Benefits Everyone (English).pptx
Solving Puzzles Benefits Everyone (English).pptxOH TEIK BIN
 
What is Model Inheritance in Odoo 17 ERP
What is Model Inheritance in Odoo 17 ERPWhat is Model Inheritance in Odoo 17 ERP
What is Model Inheritance in Odoo 17 ERPCeline George
 
Gas measurement O2,Co2,& ph) 04/2024.pptx
Gas measurement O2,Co2,& ph) 04/2024.pptxGas measurement O2,Co2,& ph) 04/2024.pptx
Gas measurement O2,Co2,& ph) 04/2024.pptxDr.Ibrahim Hassaan
 
call girls in Kamla Market (DELHI) 🔝 >༒9953330565🔝 genuine Escort Service 🔝✔️✔️
call girls in Kamla Market (DELHI) 🔝 >༒9953330565🔝 genuine Escort Service 🔝✔️✔️call girls in Kamla Market (DELHI) 🔝 >༒9953330565🔝 genuine Escort Service 🔝✔️✔️
call girls in Kamla Market (DELHI) 🔝 >༒9953330565🔝 genuine Escort Service 🔝✔️✔️9953056974 Low Rate Call Girls In Saket, Delhi NCR
 

Recently uploaded (20)

TataKelola dan KamSiber Kecerdasan Buatan v022.pdf
TataKelola dan KamSiber Kecerdasan Buatan v022.pdfTataKelola dan KamSiber Kecerdasan Buatan v022.pdf
TataKelola dan KamSiber Kecerdasan Buatan v022.pdf
 
ECONOMIC CONTEXT - LONG FORM TV DRAMA - PPT
ECONOMIC CONTEXT - LONG FORM TV DRAMA - PPTECONOMIC CONTEXT - LONG FORM TV DRAMA - PPT
ECONOMIC CONTEXT - LONG FORM TV DRAMA - PPT
 
Framing an Appropriate Research Question 6b9b26d93da94caf993c038d9efcdedb.pdf
Framing an Appropriate Research Question 6b9b26d93da94caf993c038d9efcdedb.pdfFraming an Appropriate Research Question 6b9b26d93da94caf993c038d9efcdedb.pdf
Framing an Appropriate Research Question 6b9b26d93da94caf993c038d9efcdedb.pdf
 
Earth Day Presentation wow hello nice great
Earth Day Presentation wow hello nice greatEarth Day Presentation wow hello nice great
Earth Day Presentation wow hello nice great
 
Hierarchy of management that covers different levels of management
Hierarchy of management that covers different levels of managementHierarchy of management that covers different levels of management
Hierarchy of management that covers different levels of management
 
Influencing policy (training slides from Fast Track Impact)
Influencing policy (training slides from Fast Track Impact)Influencing policy (training slides from Fast Track Impact)
Influencing policy (training slides from Fast Track Impact)
 
How to do quick user assign in kanban in Odoo 17 ERP
How to do quick user assign in kanban in Odoo 17 ERPHow to do quick user assign in kanban in Odoo 17 ERP
How to do quick user assign in kanban in Odoo 17 ERP
 
Judging the Relevance and worth of ideas part 2.pptx
Judging the Relevance  and worth of ideas part 2.pptxJudging the Relevance  and worth of ideas part 2.pptx
Judging the Relevance and worth of ideas part 2.pptx
 
Employee wellbeing at the workplace.pptx
Employee wellbeing at the workplace.pptxEmployee wellbeing at the workplace.pptx
Employee wellbeing at the workplace.pptx
 
Rapple "Scholarly Communications and the Sustainable Development Goals"
Rapple "Scholarly Communications and the Sustainable Development Goals"Rapple "Scholarly Communications and the Sustainable Development Goals"
Rapple "Scholarly Communications and the Sustainable Development Goals"
 
Difference Between Search & Browse Methods in Odoo 17
Difference Between Search & Browse Methods in Odoo 17Difference Between Search & Browse Methods in Odoo 17
Difference Between Search & Browse Methods in Odoo 17
 
Alper Gobel In Media Res Media Component
Alper Gobel In Media Res Media ComponentAlper Gobel In Media Res Media Component
Alper Gobel In Media Res Media Component
 
ENGLISH 7_Q4_LESSON 2_ Employing a Variety of Strategies for Effective Interp...
ENGLISH 7_Q4_LESSON 2_ Employing a Variety of Strategies for Effective Interp...ENGLISH 7_Q4_LESSON 2_ Employing a Variety of Strategies for Effective Interp...
ENGLISH 7_Q4_LESSON 2_ Employing a Variety of Strategies for Effective Interp...
 
Model Call Girl in Bikash Puri Delhi reach out to us at 🔝9953056974🔝
Model Call Girl in Bikash Puri  Delhi reach out to us at 🔝9953056974🔝Model Call Girl in Bikash Puri  Delhi reach out to us at 🔝9953056974🔝
Model Call Girl in Bikash Puri Delhi reach out to us at 🔝9953056974🔝
 
Model Call Girl in Tilak Nagar Delhi reach out to us at 🔝9953056974🔝
Model Call Girl in Tilak Nagar Delhi reach out to us at 🔝9953056974🔝Model Call Girl in Tilak Nagar Delhi reach out to us at 🔝9953056974🔝
Model Call Girl in Tilak Nagar Delhi reach out to us at 🔝9953056974🔝
 
Planning a health career 4th Quarter.pptx
Planning a health career 4th Quarter.pptxPlanning a health career 4th Quarter.pptx
Planning a health career 4th Quarter.pptx
 
Solving Puzzles Benefits Everyone (English).pptx
Solving Puzzles Benefits Everyone (English).pptxSolving Puzzles Benefits Everyone (English).pptx
Solving Puzzles Benefits Everyone (English).pptx
 
What is Model Inheritance in Odoo 17 ERP
What is Model Inheritance in Odoo 17 ERPWhat is Model Inheritance in Odoo 17 ERP
What is Model Inheritance in Odoo 17 ERP
 
Gas measurement O2,Co2,& ph) 04/2024.pptx
Gas measurement O2,Co2,& ph) 04/2024.pptxGas measurement O2,Co2,& ph) 04/2024.pptx
Gas measurement O2,Co2,& ph) 04/2024.pptx
 
call girls in Kamla Market (DELHI) 🔝 >༒9953330565🔝 genuine Escort Service 🔝✔️✔️
call girls in Kamla Market (DELHI) 🔝 >༒9953330565🔝 genuine Escort Service 🔝✔️✔️call girls in Kamla Market (DELHI) 🔝 >༒9953330565🔝 genuine Escort Service 🔝✔️✔️
call girls in Kamla Market (DELHI) 🔝 >༒9953330565🔝 genuine Escort Service 🔝✔️✔️
 

HW1 MIT Fall 2005

  • 1. 6.867 Homework 1 Fall 2005 Due: Wednesday, September 28, in class. 10% off per day late. Don’t spend time doing hairy integrals by hand. If that’s the answer to a question, you can just leave it un-integrated. 1 A Tale of Two Estimators In this problem, we’re going to explore the bias-variance trade-off in a very simple setting. We have a set of unidimensional data, X1, . . . , Xn, drawn from the positive reals. We will consider two different models for its distribution: • Model 1: The data are drawn from a uniform distribution on the interval [0, b]. This model has a single positive real parameter, 0 < b. • Model 2: The data are drawn from a uniform distribution on the interval [a, b]. This model has two positive real parameters, a and b, such that 0 < a < b. We are interested in comparing estimates of the mean of the distribution, derived from each of these two models. Question 0: What’s the mean of each of the distributions? 1.1 The Right Stuff Let’s start by considering the situation in which the data were, in fact, drawn from an instance of the model under consideration: either a uniform distribution on [0, b∗] (for model 1) or a uniform distribution on [a∗, b∗] (for model 2). If you are, like Commander Toad1, “brave and bold, bold and brave,” then avert your eyes from the rest of this section and compute the bias, variance, and mean squared error of ML estimator of the mean of each model, on data drawn from this distribution, as a function of n. Otherwise, let’s take it step by step. As we saw in recitation, the ML estimator for b, ˆb = maxi Xi. By a similar argument, the ML estimator for a, ˆa = mini Xi. To understand the properties of these estimators we have to start by deriving their PDFs. The minimum and maximum of a data set are also known as their first and nth order statistics, and sometimes written X(1) and X(n). 1 Commander Toad In Space, Jane Yolen, Putnam, 1996 1
  • 2. 1.1.1 Using Model 1 In model 1, we just need to consider the distribution of ˆb. Generally speaking, the pdf of the maximum of a set of data drawn from pdf f, with cdf F, is fˆb(x) = nF(x)n−1 f(x) . The idea is that, if x is the maximum, then n − 1 of the other data values will have to be less than x, and the probability of that is F(x)n−1, and then one value will have to equal x, the probability of which is f(x). We multiply by n because there are n different ways to choose the data value that could be the maximum. Question 1: What is fˆb in the particular case where the data are drawn uniformly from 0 to b? Question 2: Write an expression for the expected value of ˆµ, as an integral. In fact, there’s a nice closed form expression, which you can use in the following questions: E[ˆµ] = b∗ 2 n (n + 1) . Question 3: What is the squared bias of ˆµ? Is this estimator unbiased? Is it asymptotically unbiased? (Reminder: bias2 (ˆθ) = (ED[ˆθ] − θ)2.) Question 4: Write an expression for the variance of ˆµ, as an integral. The closed form for the variance is V [ˆµ] = b∗2 4 n (n + 1)2(n + 2) . Question 5: What is the mean squared error of ˆµ? (Reminder: MSE(ˆθ) = bias2 (ˆθ) + var(ˆθ).) 1.1.2 Another estimator for Model 1 We might consider something other than the MLE for Model 1. Consider the estimator ˆµ = x(n)(n + 1) 2n . Question 6: Write an expression for the expected value of this version of ˆµ as an integral. Then solve the integral. Question 7: What is the squared bias of this estimator for ˆµ? Is this estimator unbiased? Is it asymptotically unbiased? Question 8: Write an expression for the variance of ˆµ as an integral. The closed form for the variance is V [ˆµ] = b∗2 4n(n + 2) . Question 9: What is the mean squared error of this version of ˆµ? Question 10: What are the relative advantages of the estimator from the previous section and this one? 2
  • 3. 1.1.3 Using Model 2 Now, let’s do the same thing, but for the MLE for model 2. We have to start by thinking about the joint distribution of MLE’s ˆa and ˆb. Generally speaking, the joint pdf of the minimum and the maximum of a set of data drawn from pdf f, with cdf F, is fˆa,ˆb(x, y) = n(n − 1)(F(y) − F(x))n−2 f(x)f(y) . Question 11: Explain in words why this makes sense. Question 12: What is fˆa,ˆb in the particular case where the data are drawn uniformly from a to b? Question 13: Write an expression for the expected value of ˆµ in terms of an integral. Here’s what it should integrate to: E[ˆµ] = a∗ + b∗ 2 . Question 14: What is the squared bias of ˆµ? Is this estimator unbiased? Is it asymptotically unbiased? Question 15: Write an expression for the variance of ˆµ in terms of an integral. The closed form for the variance is V [ˆµ] = (b∗ − a∗)2 2(n + 1)(n + 2) . Question 16: What is the mean squared error of ˆµ? 1.1.4 Comparing Models What if we have data that is actually drawn from the interval [0, 1]? Both models seem like reasonable choices. Question 17: Show plots that compare the bias, variance, and mse of each of the estimators we’ve considered on that data, as a function of n. (Use the formulas above; don’t do it by actually generating data). Write a paragraph in English explaining your results. What estimator would you use? Now, what if we have data that is actually drawn from the interval [.1, 1]? It seems like model 2 is the only reasonable choice. But is it? We already know the bias, variance, and mse for model 2 in this case. But what about the MLE and unbiased estimators for model 1? Let’s characterize the general behavior when we use the estimator ˆµ = x(n)(n + 1)/(2n) on data drawn from an interval [a∗, b∗]. Question 18: Write an expression for the expected value of ˆµ in terms of an integral. The closed form expression is E[ˆµ] = a∗ + b∗n 2n . Question 19: Explain in English why this answer makes sense. Question 20: What is the squared bias of this ˆµ? Explain in English why your answer makes sense. Consider how it behaves as a increases, and how it behaves as n increases. Question 21: Write an expression for the variance of this ˆµ in terms of an integral. 3
  • 4. The closed form for the variance is V [ˆµ] = (b∗ − a∗)2 4n(n + 2) . To save you some tedious algebra, we’ll tell you that the mean squared error of this ˆµ is (apologies for the ugliness; let us know if you find a beautiful rewrite) b∗2 n − 2a∗b∗n + a∗2 (2 − 2n + n3) 4n2(n + 2) . Question 22: Show plots that compare the bias, variance, and mse of this estimator with the regular model 2 estimator on data drawn from [0.1, 1], as a function of n. Are there circumstances in which it would be better to use this estimator? If so, what are they and why? If not, why not? Question 23: Show plots of mse of both estimators, as a function of n on data drawn from [.01, 1] and on data drawn from [.2, 1]. How do things change? Explain why this makes sense. 2 Bayesian Estimation 2.1 Dirichlet Prior Exercise borrowed from Stat180 at UCLA. The Dirichlet distribution is a multivariate version of the Beta distribution. When we have a coin with two outcomes, we really only need a single parameter θ to model the probability of heads. But now let’s consider a “thick” coin that has three possible outcomes: heads, tails, and edge. Now we need two parameters: θh is the probability of heads, θt is the probability of tails, and then the probability of an edge is 1 − θh − θt. The random variables (V, W) ∈ [0, 1] and such that V + W ≤ 1 have a Dirichlet distribution with parameters α1, α2, α3 if their joint density is f(v, w) = vα1−1 wα2−1 (1 − v − w)α3−1 Γ(α1 + α2 + α3) Γ(α1)Γ(α2)Γ(α3) . This is a direct generalization of the Beta distribution. Question 24: If (θh, θt) have a Dirichlet distribution as above, what is the marginal distribution of θh? Question 25: Suppose you are playing with a thick coin, and get results X1 . . . Xn, resulting in H heads and T tails out of n throws. Given θh and θt the random variables H and T have a multinomial distribution: Pr(h, t) = n! h!t!(n − h − t)! θh hθt t(1 − θh − θt)n−h−t . Assume a uniform prior on the simplex of possible values of θh and θt. What is the posterior distribution for θh and θt? Question 26: In this same setting, what is the predictive distribution for getting another head? That is, what’s Pr(Xn+1 = heads|X1 . . . Xn)? Question 27: Now assume a Dirichlet prior for θh and θt with parameters α1, α2, α3. What is the posterior in this case? Question 28: What is the predictive distribution? Question 29: If you assume a squared-error loss, what is your estimate of θh and θt? Question 30: What happens as n → ∞? 4
  • 5. 3 Gaussian distributions Question 31: (DeGroot) Consider a normal distribution for which the value of the mean µ is unknown and the variance is 4, and suppose that the prior distribution of µ is a normal distribution whose variance is 9. How large a random sample must be taken from the given normal distribution in order to be able to specify an interval having a length of 1 unit such that the probability that µ lies in this interval is at least 0.95? 4 Presidential debate (Gelman) On September 25, 1988, the evening of a presidential campaign debate, ABC News conducted a survey of registered voters in the United States; 639 persons were polled before the debate, and 639 different persons were polled after. The results are displayed below: Survey Bush Dukakis No opinion/other Total pre-debate 294 307 38 639 post-debate 288 332 19 639 Assume the surveys are independent simple random samples from the population of registered voters. We will ignore the “no opinion” cases and model the data with two different binomial distributions. For j = 1, 2, let αj be the proportion of voters who preferred Bush, out of those who had a preference for either Bush or Dukakis at the time of survey j. Question 32: What is the posterior probability that there was a shift toward Bush? 5 Decision theory Question 33: (DeGroot) Suppose that a person is going to sell Fizzy Cola at a football game and must decide in advance how much to order. Suppose that he makes a gain of m cents on each quart that he sells at the game but suffers a loss of c cents on each quart that he has ordered but does not sell. Assume that the demand for Fizzy Cola at the game, as measured in quarts, is a continuous random variable X with PDF f and CDF F, show that his expected profit will be maximized if he orders an amount α such that F(α) = m/(m + c). (Duda, Hart, Stork) Suppose we use a randomized decision rule, which chooses action ai with probability Pr(ai|x). Question 34: Show that the resulting risk is given by [ m i=1 R(ai|x) Pr(ai|x)] Pr(x)dx . Question 35: What choice of Pr(ai|x) minimizes R in general? In many problems, one has the option either to assign the pattern to one of c classes, or to reject it as being unrecognizable. If the cost for rejects is not too high, rejection may be a desirable action. Let the cost for a correct answer be 0, for an incorrect answer be λs, and for a reject be λr. Question 36: Under what conditions on Pr(ci|x), λs and λr should we reject? 5