3 The Gauge Integral

3.1 Introduction

Much of calculus deals with the interplay between differentiation and integration. The antiquated term "antidifferentiation" emphasizes the fact that differentiation and integration are inverses of one another. We will take it for granted that readers are acquainted with the mechanics of integration. The current chapter develops just enough integration theory to make our development of differentiation in Chap. 4 and the calculus of variations in Chap. 17 respectable. It is only fair to warn readers that in other chapters a few applications to probability and statistics will assume familiarity with properties of the expectation operator not covered here.

The first successful effort to put integration on a rigorous basis was undertaken by Riemann. In the early twentieth century, Lebesgue defined a more sophisticated integral that addresses many of the limitations of the Riemann integral. However, even Lebesgue's integral has its defects. In the past few decades, mathematicians such as Henstock and Kurzweil have expanded the definition of integration on the real line to include a wider variety of functions. The new integral emerging from these investigations is called the gauge integral or generalized Riemann integral [7, 68, 108, 193, 250, 255, 278]. The gauge integral subsumes the Riemann integral, the Lebesgue integral, and the improper integrals met in traditional advanced calculus courses. In contrast to the Lebesgue integral, the integrands of the gauge integral are not necessarily absolutely integrable.

K. Lange, Optimization, Springer Texts in Statistics 95, DOI 10.1007/978-1-4614-5838-8_3, © Springer Science+Business Media New York 2013
It would take us too far afield to develop the gauge integral in full generality. Here we will rest content with proving some of its elementary properties. One of the advantages of the gauge integral is that many theorems hold with fewer qualifications. The fundamental theorem of calculus is a case in point. The commonly stated version of the fundamental theorem concerns a differentiable function $f(x)$ on an interval $[a, b]$. As all students of calculus know,
\[ \int_a^b f'(x)\,dx = f(b) - f(a). \]
Although this version is true for the gauge integral, it does not hold for the Lebesgue integral because the mere fact that $f'(x)$ exists throughout $[a, b]$ does not guarantee that it is Lebesgue integrable.

This quick description of the gauge integral is not intended to imply that the gauge integral is uniformly superior to the Lebesgue integral and its extensions. Certainly, probability theory would be severely handicapped without the full flexibility of modern measure theory. Furthermore, the advanced theory of the gauge integral is every bit as difficult as the advanced theory of the Lebesgue integral. For pedagogical purposes, however, one can argue that a student's first exposure to the theory of integration should feature the gauge integral. As we shall see, many of the basic properties of the gauge integral flow directly from its definition. As an added dividend, gauge functions provide an alternative approach to some of the material of Chap. 2.

3.2 Gauge Functions and δ-Fine Partitions

The gauge integral is defined through gauge functions. A gauge function is nothing more than a positive function $\delta(t)$ defined on a finite interval $[a, b]$. In approximating the integral of a function $f(t)$ over $[a, b]$ by a finite Riemann sum, it is important to sample the function most heavily in those regions where it changes most rapidly.
Now by a Riemann sum we mean a sum
\[ S(f, \pi) = \sum_{i=0}^{n-1} f(t_i)(s_{i+1} - s_i), \]
where the mesh points $a = s_0 < s_1 < \cdots < s_n = b$ form a partition $\pi$ of $[a, b]$, and the tags $t_i$ are chosen so that $t_i \in [s_i, s_{i+1}]$. If $\delta(t_i)$ measures the rapidity of change of $f(t)$ near $t_i$, then it makes sense to take $\delta(t)$ small in regions of rapid change and to force $s_i$ and $s_{i+1}$ to belong to the interval $(t_i - \delta(t_i), t_i + \delta(t_i))$. A tagged partition with this property is called a δ-fine partition. Our first proposition relieves any worry that δ-fine partitions might fail to exist.
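The definitions above can be made concrete with a short numerical sketch. The code below is only an illustration under stated assumptions, not a construction taken from the text: the function names, the integrand $f(t) = t^{-1/2}$ (with $f(0) = 0$), and the gauge are all hypothetical choices. It searches for a δ-fine tagged partition by bisection, trying the left endpoint, midpoint, and right endpoint of each interval as candidate tags, and then evaluates the Riemann sum. The search is a heuristic and can fail for an awkward gauge, since the existence proof that follows is not constructive.

```python
import math

def delta_fine_partition(delta, a, b, max_depth=60):
    """Search for a delta-fine tagged partition of [a, b] by bisection.

    Returns a list of triples (s_i, s_next, t_i). An interval [u, v] is
    accepted as soon as one of three candidate tags t (left endpoint,
    midpoint, right endpoint) satisfies t - delta(t) < u and v < t + delta(t).
    """
    def build(u, v, depth):
        for t in (u, 0.5 * (u + v), v):
            if t - delta(t) < u and v < t + delta(t):
                return [(u, v, t)]
        if depth == 0:
            raise RuntimeError("no delta-fine piece found with these tags")
        mid = 0.5 * (u + v)
        return build(u, mid, depth - 1) + build(mid, v, depth - 1)

    return build(a, b, max_depth)

def riemann_sum(f, tagged):
    """Riemann sum S(f, pi): sum of f(t_i) * (s_{i+1} - s_i)."""
    return sum(f(t) * (v - u) for u, v, t in tagged)

# Hypothetical integrand: unbounded near 0, with the value at 0 set to 0.
def f(t):
    return 0.0 if t == 0.0 else 1.0 / math.sqrt(t)

# Hypothetical gauge: tiny at 0 and proportional to t elsewhere, so that
# only the tag t = 0 can cover the left endpoint of [0, 1].
def gauge(t):
    return 1e-4 if t == 0.0 else 0.01 * t

tagged = delta_fine_partition(gauge, 0.0, 1.0)
approx = riemann_sum(f, tagged)
print(len(tagged), approx)  # the sum should lie near the improper integral value 2
```

The gauge shrinks near 0, where the integrand changes fastest, which is exactly the sampling behavior the preceding paragraph asks for. A constant gauge could not control the Riemann sums of this unbounded integrand.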
Proposition 3.2.1 (Cousin's Lemma) For every gauge $\delta(t)$ on a finite interval $[a, b]$ there is a δ-fine partition.

Proof: Assume that $[a, b]$ lacks a δ-fine partition. Since we can construct a δ-fine partition of $[a, b]$ by appending a δ-fine partition of the half-interval $[(a + b)/2, b]$ to a δ-fine partition of the half-interval $[a, (a + b)/2]$, it follows that either $[a, (a + b)/2]$ or $[(a + b)/2, b]$ lacks a δ-fine partition. As in Example 2.3.1, we choose one of the half-intervals based on this failure and continue bisecting. This creates a nested sequence of intervals $[a_i, b_i]$ converging to a point $x$. If $i$ is large enough, then $[a_i, b_i] \subset (x - \delta(x), x + \delta(x))$, and the interval $[a_i, b_i]$ with tag $x$ is a δ-fine partition of itself. This contradicts the choice of $[a_i, b_i]$ and the assumption that the original interval $[a, b]$ lacks a δ-fine partition.

Before launching into our treatment of the gauge integral, we pause to gain some facility with gauge functions [108]. Here are three examples that illustrate their value.

Example 3.2.1 A Gauge Proof of Weierstrass' Theorem

Consider a real-valued continuous function $f(t)$ with domain $[a, b]$. Suppose that $f(t)$ does not attain its supremum on $[a, b]$. Then for each $t$ there exists a point $x \in [a, b]$ with $f(t) < f(x)$. By continuity there exists $\delta(t) > 0$ such that $f(y) < f(x)$ for all $y \in [a, b]$ with $|y - t| < \delta(t)$. Using $\delta(t)$ as a gauge, select a δ-fine partition $a = s_0 < s_1 < \cdots < s_n = b$ with tags $t_i \in [s_i, s_{i+1}]$ and designated points $x_i$ satisfying $f(t_i) < f(x_i)$. Let $x_{\max}$ be the point $x_i$ having the largest value $f(x_i)$. Because $x_{\max}$ lies in some interval $[s_i, s_{i+1}]$, we have $f(x_{\max}) < f(x_i)$. This contradiction discredits our assumption that $f(x)$ does not attain its supremum. A similar argument applies to the infimum.

Example 3.2.2 A Gauge Proof of the Heine-Borel Theorem

One can use Cousin's lemma to prove the Heine-Borel Theorem on the real line [278].
This theorem states that if $C$ is a compact set contained in the union $\cup_\alpha O_\alpha$ of a collection of open sets $O_\alpha$, then $C$ is actually contained in the union of a finite number of the $O_\alpha$. Suppose $C \subset [a, b]$. Define a gauge $\delta(t)$ so that the interval $(t - \delta(t), t + \delta(t))$ does not intersect $C$ when $t \notin C$ and $(t - \delta(t), t + \delta(t))$ is contained in some $O_\alpha$ when $t \in C$. Based on $\delta(t)$, select a δ-fine partition $a = s_0 < s_1 < \cdots < s_n = b$ with tags $t_i \in [s_i, s_{i+1}]$. By definition $C$ is contained in the union $\cup_{t_i \in C} U_i$, where $U_i$ is the set $O_\alpha$ covering $t_i$. The Heine-Borel theorem extends to compact sets in $\mathbb{R}^n$.

Example 3.2.3 A Gauge Proof of the Intermediate Value Theorem

Under the assumptions of Example 3.2.1 (a continuous real-valued function $f(t)$ on $[a, b]$), let $c$ be a number strictly between $f(a)$ and $f(b)$. If we assume that there is no $t \in [a, b]$ with $f(t) = c$, then for each $t$ there exists a positive number $\delta(t)$ such that either $f(x) < c$ for all
$x \in [a, b]$ with $|x - t| < \delta(t)$ or $f(x) > c$ for all $x \in [a, b]$ with $|x - t| < \delta(t)$. We now select a δ-fine partition $a = s_0 < s_1 < \cdots < s_n = b$ and observe that throughout each interval $[s_i, s_{i+1}]$ either $f(t) < c$ or $f(t) > c$. If to start $f(s_0) = f(a) < c$, then $f(s_1) < c$, which implies $f(s_2) < c$ and so forth until we get to $f(s_n) = f(b) < c$. This contradicts the assumption that $c$ lies strictly between $f(a)$ and $f(b)$. With minor differences, the same proof works when $f(a) > c$.

In preparation for our next example and for the fundamental theorem of calculus later in this chapter, we must define derivatives. A real-valued function $f(t)$ defined on an interval $[a, b]$ possesses a derivative $f'(c)$ at $c \in [a, b]$ provided the limit
\[ \lim_{t \to c} \frac{f(t) - f(c)}{t - c} = f'(c) \tag{3.1} \]
exists. At the endpoints $a$ and $b$, the limit is necessarily one sided. Taking a sequential view of convergence, definition (3.1) means that for every sequence $t_m$ converging to $c$ we must have
\[ \lim_{m \to \infty} \frac{f(t_m) - f(c)}{t_m - c} = f'(c). \]
In calculus, we learn the following rules for computing derivatives:

Proposition 3.2.2 If $f(t)$ and $g(t)$ are differentiable functions on $(a, b)$, then
\[ [\alpha f(t) + \beta g(t)]' = \alpha f'(t) + \beta g'(t) \]
\[ [f(t)g(t)]' = f'(t)g(t) + f(t)g'(t) \]
\[ \left[\frac{1}{f(t)}\right]' = -\frac{f'(t)}{f(t)^2}. \]
In the third formula we must assume $f(t) \neq 0$. Finally, if $g(t)$ maps into the domain of $f(t)$, then the functional composition $f \circ g(t)$ has derivative
\[ [f \circ g(t)]' = f' \circ g(t)\, g'(t). \]

Proof: We will prove the above sum, product, quotient, and chain rules in a broader context in Chap. 4. Our proofs will not rely on integration.

Example 3.2.4 Strictly Increasing Functions

Let $f(t)$ be a differentiable function on $[c, d]$ with strictly positive derivative. We now show that $f(t)$ is strictly increasing. For each $t \in [c, d]$ there exists $\delta(t) > 0$ such that
\[ \frac{f(x) - f(t)}{x - t} > 0 \tag{3.2} \]
for all $x \in [c, d]$ with $x \neq t$ and $|x - t| < \delta(t)$. According to Proposition 3.2.1, for any two points $a < b$ from $[c, d]$, there exists a δ-fine partition
\[ a = s_0 < s_1 < \cdots < s_n = b \]
of $[a, b]$ with tags $t_i \in [s_i, s_{i+1}]$. In view of inequality (3.2), at least one of the two inequalities $f(s_i) \le f(t_i) \le f(s_{i+1})$ must be strict. Thus, the telescoping sum
\[ f(b) - f(a) = \sum_{i=0}^{n-1} [f(s_{i+1}) - f(s_i)] \]
must be positive.

3.3 Definition and Basic Properties of the Integral

With later applications in mind, it will be convenient to define the gauge integral for vector-valued functions $f(x) : [a, b] \to \mathbb{R}^n$. In this context, $f(x)$ is said to have integral $I$ if for every $\epsilon > 0$ there exists a gauge $\delta(x)$ on $[a, b]$ such that
\[ \|S(f, \pi) - I\| < \epsilon \tag{3.3} \]
for all δ-fine partitions $\pi$. Our first order of business is to check that the integral is unique whenever it exists. Thus, suppose that the vector $J$ is a second possible value of the integral. Given $\epsilon > 0$ choose gauges $\delta_I(x)$ and $\delta_J(x)$ leading to inequality (3.3). The minimum $\delta(x) = \min\{\delta_I(x), \delta_J(x)\}$ is also a gauge, and any partition $\pi$ that is δ-fine is also $\delta_I$-fine and $\delta_J$-fine. Hence,
\[ \|I - J\| \le \|I - S(f, \pi)\| + \|S(f, \pi) - J\| < 2\epsilon. \]
Since $\epsilon$ is arbitrary, $J = I$.

One can also define $f(x)$ to be integrable if its Riemann sums are Cauchy in an appropriate sense.

Proposition 3.3.1 (Cauchy criterion) A function $f(x) : [a, b] \to \mathbb{R}^n$ is integrable if and only if for every $\epsilon > 0$ there exists a gauge $\delta(x) > 0$ such that
\[ \|S(f, \pi_1) - S(f, \pi_2)\| < \epsilon \tag{3.4} \]
for any two δ-fine partitions $\pi_1$ and $\pi_2$.

Proof: It is obvious that the Cauchy criterion is necessary for integrability. To show that it is sufficient, consider the sequence $\epsilon_m = m^{-1}$ and compatible sequence of gauges $\delta_m(x)$ determined by condition (3.4). We can force
the constraints $\delta_m(x) \le \delta_{m-1}(x)$ to hold by inductively replacing $\delta_m(x)$ by $\min\{\delta_{m-1}(x), \delta_m(x)\}$ whenever needed. Now select a $\delta_m$-fine partition $\pi_m$ for each $m$. Because the gauge sequence $\delta_m(x)$ is decreasing, every partition $\pi$ that is $\delta_m$-fine is also $\delta_{m-1}$-fine. Hence, the sequence of Riemann sums $S(f, \pi_m)$ is Cauchy and has a limit $I$ satisfying $\|S(f, \pi_m) - I\| \le m^{-1}$. Finally, given the potential integral $I$, we take an arbitrary $\epsilon > 0$ and choose $m$ so that $m^{-1} < \epsilon$. If $\pi$ is $\delta_m$-fine, then the inequality
\[ \|S(f, \pi) - I\| \le \|S(f, \pi) - S(f, \pi_m)\| + \|S(f, \pi_m) - I\| < 2\epsilon \]
completes the proof.

For two integrable functions $f(x)$ and $g(x)$, the gauge integral inherits the linearity property
\[ \int_a^b [\alpha f(x) + \beta g(x)]\,dx = \alpha \int_a^b f(x)\,dx + \beta \int_a^b g(x)\,dx \]
from its approximating Riemann sums. To prove this fact, take $\epsilon > 0$ and choose gauges $\delta_f(x)$ and $\delta_g(x)$ so that
\[ \Big\|S(f, \pi_f) - \int_a^b f(x)\,dx\Big\| < \epsilon, \qquad \Big\|S(g, \pi_g) - \int_a^b g(x)\,dx\Big\| < \epsilon \]
whenever $\pi_f$ is $\delta_f$-fine and $\pi_g$ is $\delta_g$-fine. If the tagged partition $\pi$ is δ-fine for the gauge $\delta(x) = \min\{\delta_f(x), \delta_g(x)\}$, then
\[ \Big\|S(\alpha f + \beta g, \pi) - \alpha \int_a^b f(x)\,dx - \beta \int_a^b g(x)\,dx\Big\| \le |\alpha| \Big\|S(f, \pi) - \int_a^b f(x)\,dx\Big\| + |\beta| \Big\|S(g, \pi) - \int_a^b g(x)\,dx\Big\| \le (|\alpha| + |\beta|)\epsilon. \]

The gauge integral also inherits obvious order properties. For example, $\int_a^b f(x)\,dx \ge 0$ whenever the integrand $f(x) \ge 0$ for all $x \in [a, b]$. In this case, the inequality $|S(f, \pi) - \int_a^b f(x)\,dx| < \epsilon$ implies
\[ 0 \le S(f, \pi) \le \int_a^b f(x)\,dx + \epsilon. \]
Since $\epsilon$ can be made arbitrarily small for $f(x)$ integrable, it follows that $\int_a^b f(x)\,dx \ge 0$. This nonnegativity property translates into the order property
\[ \int_a^b f(x)\,dx \le \int_a^b g(x)\,dx \]
for two integrable functions $f(x) \le g(x)$. In particular, when $f(x)$ and $|f(x)|$ are both integrable, we have
\[ \Big|\int_a^b f(x)\,dx\Big| \le \int_a^b |f(x)|\,dx. \]
For vector-valued functions, the analogous rule
\[ \Big\|\int_a^b f(x)\,dx\Big\| \le \int_a^b \|f(x)\|\,dx \tag{3.5} \]
is also inherited from the approximating Riemann sums. The reader can easily supply the proof using the triangle inequality of the Euclidean norm. It does not take much imagination to extend the definition of the gauge integral to matrix-valued functions, and inequality (3.5) applies in this setting as well.

One of the nicest features of the gauge integral is that one can perturb an integrable function at a countable number of points without changing the value of its integral. This property fails for the Riemann integral but is exhibited by the Lebesgue integral. To validate the property, it suffices to prove that a function that equals 0 except at a countable number of points has integral 0. Suppose $f(x)$ is such a function with exceptional points $x_1, x_2, \ldots$ and corresponding exceptional values $f_1, f_2, \ldots$. We now define a gauge $\delta(x)$ with value 1 on the nonexceptional points and values
\[ \delta(x_j) = \frac{\epsilon}{2^{j+2}[\|f_j\| + 1]} \]
at the exceptional points. If $\pi$ is a δ-fine partition, then $x_j$ can serve as a tag for at most two intervals $[s_i, s_{i+1}]$ of $\pi$ and each such interval has length less than $2\delta(x_j)$. It follows that
\[ \|S(f, \pi)\| \le 2 \sum_{j=1}^{\infty} \|f(x_j)\| \frac{2\epsilon}{2^{j+2}[\|f_j\| + 1]} \le \sum_{j=1}^{\infty} \frac{\epsilon}{2^j} = \epsilon \]
and therefore that $\int_a^b f(x)\,dx = 0$.

In practice, the interval additivity rule
\[ \int_a^c f(x)\,dx = \int_a^b f(x)\,dx + \int_b^c f(x)\,dx \tag{3.6} \]
is obviously desirable. There are three separate issues in proving it. First, given the existence of the integral over $[a, c]$, do the integrals over $[a, b]$ and $[b, c]$ exist? Second, if the integrals over $[a, b]$ and $[b, c]$ exist, does the integral over $[a, c]$ exist? Third, if the integrals over $[a, b]$ and $[b, c]$ exist, are they additive? The first question is best approached through Proposition 3.3.1.
For $\epsilon > 0$ there exists a gauge $\delta(x)$ such that
\[ \|S(f, \pi_1) - S(f, \pi_2)\| < \epsilon \]
for any two δ-fine partitions $\pi_1$ and $\pi_2$ of $[a, c]$. Given $\delta(x)$, take any two δ-fine partitions $\gamma_1$ and $\gamma_2$ of $[a, b]$ and a single δ-fine partition $\omega$ of $[b, c]$. The concatenated partitions $\gamma_1 \cup \omega$ and $\gamma_2 \cup \omega$ are δ-fine throughout $[a, c]$ and satisfy
\[ \|S(f, \gamma_1) - S(f, \gamma_2)\| = \|S(f, \gamma_1 \cup \omega) - S(f, \gamma_2 \cup \omega)\| < \epsilon. \]
According to the Cauchy criterion, the integral over $[a, b]$ therefore exists. A similar argument implies that the integral over $[b, c]$ also exists. Finally, the combination of these results shows that the integral exists over any interval $[u, v]$ contained within $[a, b]$.

For the converse, choose gauges $\delta_1(x)$ on $[a, b]$ and $\delta_2(x)$ on $[b, c]$ so that
\[ \Big\|S(f, \gamma) - \int_a^b f(x)\,dx\Big\| < \epsilon, \qquad \Big\|S(f, \omega) - \int_b^c f(x)\,dx\Big\| < \epsilon \]
for any $\delta_1$-fine partition $\gamma$ of $[a, b]$ and any $\delta_2$-fine partition $\omega$ of $[b, c]$. The concatenated partition $\pi = \gamma \cup \omega$ satisfies
\[ \Big\|S(f, \pi) - \int_a^b f(x)\,dx - \int_b^c f(x)\,dx\Big\| \le \Big\|S(f, \gamma) - \int_a^b f(x)\,dx\Big\| + \Big\|S(f, \omega) - \int_b^c f(x)\,dx\Big\| < 2\epsilon \tag{3.7} \]
because the Riemann sums satisfy $S(f, \pi) = S(f, \gamma) + S(f, \omega)$. This suggests defining a gauge $\delta(x)$ equal to $\delta_1(x)$ on $[a, b]$ and equal to $\delta_2(x)$ on $[b, c]$. The problem with this tactic is that some partitions of $[a, c]$ do not split at $b$. However, we can ensure a split by redefining $\delta(x)$ by
\[ \tilde{\delta}(x) = \begin{cases} \min\{\delta_1(b), \delta_2(b)\} & x = b \\ \min\{\delta(x), \tfrac{1}{2}|x - b|\} & x \neq b. \end{cases} \]
This forces $b$ to be the tag of its assigned interval, and we can if needed split this interval at $b$ and retain $b$ as tag of both subintervals. With $\delta(x)$ amended in this fashion, any δ-fine partition $\pi$ can be viewed as a concatenated partition $\gamma \cup \omega$ splitting at $b$. As such $\pi$ obeys inequality (3.7). This argument simultaneously proves that the integral over $[a, c]$ exists and satisfies the additivity property (3.6).

If the function $f(x)$ is vector-valued with $n$ components, then the integrability of $f(x)$ should imply the integrability of each of its components $f_i(x)$. Furthermore, we should be able to write
\[ \int_a^b f(x)\,dx = \begin{pmatrix} \int_a^b f_1(x)\,dx \\ \vdots \\ \int_a^b f_n(x)\,dx \end{pmatrix}. \]
Conversely, if its components are integrable, then $f(x)$ should be integrable as well. The inequalities
\[ \|S(f, \pi) - I\| \le \sum_{i=1}^{n} |S(f_i, \pi) - I_i| \le \sqrt{n}\, \|S(f, \pi) - I\| \]
based on Example 2.5.6 and Problem 3 of Chap. 2 are instrumental in proving this logical equivalence. Given that we can integrate component by component, for the remainder of this chapter we will deal exclusively with real-valued functions.

We have not actually shown that any function is integrable. The most obvious possibility is a constant. Fortunately, it is trivial to demonstrate that
\[ \int_a^b c\,dx = c(b - a). \]
Step functions are one rung up the hierarchy of functions. If
\[ f(x) = \sum_{i=0}^{n-1} c_i 1_{(s_i, s_{i+1}]}(x) \]
for $a = s_0 < s_1 < \cdots < s_n = b$, then our nascent theory allows us to evaluate
\[ \int_a^b f(x)\,dx = \sum_{i=0}^{n-1} \int_{s_i}^{s_{i+1}} c_i\,dx = \sum_{i=0}^{n-1} c_i (s_{i+1} - s_i). \]
This fact and the next technical proposition turn out to be the key to showing that continuous functions are integrable.

Proposition 3.3.2 Let $f(x)$ be a function with domain $[a, b]$. Suppose for every $\epsilon > 0$ there exist two integrable functions $g(x)$ and $h(x)$ satisfying $g(x) \le f(x) \le h(x)$ for all $x$ and
\[ \int_a^b h(x)\,dx \le \int_a^b g(x)\,dx + \epsilon. \]
Then $f(x)$ is integrable.

Proof: For $\epsilon > 0$, choose gauges $\delta_g(x)$ and $\delta_h(x)$ on $[a, b]$ so that
\[ \Big|S(g, \pi_g) - \int_a^b g(x)\,dx\Big| < \epsilon, \qquad \Big|S(h, \pi_h) - \int_a^b h(x)\,dx\Big| < \epsilon \]
for any $\delta_g$-fine partition $\pi_g$ and any $\delta_h$-fine partition $\pi_h$. If $\pi$ is a δ-fine partition for $\delta(x) = \min\{\delta_g(x), \delta_h(x)\}$, then the inequalities
\[ \int_a^b g(x)\,dx - \epsilon < S(g, \pi) \le S(f, \pi) \le S(h, \pi) < \int_a^b h(x)\,dx + \epsilon \le \int_a^b g(x)\,dx + 2\epsilon \]
trap $S(f, \pi)$ in an interval of length $3\epsilon$. Because the Riemann sum $S(f, \gamma)$ for any other δ-fine partition $\gamma$ is trapped in the same interval, the integral of $f(x)$ exists by the Cauchy criterion.

Proposition 3.3.3 Every continuous function $f(x)$ on $[a, b]$ is integrable.

Proof: In view of the uniform continuity of $f(x)$ on $[a, b]$, for every $\epsilon > 0$ there exists a $\delta > 0$ with $|f(x) - f(y)| < \epsilon$ when $|x - y| < \delta$. For the constant gauge $\delta(x) = \delta$ and a corresponding δ-fine partition $\pi$ with mesh points $s_0, \ldots, s_n$, let $m_i$ be the minimum and $M_i$ be the maximum of $f(x)$ on $[s_i, s_{i+1}]$. The step functions
\[ g(x) = \sum_{i=0}^{n-1} m_i 1_{(s_i, s_{i+1}]}(x), \qquad h(x) = \sum_{i=0}^{n-1} M_i 1_{(s_i, s_{i+1}]}(x) \]
then satisfy $g(x) \le f(x) \le h(x)$ except at the single point $a$. Furthermore,
\[ \int_a^b h(x)\,dx - \int_a^b g(x)\,dx \le \sum_{i=0}^{n-1} \epsilon (s_{i+1} - s_i) = \epsilon(b - a). \]
Application of Proposition 3.3.2 now completes the proof.

3.4 The Fundamental Theorem of Calculus

The fundamental theorem of calculus divides naturally into two parts. For the gauge integral, the first and more difficult part is easily proved by invoking what is called the straddle inequality. Let $f(x)$ be differentiable at the point $t \in [a, b]$. Then there exists $\delta(t) > 0$ such that
\[ \Big|\frac{f(x) - f(t)}{x - t} - f'(t)\Big| < \epsilon \]
for all $x \in [a, b]$ with $x \neq t$ and $|x - t| < \delta(t)$. If $u < t < v$ are two points straddling $t$ and located in $[a, b] \cap (t - \delta(t), t + \delta(t))$, then
\[ |f(v) - f(u) - f'(t)(v - u)| \le |f(v) - f(t) - f'(t)(v - t)| + |f(t) - f(u) - f'(t)(t - u)| \le \epsilon(v - t) + \epsilon(t - u) = \epsilon(v - u). \tag{3.8} \]
Inequality (3.8) also clearly holds when either $u = t$ or $v = t$.

Proposition 3.4.1 (Fundamental Theorem I) If $f(x)$ is differentiable throughout $[a, b]$, then
\[ \int_a^b f'(x)\,dx = f(b) - f(a). \]

Proof: Using the gauge $\delta(t)$ figuring in the straddle inequality (3.8), select a δ-fine partition $\pi$ with mesh points $a = s_0 < s_1 < \cdots < s_n = b$ and tags $t_i \in [s_i, s_{i+1}]$. Application of the inequality and telescoping yield
\[ |f(b) - f(a) - S(f', \pi)| = \Big|\sum_{i=0}^{n-1} [f(s_{i+1}) - f(s_i) - f'(t_i)(s_{i+1} - s_i)]\Big| \le \sum_{i=0}^{n-1} |f(s_{i+1}) - f(s_i) - f'(t_i)(s_{i+1} - s_i)| \le \sum_{i=0}^{n-1} \epsilon (s_{i+1} - s_i) = \epsilon(b - a). \]
This demonstrates that $f'(x)$ has integral $f(b) - f(a)$.

The first half of the fundamental theorem remains valid for a continuous function $f(x)$ that is differentiable except on a countable set $N$ [250]. Since changing an integrand at a countable number of points does not alter its integral, it suffices to prove that
\[ f(b) - f(a) = \int_a^b g(t)\,dt, \quad \text{where} \quad g(t) = \begin{cases} 0 & t \in N \\ f'(t) & t \notin N. \end{cases} \]
Suppose $\epsilon > 0$ is given. For $t \notin N$ define the gauge value $\delta(t)$ to satisfy the straddle inequality. Enumerate the points $t_j$ of $N$, and define $\delta(t_j) > 0$ so that $|f(t_j) - f(t_j + s)| < \epsilon 2^{-j-2}$ whenever $|s| < \delta(t_j)$. Now select a δ-fine partition $\pi$ with mesh points $a = s_0 < s_1 < \cdots < s_n = b$ and tags $r_i \in [s_i, s_{i+1}]$. Break the sum
\[ f(b) - f(a) - S(g, \pi) = \sum_{i=0}^{n-1} [f(s_{i+1}) - f(s_i) - g(r_i)(s_{i+1} - s_i)] \]
into two parts. Let $S'$ denote the sum of the terms with tags $r_i \notin N$, and let $S''$ denote the sum of the terms with tags $r_i \in N$. As noted earlier, $|S'| \le \epsilon(b - a)$. Because a tag is attached to at most two subintervals, the second sum satisfies
\[ |S''| \le \sum_{r_i \in N} |f(s_{i+1}) - f(s_i)| \le \sum_{r_i \in N} \big( |f(s_{i+1}) - f(r_i)| + |f(r_i) - f(s_i)| \big) \le 2 \sum_{j=1}^{\infty} 2\epsilon 2^{-j-2} = \epsilon. \]
It follows that $|S' + S''| \le \epsilon(b - a + 1)$ and therefore that the stated integral exists and equals $f(b) - f(a)$.

In demonstrating the second half of the fundamental theorem, we will implicitly use the standard convention
\[ \int_d^c f(x)\,dx = -\int_c^d f(x)\,dx \]
for $c < d$. This convention will also be in force in proving the substitution formula.

Proposition 3.4.2 (Fundamental Theorem II) If a function $f(x)$ is integrable on $[a, b]$, then its indefinite integral
\[ F(t) = \int_a^t f(x)\,dx \]
has derivative $F'(t) = f(t)$ at any point $t$ where $f(x)$ is continuous. The derivative is taken as one sided if $t = a$ or $t = b$.

Proof: In deriving the interval additivity rule (3.6), we showed that the integral $F(t)$ exists. At a point $t$ where $f(x)$ is continuous, for any $\epsilon > 0$ there is a $\delta > 0$ such that $-\epsilon < f(x) - f(t) < \epsilon$ when $|x - t| < \delta$ and $x \in [a, b]$. Hence, the difference
\[ \frac{F(t + s) - F(t)}{s} - f(t) = \frac{1}{s} \int_t^{t+s} [f(x) - f(t)]\,dx \]
is less than $\epsilon$ and greater than $-\epsilon$ for $|s| < \delta$. In the limit as $s$ tends to 0, we recover $F'(t) = f(t)$.

The fundamental theorem of calculus has several important corollaries. These are covered in the next three propositions on the substitution rule, integration by parts, and finite Taylor expansions.
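Before turning to those corollaries, the straddle argument behind Proposition 3.4.1 can be checked numerically. The sketch below is an illustration only, under assumptions that are not in the text: the test function $f(x) = x^2$, the interval $[0, 3]$, the tolerance 0.05, and the random partition generator are all hypothetical choices. For this $f$ the straddle inequality holds with the constant gauge $\delta(t) = \epsilon$, because $|f(x) - f(t) - f'(t)(x - t)| = (x - t)^2$. Every δ-fine Riemann sum of $f'$ should therefore lie within $\epsilon(b - a)$ of $f(b) - f(a)$, whatever tags are drawn.

```python
import random

def random_fine_partition(a, b, delta, rng):
    """Random tagged partition of [a, b] with every interval shorter than delta.

    Because each tag lies inside its interval, both endpoints are within
    delta of the tag, so the partition is delta-fine for the constant gauge.
    """
    points = [a]
    while points[-1] < b:
        step = rng.uniform(0.1 * delta, 0.9 * delta)
        points.append(min(b, points[-1] + step))
    return [(u, v, rng.uniform(u, v)) for u, v in zip(points, points[1:])]

def riemann_sum(g, tagged):
    """Riemann sum S(g, pi): sum of g(t_i) * (s_{i+1} - s_i)."""
    return sum(g(t) * (v - u) for u, v, t in tagged)

def f(x):
    return x * x       # hypothetical differentiable function

def fprime(x):
    return 2.0 * x     # its derivative, which plays the role of the integrand

a, b, eps = 0.0, 3.0, 0.05
rng = random.Random(0)  # fixed seed so that the run is reproducible

errors = []
for _ in range(100):
    # For f(x) = x^2 the constant gauge delta(t) = eps satisfies (3.8).
    tagged = random_fine_partition(a, b, eps, rng)
    errors.append(abs(f(b) - f(a) - riemann_sum(fprime, tagged)))

worst_error = max(errors)
print(worst_error, eps * (b - a))  # worst observed error versus the bound
```

The point of the experiment is that the bound does not depend on how the tags are placed inside their intervals, which is what the telescoping argument in the proof predicts.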
Proposition 3.4.3 (Substitution Rule) Suppose $f(x)$ is differentiable on $[a, b]$, $g(x)$ is differentiable on $[c, d]$, and the image of $[c, d]$ under $g(x)$ is contained within $[a, b]$. Then
\[ \int_{g(c)}^{g(d)} f'(y)\,dy = \int_c^d f'[g(x)]\, g'(x)\,dx. \]

Proof: Part I of the fundamental theorem and the chain rule identity
\[ \{f[g(x)]\}' = f'[g(x)]\, g'(x) \]
imply that both integrals have value $f[g(d)] - f[g(c)]$.

Proposition 3.4.4 (Integration by Parts) Suppose $f(x)$ and $g(x)$ are differentiable on $[a, b]$. Then $f'(x)g(x)$ is integrable on $[a, b]$ if and only if $f(x)g'(x)$ is integrable on $[a, b]$. Furthermore, the two integrals are related by the identity
\[ \int_a^b f'(x)g(x)\,dx + \int_a^b f(x)g'(x)\,dx = f(b)g(b) - f(a)g(a). \]

Proof: The product rule for derivatives is
\[ [f(x)g(x)]' = f'(x)g(x) + f(x)g'(x). \]
If two of three members of this identity are integrable, then the third is as well. Since part I of the fundamental theorem entails
\[ \int_a^b [f(x)g(x)]'\,dx = f(b)g(b) - f(a)g(a), \]
the proposition follows.

The derivative of a function may itself be differentiable. Indeed, it makes sense to speak of the $k$th-order derivative of a function $f(x)$ if $f(x)$ is sufficiently smooth. Traditionally, the second-order derivative is denoted $f''(x)$ and an arbitrary $k$th-order derivative by $f^{(k)}(x)$. We can use these extra derivatives to good effect in approximating $f(x)$ locally. The next proposition makes this clear and offers an explicit estimate of the error in a finite Taylor expansion of $f(x)$.

Proposition 3.4.5 (Taylor Expansion) Suppose $f(x)$ has a derivative of order $k + 1$ on an open interval around the point $y$. Then for all $x$ in the interval, we have
\[ f(x) = f(y) + \sum_{j=1}^{k} \frac{1}{j!} f^{(j)}(y)(x - y)^j + R_k(x), \tag{3.9} \]
where the remainder
\[ R_k(x) = \frac{(x - y)^{k+1}}{k!} \int_0^1 f^{(k+1)}[y + t(x - y)](1 - t)^k\,dt. \]
If $|f^{(k+1)}(z)| \le b$ for all $z$ between $x$ and $y$, then
\[ |R_k(x)| \le \frac{b|x - y|^{k+1}}{(k + 1)!}. \tag{3.10} \]

Proof: When $k = 0$, the Taylor expansion (3.9) reads
\[ f(x) = f(y) + (x - y) \int_0^1 f'[y + t(x - y)]\,dt \]
and follows from the fundamental theorem of calculus and the chain rule. Induction and the integration-by-parts formula
\[ \int_0^1 f^{(k)}[y + t(x - y)](1 - t)^{k-1}\,dt = -\frac{1}{k} f^{(k)}[y + t(x - y)](1 - t)^k \Big|_0^1 + \frac{x - y}{k} \int_0^1 f^{(k+1)}[y + t(x - y)](1 - t)^k\,dt = \frac{1}{k} f^{(k)}(y) + \frac{x - y}{k} \int_0^1 f^{(k+1)}[y + t(x - y)](1 - t)^k\,dt \]
now validate the general expansion (3.9). The error estimate follows directly from the bound $|f^{(k+1)}(z)| \le b$ and the integral
\[ \int_0^1 (1 - t)^k\,dt = \frac{1}{k + 1}. \]

3.5 More Advanced Topics in Integration

Within the confines of a single chapter, it is impossible to develop rigorously all of the properties of the gauge integral. In this section we will discuss briefly four topics: (a) integrals over unbounded intervals, (b) improper integrals and Hake's theorem, (c) the interchange of limits and integrals, and (d) multidimensional integrals and Fubini's theorem.

Defining the integral of a function over an unbounded interval requires several minor adjustments. First, the real line is extended to include the points $\pm\infty$. Second, a gauge function $\delta(x)$ is now viewed as mapping $x$ to an open interval containing $x$. The associated interval may be infinite; indeed, it must be infinite if $x$ equals $\pm\infty$. In a δ-fine partition $\pi$, the
interval $I_j$ containing the tag $x_j$ is contained in $\delta(x_j)$. The length of an infinite interval $I_j$ is defined to be 0 in an approximating Riemann sum $S(f, \pi)$ to avoid infinite contributions to the sum. Likewise, the integrand $f(x)$ is assigned the value 0 at $x = \pm\infty$.

This extended definition carries with it all the properties we expect. Its most remarkable consequence is that it obliterates the distinction between proper and improper integrals. Hake's theorem provides the link. If we allow $a$ and $b$ to be infinite as well as finite, then Hake's theorem says a function $f(x)$ is integrable over $(a, b)$ if and only if either of the two limits
\[ \lim_{c \to a} \int_c^b f(x)\,dx \quad \text{or} \quad \lim_{c \to b} \int_a^c f(x)\,dx \]
exists. If either limit exists, then $\int_a^b f(x)\,dx$ equals that limit. For instance, the integral
\[ \int_1^\infty \frac{1}{x^2}\,dx = \lim_{c \to \infty} \int_1^c \frac{1}{x^2}\,dx = \lim_{c \to \infty} \Big( -\frac{1}{x} \Big) \Big|_1^c = 1 \]
exists and has the indicated limit by this reasoning.

Example 3.5.1 Existence of $\int_0^\infty \operatorname{sinc}(x)\,dx$

Consider the integral of $\operatorname{sinc}(x) = \sin(x)/x$ over the interval $(0, \infty)$. Because $\operatorname{sinc}(x)$ is continuous throughout $[0, 1]$ with limit 1 as $x$ approaches 0, the integral over $[0, 1]$ is defined. Hake's theorem and integration by parts show that the integral
\[ \int_1^\infty \frac{\sin x}{x}\,dx = \lim_{c \to \infty} \int_1^c \frac{\sin x}{x}\,dx = \lim_{c \to \infty} \Big( -\frac{\cos x}{x} \Big|_1^c - \int_1^c \frac{\cos x}{x^2}\,dx \Big) = \cos 1 - \int_1^\infty \frac{\cos x}{x^2}\,dx \]
exists provided the integral of $x^{-2} \cos x$ exists over $(1, \infty)$. We will demonstrate this fact in a moment. If we accept it, then it is clear that the integral of $\operatorname{sinc}(x)$ over $(0, \infty)$ exists as well. As we shall find in Example 3.5.4, this integral equals $\pi/2$. In contrast to these positive results, $\operatorname{sinc}(x)$ is not absolutely integrable over $(0, \infty)$. Finally, we note in passing that the substitution rule gives
\[ \int_0^\infty \frac{\sin cx}{x}\,dx = \int_0^\infty \frac{\sin y}{c^{-1} y}\, c^{-1}\,dy = \int_0^\infty \frac{\sin y}{y}\,dy = \frac{\pi}{2} \]
for any $c > 0$.
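The contrast between conditional and absolute integrability in Example 3.5.1 can be seen numerically. The following sketch is an illustration under assumptions that are not part of the text: it uses a hand-written composite Simpson rule as the quadrature method, and the truncation points 20 and 200 and the step size are arbitrary choices. It computes the partial integrals $\int_0^c \operatorname{sinc}(x)\,dx$ appearing in Hake's theorem alongside the corresponding partial integrals of $|\operatorname{sinc}(x)|$.

```python
import math

def simpson(g, a, b, n):
    """Composite Simpson rule on [a, b] with n subintervals (n made even)."""
    if n % 2:
        n += 1
    h = (b - a) / n
    total = g(a) + g(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * g(a + i * h)
    return total * h / 3

def sinc(x):
    # sin(x)/x extended continuously by the value 1 at x = 0
    return 1.0 if x == 0.0 else math.sin(x) / x

def abs_sinc(x):
    return abs(sinc(x))

upper_limits = [20.0, 200.0]  # hypothetical truncation points c
# A step size of about 0.01 is used for every truncation point.
partial = {c: simpson(sinc, 0.0, c, int(100 * c)) for c in upper_limits}
partial_abs = {c: simpson(abs_sinc, 0.0, c, int(100 * c)) for c in upper_limits}

for c in upper_limits:
    print(c, partial[c], partial_abs[c], math.pi / 2)
```

The signed partial integrals settle near $\pi/2$ as $c$ grows, while the partial integrals of the absolute value keep increasing, roughly like a multiple of $\ln c$. This is consistent with the claim that $\operatorname{sinc}(x)$ is integrable but not absolutely integrable over $(0, \infty)$, although a finite computation can only suggest the divergence.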
We now ask under what circumstances the formula
\[ \lim_{n \to \infty} \int_a^b f_n(x)\,dx = \int_a^b \lim_{n \to \infty} f_n(x)\,dx \tag{3.11} \]
is valid. The two relevant theorems permitting the interchange of limits and integrals are the monotone convergence theorem and the dominated convergence theorem. In the monotone convergence theorem, we are given an increasing sequence $f_n(x)$ of integrable functions that converge to a finite limit for each $x$. Formula (3.11) is true in this setting provided
\[ \sup_n \int_a^b f_n(x)\,dx < \infty. \]
In the dominated convergence theorem, we assume the sequence $f_n(x)$ is trapped between two integrable functions $g(x)$ and $h(x)$ in the sense that
\[ g(x) \le f_n(x) \le h(x) \]
for all $n$ and $x$. If $\lim_{n \to \infty} f_n(x)$ exists in this setting, then the interchange (3.11) is allowed. The choices
\[ f_n(x) = 1_{[1, n]}(x)\, x^{-2} \cos x, \qquad g(x) = -x^{-2}, \qquad h(x) = x^{-2} \]
in the dominated convergence theorem validate the existence of
\[ \int_1^\infty x^{-2} \cos x\,dx = \lim_{n \to \infty} \int_1^n x^{-2} \cos x\,dx. \]
We now consider two more substantive applications of the monotone and dominated convergence theorems.

Example 3.5.2 Johann Bernoulli's Integral

As an example of delicate maneuvers in integration, consider the integral
\[ \begin{aligned} \int_0^1 \frac{1}{x^x}\,dx &= \int_0^1 e^{-x \ln x}\,dx \\ &= \int_0^1 \sum_{n=0}^{\infty} \frac{(-x \ln x)^n}{n!}\,dx \\ &= \sum_{n=0}^{\infty} \frac{1}{n!} \int_0^1 (-x \ln x)^n\,dx. \end{aligned} \]
The reader will notice the application of the monotone convergence theorem in passing from the second to the third line above. Further progress can be made by applying the integration by parts result
\[ \int_0^1 x^m \ln^n x\,dx = -\frac{n}{m + 1} \int_0^1 x^{m+1} \frac{\ln^{n-1} x}{x}\,dx = -\frac{n}{m + 1} \int_0^1 x^m \ln^{n-1} x\,dx \]
recursively to evaluate
\[ \int_0^1 (-x \ln x)^n\,dx = \frac{n!}{(n + 1)^n} \int_0^1 x^n\,dx = \frac{n!}{(n + 1)^{n+1}}. \]
The pleasant surprise
\[ \int_0^1 \frac{1}{x^x}\,dx = \sum_{n=0}^{\infty} \frac{1}{(n + 1)^{n+1}} \]
emerges.

Example 3.5.3 Competing Definitions of the Gamma Function

The dominated convergence theorem allows us to derive Gauss's representation
\[ \Gamma(z) = \lim_{n \to \infty} \frac{n!\, n^z}{z(z + 1) \cdots (z + n)} \]
of the gamma function from Euler's representation
\[ \Gamma(z) = \int_0^\infty x^{z-1} e^{-x}\,dx. \]
As students of statistics are apt to know from their exposure to the beta distribution, repeated integration by parts and the fundamental theorem of calculus show that
\[ \int_0^1 x^{z-1} (1 - x)^n\,dx = \frac{n!}{z(z + 1) \cdots (z + n)}. \]
The substitution rule yields
\[ n^z \int_0^1 x^{z-1} (1 - x)^n\,dx = \int_0^n y^{z-1} \Big( 1 - \frac{y}{n} \Big)^n dy. \]
Thus, it suffices to prove that
\[ \int_0^\infty x^{z-1} e^{-x}\,dx = \lim_{n \to \infty} \int_0^n y^{z-1} \Big( 1 - \frac{y}{n} \Big)^n dy. \]
Given the limit
\[ \lim_{n \to \infty} \Big( 1 - \frac{y}{n} \Big)^n = e^{-y}, \]
we need an integrable function $h(y)$ that dominates the nonnegative sequence
\[ f_n(y) = 1_{[0, n]}(y)\, y^{z-1} \Big( 1 - \frac{y}{n} \Big)^n \]
from above in order to apply the dominated convergence theorem. In light of the inequality
\[ \Big( 1 - \frac{y}{n} \Big)^n \le e^{-y}, \]
the function $h(y) = y^{z-1} e^{-y}$ will serve.

Finally, the gauge integral extends to multiple dimensions, where a version of Fubini's theorem holds for evaluating multidimensional integrals via iterated integrals [278]. Consider a function $f(x, y)$ defined over the Cartesian product $H \times K$ of two multidimensional intervals $H$ and $K$. The intervals in question can be bounded or unbounded. If $f(x, y)$ is integrable over $H \times K$, then Fubini's theorem asserts that the integrals $\int_H f(x, y)\,dx$ and $\int_K f(x, y)\,dy$ exist and can be integrated over the remaining variable to give the full integral. In symbols,
\[ \int_{H \times K} f(x, y)\,dx\,dy = \int_K \Big[ \int_H f(x, y)\,dx \Big] dy = \int_H \Big[ \int_K f(x, y)\,dy \Big] dx. \]
Conversely, if either iterated integral exists, one would like to conclude that the full integral exists as well. This is true whenever $f(x, y)$ is nonnegative. Unfortunately, it is false in general, and two additional hypotheses introduced by Tonelli are needed to rescue the situation. One hypothesis is that $f(x, y)$ is measurable. Measurability is a technical condition that holds except for very pathological functions. The other hypothesis is that $|f(x, y)| \le g(x, y)$ for some nonnegative function $g(x, y)$ for which the iterated integral exists. This domination condition is shared with the dominated convergence theorem and forces $f(x, y)$ to be absolutely integrable.

Example 3.5.4 Evaluation of $\int_0^\infty \operatorname{sinc}(x)\,dx$

According to Fubini's theorem
\[ \int_0^n \int_0^{n\pi} e^{-xy} \sin x\,dx\,dy = \int_0^{n\pi} \int_0^n e^{-xy} \sin x\,dy\,dx. \tag{3.12} \]
The second of these iterated integrals
\[ \int_0^{n\pi} \int_0^n e^{-xy} \sin x\,dy\,dx = \int_0^{n\pi} \frac{\sin x}{x}\,dx - \int_0^{n\pi} \frac{\sin x}{x} e^{-nx}\,dx \]
tends to $\int_0^\infty \operatorname{sinc}(x)\,dx$ as $n$ tends to $\infty$ by a combination of Hake's theorem and the dominated convergence theorem.
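The inner integral of the left iterated integral in (3.12) equals

$$\begin{aligned}
\int_0^{n\pi} e^{-xy} \sin x \, dx
&= -e^{-xy} \cos x \Big|_0^{n\pi} - y e^{-xy} \sin x \Big|_0^{n\pi} - y^2 \int_0^{n\pi} e^{-xy} \sin x \, dx \\
&= 1 - e^{-n\pi y} \cos n\pi - y^2 \int_0^{n\pi} e^{-xy} \sin x \, dx
\end{aligned}$$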
  • 19. after two integrations by parts. It follows that

$$\int_0^{n\pi} e^{-xy} \sin x \, dx = \frac{1 - e^{-n\pi y} \cos n\pi}{1 + y^2} .$$

Finally, application of the dominated convergence theorem gives

$$\lim_{n \to \infty} \int_0^n \frac{1 - e^{-n\pi y} \cos n\pi}{1 + y^2} \, dy = \int_0^\infty \frac{1}{1 + y^2} \, dy = \frac{\pi}{2} .$$

Equating the limits of the right and left hand sides of the identity (3.12) therefore [278] yields the value of $\pi/2$ for $\int_0^\infty \operatorname{sinc}(x) \, dx$.

3.6 Problems

    1. Give an alternative proof of Cousin's lemma by letting $y$ be the supremum of the set of $x \in [a, b]$ such that $[a, x]$ possesses a $\delta$-fine partition.

    2. Use Cousin's lemma to prove that a continuous function $f(x)$ defined on an interval $[a, b]$ is uniformly continuous there [108]. (Hint: Given $\epsilon > 0$ define a gauge $\delta(x)$ by the requirement that $|f(y) - f(x)| < \frac{1}{2}\epsilon$ for all $y \in [a, b]$ with $|y - x| < 2\delta(x)$.)

    3. A possibly discontinuous function $f(x)$ has one-sided limits at each point $x \in [a, b]$. Show by Cousin's lemma that $f(x)$ is bounded on $[a, b]$.

    4. Suppose $f(x)$ has a nonnegative derivative $f'(x)$ throughout $[a, b]$. Prove that $f(x)$ is nondecreasing on $[a, b]$. Also prove that $f(x)$ is constant on $[a, b]$ if and only if $f'(x) = 0$ for all $x$. (Hint: These yield easily to the fundamental theorem of calculus. Alternatively for the first assertion, consider the function $f_\epsilon(x) = f(x) + \epsilon x$ for $\epsilon > 0$.)

    5. Using only the definition of the gauge integral, demonstrate that
       $$\int_a^b f(t) \, dt = \int_{-b}^{-a} f(-t) \, dt$$
       when either integral exists.
  • 20.
    6. Based on the standard definition of the natural logarithm
       $$\ln y = \int_1^y \frac{1}{x} \, dx ,$$
       prove that $\ln yz = \ln y + \ln z$ for all positive arguments $y$ and $z$. Use this property to verify that $\ln y^{-1} = -\ln y$ and that $\ln y^r = r \ln y$ for every rational number $r$.

    7. Apply Proposition 3.3.2 and demonstrate that every monotonic function defined on an interval $[a, b]$ is integrable on that interval.

    8. Let $f(x)$ be a continuous real-valued function on $[a, b]$. Show that there exists $c \in [a, b]$ with
       $$\int_a^b f(x) \, dx = f(c)(b - a) .$$

    9. In the Taylor expansion of Proposition 3.4.5, suppose $f^{(k+1)}(x)$ is continuous. Show that we can replace the remainder by
       $$R_k(x) = \frac{(x - y)^{k+1}}{(k+1)!} f^{(k+1)}(z)$$
       for some $z$ between $x$ and $y$.

    10. Suppose that $f(x)$ is infinitely differentiable and that $c$ and $r$ are positive numbers. If $|f^{(k)}(x)| \le c \, k! \, r^k$ for all $x$ near $y$ and all nonnegative integers $k$, then use Proposition 3.4.5 to show that
        $$f(x) = \sum_{k=0}^\infty \frac{f^{(k)}(y)}{k!} (x - y)^k$$
        near $y$. Explicitly determine the infinite Taylor series expansion of the function $f(x) = (1 + x)^{-1}$ around $x = 0$ and justify its convergence.

    11. Suppose the nonnegative continuous function $f(x)$ satisfies
        $$\int_a^b f(x) \, dx = 0 .$$
        Prove that $f(x)$ is identically 0 on $[a, b]$.

    12. Consider the function
        $$f(x) = \begin{cases} x^2 \sin(x^{-2}) & x \ne 0 \\ 0 & x = 0 . \end{cases}$$
        Show that $\int_0^1 f'(x) \, dx = \sin(1)$ and $\lim_{t \downarrow 0} \int_t^1 |f'(x)| \, dx = \infty$. Hence, $f'(x)$ is integrable but not absolutely integrable on $[0, 1]$.
  • 21.
    13. Prove that
        $$\int_0^\infty x^\alpha e^{-x^\beta} \, dx = \frac{1}{\beta} \Gamma\!\left(\frac{\alpha + 1}{\beta}\right)$$
        for $\alpha$ and $\beta$ positive [82].

    14. Justify the formula
        $$\int_0^1 \frac{\ln(1 - x)}{x} \, dx = -\sum_{n=1}^\infty \frac{1}{n^2} .$$

    15. Show that
        $$\int_0^\infty \frac{x^{z-1}}{e^x - 1} \, dx = \zeta(z) \Gamma(z) ,$$
        where $\zeta(z) = \sum_{n=1}^\infty n^{-z}$.

    16. Prove that the functions
        $$f(x) = \int_1^\infty \frac{\sin t}{x^2 + t^2} \, dt , \qquad g(x) = \int_0^\infty e^{-xt} \cos t \, dt , \quad x > 0 ,$$
        are continuous.

    17. Let $f_n(x)$ be a sequence of integrable functions on $[a, b]$ that converges uniformly to $f(x)$. Demonstrate that $f(x)$ is integrable and satisfies
        $$\lim_{n \to \infty} \int_a^b f_n(x) \, dx = \int_a^b f(x) \, dx .$$
        (Hints: For $\epsilon > 0$ small take $n$ large enough so that
        $$f_n(x) - \frac{\epsilon}{2(b - a)} \le f(x) \le f_n(x) + \frac{\epsilon}{2(b - a)}$$
        for all $x$.)

    18. Let $p$ and $q$ be positive integers. Justify the series expansion
        $$\int_0^1 \frac{x^{p-1}}{1 + x^q} \, dx = \sum_{n=0}^\infty \frac{(-1)^n}{p + nq}$$
        by the monotone convergence theorem. Be careful since the series does not converge absolutely [278].
  • 22.
    19. Suppose $f(x)$ is a continuous function on $\mathsf{R}$. Demonstrate that the sequence
        $$f_n(x) = \frac{1}{n} \sum_{k=0}^{n-1} f\!\left(x + \frac{k}{n}\right)$$
        converges uniformly to a continuous function on every finite interval $[a, b]$ [69].

    20. Prove that
        $$\int_0^1 \frac{x^b - x^a}{\ln x} \, dx = \ln \frac{b + 1}{a + 1}$$
        for $0 < a < b$ [278] by showing that both sides equal the double integral
        $$\iint_{[0,1] \times [a,b]} x^y \, dx \, dy .$$

    21. Integrate the function
        $$f(x, y) = \frac{y^2 - x^2}{(x^2 + y^2)^2}$$
        over the unit square $[0, 1] \times [0, 1]$. Show that the two iterated integrals disagree, and explain why Fubini's theorem fails.

    22. Suppose the two partial derivatives $\frac{\partial^2}{\partial x_1 \partial x_2} f(x)$ and $\frac{\partial^2}{\partial x_2 \partial x_1} f(x)$ exist and are continuous in a neighborhood of a point $y \in \mathsf{R}^2$. Show that they are equal at the point. (Hints: If they are not equal, take a small box around the point where their difference has constant sign. Now apply Fubini's theorem.)

    23. Demonstrate that
        $$\int_0^\infty e^{-x^2} \, dx = \frac{\sqrt{\pi}}{2}$$
        by evaluating the integral of $f(y) = y_2 e^{-(1 + y_1^2) y_2^2}$ over the rectangle $(0, \infty) \times (0, \infty)$.
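Before attempting proofs, some of the closed forms quoted in this section can be sanity-checked numerically. The following is a minimal sketch under stated assumptions, not part of the text: it uses a simple composite midpoint rule, the helper names `midpoint` and `gauss_gamma` are hypothetical, and the parameter values ($z = 2.5$, $a = 1$, $b = 3$, truncation of the infinite range at 10) are arbitrary choices for illustration. It compares Gauss's product from Example 3.5.3 with the library gamma function and checks the identities of Problems 20 and 23.

```python
import math

def midpoint(f, a, b, steps=20000):
    """Composite midpoint rule on [a, b]; the endpoints are never evaluated."""
    h = (b - a) / steps
    return h * sum(f(a + (i + 0.5) * h) for i in range(steps))

def gauss_gamma(z, n):
    """n! n^z / (z (z+1) ... (z+n)), evaluated in logarithms to avoid overflow."""
    log_value = math.lgamma(n + 1) + z * math.log(n)
    log_value -= math.fsum(math.log(z + k) for k in range(n + 1))
    return math.exp(log_value)

# Example 3.5.3: Gauss's product should approach Euler's integral Gamma(z).
z = 2.5  # arbitrary positive test value
gauss_error = abs(gauss_gamma(z, 100000) - math.gamma(z))

# Problem 20 with the arbitrary choice a = 1, b = 3.
a, b = 1.0, 3.0
lhs20 = midpoint(lambda x: (x**b - x**a) / math.log(x), 0.0, 1.0)
rhs20 = math.log((b + 1) / (a + 1))

# Problem 23: truncate (0, infinity) at 10, beyond which exp(-x^2) is negligible.
lhs23 = midpoint(lambda x: math.exp(-x * x), 0.0, 10.0)
rhs23 = math.sqrt(math.pi) / 2

print(gauss_error, abs(lhs20 - rhs20), abs(lhs23 - rhs23))
```

Small printed discrepancies support the identities but prove nothing; the midpoint rule is chosen here only because it avoids evaluating the integrands at endpoints where $\ln x$ vanishes or diverges.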