SlideShare a Scribd company logo
1 of 22
Download to read offline
September 2, 2014
Chapter 2: Roundoff Errors
Uri M. Ascher and Chen Greif
Department of Computer Science
The University of British Columbia
{ascher,greif}@cs.ubc.ca
Slides for the book
A First Course in Numerical Methods (published by SIAM, 2011)
http://bookstore.siam.org/cs07/
Goals and motivation
Goals of this chapter
• To describe how numbers are stored in a floating point system;
• to understand how standard floating point systems are designed and
implemented;
• to get a feeling for the almost random nature of rounding error;
• to identify different sources of roundoff error growth and explain how to
dampen their cumulative effect.
Uri Ascher & Chen Greif (UBC Computer Science) A First Course in Numerical Methods September 2, 2014 1 / 21
Goals and motivation
Roundoff errors
• Roundoff error is generally inevitable in numerical algorithms involving real
numbers.
• People often like to pretend they work with exact real numbers, ignoring
roundoff errors, which may allow concentration on other algorithmic aspects.
• However, carelessness may lead to disaster!
• This chapter provides an overview of roundoff errors, including floating point
number representation, rounding error and arithmetic, the IEEE standard,
and roundoff error accumulation.
Uri Ascher & Chen Greif (UBC Computer Science) A First Course in Numerical Methods September 2, 2014 2 / 21
Goals and motivation
Outline
• Floating point systems
• The IEEE standard
• Roundoff error accumulation
Uri Ascher & Chen Greif (UBC Computer Science) A First Course in Numerical Methods September 2, 2014 3 / 21
Floating point systems
Real number representation: decimal
8
3
≃
(
2
100
+
6
101
+
6
102
+
6
103
)
× 100
= 2.666 × 100
.
An instance of the floating point representation
fl(x) = ±d0.d1 · · · dt−1 × 10e
= ±
(
d0
100
+
d1
101
+ · · · +
dt−2
10t−2
+
dt−1
10t−1
)
× 10e
for t = 4, e = 0.
Note that d0 > 0: normalized floating point representation.
Uri Ascher & Chen Greif (UBC Computer Science) A First Course in Numerical Methods September 2, 2014 4 / 21
Floating point systems
Real number representation: binary
The decimal system is convenient for humans; but computers prefer binary.
• In binary the (normalized) representation of a real number x is
x = ±(1.d1d2d3 · · · dt−1dtdt+1 · · · ) × 2e
= ±(1 +
d1
2
+
d2
4
+
d3
8
+ · · · ) × 2e
,
with binary digits di = 0 or 1 and exponent e.
• Floating point representation: with a fixed number of digits t
fl(x) = ±(1. ˜d1
˜d2
˜d3 · · · ˜dt−1
˜dt) × 2e
• How to determine digits ˜di?
A popular strategy is Rounding:
fl(x) =
{
± 1.d1d2d3 · · · dt × 2e
dt+1 = 0
to nearest even otherwise
.
Alternatively, Chopping simply sets ˜di = di, i = 1, . . . , t.
Uri Ascher & Chen Greif (UBC Computer Science) A First Course in Numerical Methods September 2, 2014 5 / 21
Floating point systems
General floating point system
defined by (β, t, L, U), where:
β: base of the number system (for binary, β = 2; for decimal, β = 10);
t: precision (number of digits);
L: lower bound on exponent e;
U: upper bound on exponent e.
For each x ∈ R corresponds a normalized floating point representation
fl(x) = ±
(
d0
β0
+
d1
β1
+ · · · +
dt−1
βt−1
)
× βe
,
where 0 ≤ di ≤ β − 1, d0 > 0, and L ≤ e ≤ U.
Uri Ascher & Chen Greif (UBC Computer Science) A First Course in Numerical Methods September 2, 2014 6 / 21
Floating point systems
Example
(β, t, L, U) = (10, 4, −2, 1).
Largest number is 9.999 × 10U
= 99.99 10U+1
= 100
Smallest positive number is 1.000 × 10L
= 10L
= 0.01
Total different fractions: (β − 1) × βt−1
= 9 × 10 × 10 × 10 = 9, 000
Total different exponents: U − L + 1 = 4
Total different positive numbers: 4 × 9, 000 = 36, 000
Total different numbers in system: 72,001
Uri Ascher & Chen Greif (UBC Computer Science) A First Course in Numerical Methods September 2, 2014 7 / 21
Floating point systems
Error in floating point number representation
For the real number x = ± d0.d1d2d3 · · · dt−1dtdt+1 · · · × βe
,
• Chopping:
fl(x) = ± d0.d1d2d3 · · · dt−1 × βe
Then absolute error is clearly bounded by β1−t
· βe
.
• Rounding:
fl(x) =
{
± d0.d1d2d3 · · · dt−1 × βe
dt < β/2
±
(
d0.d1d2d3 · · · dt−1 + β1−t
)
× βe
dt > β/2
,
round to even in case of a tie.
Then absolute error is bounded by half the above, 1
2 · β1−t
· βe
.
So relative error is bounded by rounding unit
η =
1
2
· β1−t
.
Uri Ascher & Chen Greif (UBC Computer Science) A First Course in Numerical Methods September 2, 2014 8 / 21
Floating point systems
Floating point arithmetic
• Important to use exact rounding: if fl(x) and fl(y) are machine numbers,
then
fl(fl(x) ± fl(y)) = (fl(x) ± fl(y))(1 + ϵ1),
fl(fl(x) × fl(y)) = (fl(x) × fl(y))(1 + ϵ2),
fl(fl(x)/fl(y)) = (fl(x)/fl(y))(1 + ϵ3),
where |ϵi| ≤ η.
• Thus, the relative errors remain small after each such operation.
• This is achieved using guard digits (see Example 2.6 in text).
Uri Ascher & Chen Greif (UBC Computer Science) A First Course in Numerical Methods September 2, 2014 9 / 21
Floating point systems
Spacing of floating point numbers
Run program Example2 8Figure2 3
−15 −10 −5 0 5 10 15
−1
−0.8
−0.6
−0.4
−0.2
0
0.2
0.4
0.6
0.8
1
Note the uneven distribution, both for large exponents and near 0.
Uri Ascher & Chen Greif (UBC Computer Science) A First Course in Numerical Methods September 2, 2014 10 / 21
Floating point systems
Overflow, underflow, NaN
• Overflow: when e > U. (fatal)
• Underflow: when e < L. (non-fatal: set to 0 by default)
• NaN: Not-a-number. (e.g., 0/0)
Uri Ascher & Chen Greif (UBC Computer Science) A First Course in Numerical Methods September 2, 2014 11 / 21
Floating point systems
Outline
• Floating point systems
• The IEEE standard
• Roundoff error accumulation
Uri Ascher & Chen Greif (UBC Computer Science) A First Course in Numerical Methods September 2, 2014 12 / 21
The IEEE standard
IEEE standard
• Used by everyone today.
• Binary: use β = 2.
• Exact rounding: use guard digits to ensure that relative error in each
elementary arithmetic operation is bounded by η.
• NaN
• Overflow and underflow
• Subnormal numbers near 0.
• Many other features...
Uri Ascher & Chen Greif (UBC Computer Science) A First Course in Numerical Methods September 2, 2014 13 / 21
The IEEE standard
IEEE standard word
Use base β = 2.
Recall that with this base, in normalized numbers the first digit is always d0 = 1
and thus it need not be stored.
Double precision (64 bit word)
s = ± b =11-bit exponent f =52-bit fraction
Rounding unit:
η =
1
2
· 2−52
≈ 1.1 × 10−16
Can have also single precision (32 bit word).
Then t = 23 and η = 2−24
≈ 6.0 × 10−8
.
Uri Ascher & Chen Greif (UBC Computer Science) A First Course in Numerical Methods September 2, 2014 14 / 21
The IEEE standard
Outline
• Floating point systems
• The IEEE standard
• Roundoff error accumulation
Uri Ascher & Chen Greif (UBC Computer Science) A First Course in Numerical Methods September 2, 2014 15 / 21
Roundoff error accumulation
Roundoff error accumulation
• In general, if En is error after n elementary operations, cannot avoid linear
roundoff error accumulation
En ≃ c0nE0.
• Will not tolerate an exponential error growth such as
En ≃ cn
1 E0 for some constant c1 > 1
– an unstable algorithm.
• In some situations an individual error contribution is particularly large and
occasionally can be made smaller.
Uri Ascher & Chen Greif (UBC Computer Science) A First Course in Numerical Methods September 2, 2014 16 / 21
Roundoff error accumulation
Cancellation error
When two nearby numbers are subtracted, the relative error is large. That is, if
x ≃ y, then x − y has a large relative error.
This occurs in practice consistently and naturally, as we will see.
Function evaluation at nearby arguments:
• If g(·) is a smooth function then g(t) and g(t + h) are close for h small.
• But rounding errors in g(t) and g(t + h) are unrelated, so they can be of
opposing signs!
• For numerical differentiation, e.g.
f′
(x0) ≈
f(x0 + h) − f(x0)
h
, 0 < h ≪ 1,
if the relative rounding error in the representation is bounded by η then in
|g(t + h) − g(t)|/h it is bounded by 2η/h. This (tight) bound is much larger
than η when h is small.
Uri Ascher & Chen Greif (UBC Computer Science) A First Course in Numerical Methods September 2, 2014 17 / 21
Roundoff error accumulation
Example
Compute y = sinh(x) = 1
2 (ex
− e−x
).
• Naively computing y at an x near 0 may result in a (meaningless) 0.
• Instead use Taylor’s expansion
ex
= 1 + x +
x2
2
+
x3
6
+ . . .
to obtain
sinh(x) = x +
x3
6
+ . . . .
• If x is near 0, can use x + x3
6 , or even just x, for an effective approximation
to sinh(x).
So, a good library function would compute sinh(x) by the regular formula (using
exponentials) for |x| not very small, and by taking a term or two of the Taylor
expansion for |x| very small.
Uri Ascher & Chen Greif (UBC Computer Science) A First Course in Numerical Methods September 2, 2014 18 / 21
Roundoff error accumulation
Illustration
Compute y =
√
x + 1 −
√
x for x = 100, 000 in a 5-digit decimal arithmetic.
(So β = 10, t = 5.)
• Naively computing
√
x + 1 −
√
x results in the value 0.
• Instead use the identity
(
√
x + 1 −
√
x)(
√
x + 1 +
√
x)
(
√
x + 1 +
√
x)
=
1
√
x + 1 +
√
x
.
• In 5-digit decimal arithmetic calculating the right hand side expression yields
1.5811 × 10−3
: correct in the given accuracy.
Uri Ascher & Chen Greif (UBC Computer Science) A First Course in Numerical Methods September 2, 2014 19 / 21
Roundoff error accumulation
The rough appearance of roundoff errors
Run program Example2 2Figure2 2.m
0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1
−8
−6
−4
−2
0
2
4
6
x 10
−8 error in sampling exp(−t)(sin(2π t)+2) in single precision
t
roundofferror
Note how the sign of the floating point representation error at nearby arguments t
fluctuates as if randomly: as a function of t it is a “non-smooth” error.
Uri Ascher & Chen Greif (UBC Computer Science) A First Course in Numerical Methods September 2, 2014 20 / 21
Roundoff error accumulation
Avoiding overflow
• 2-norm computation: for a vector x = (x1, x2, . . . , xn)T
with n components,
∥x∥2 =
√
x2
1 + x2
2 + · · · + x2
n.
The vector x may have many components: accumulating damage may be
fatal.
• Suppose a ≫ b and we wish to compute c =
√
a2 + b2. Then it may be
better to compute
c = a
√
1 + (b/a)2.
• The norm calculation can be scaled in a similar manner.
Uri Ascher & Chen Greif (UBC Computer Science) A First Course in Numerical Methods September 2, 2014 21 / 21

More Related Content

What's hot

What's hot (20)

Side 2019 #6
Side 2019 #6Side 2019 #6
Side 2019 #6
 
Slides: Hypothesis testing, information divergence and computational geometry
Slides: Hypothesis testing, information divergence and computational geometrySlides: Hypothesis testing, information divergence and computational geometry
Slides: Hypothesis testing, information divergence and computational geometry
 
Statistics Assignment Help
Statistics Assignment HelpStatistics Assignment Help
Statistics Assignment Help
 
Statistical Assignment Help
Statistical Assignment HelpStatistical Assignment Help
Statistical Assignment Help
 
Slides econometrics-2018-graduate-2
Slides econometrics-2018-graduate-2Slides econometrics-2018-graduate-2
Slides econometrics-2018-graduate-2
 
Visual Techniques
Visual TechniquesVisual Techniques
Visual Techniques
 
Machine Learning in Actuarial Science & Insurance
Machine Learning in Actuarial Science & InsuranceMachine Learning in Actuarial Science & Insurance
Machine Learning in Actuarial Science & Insurance
 
Interpolation of Cubic Splines
Interpolation of Cubic SplinesInterpolation of Cubic Splines
Interpolation of Cubic Splines
 
Bayes 6
Bayes 6Bayes 6
Bayes 6
 
numerical methods
numerical methodsnumerical methods
numerical methods
 
Lect3cg2011
Lect3cg2011Lect3cg2011
Lect3cg2011
 
Lausanne 2019 #4
Lausanne 2019 #4Lausanne 2019 #4
Lausanne 2019 #4
 
Lagrange’s interpolation formula
Lagrange’s interpolation formulaLagrange’s interpolation formula
Lagrange’s interpolation formula
 
Statistics Coursework Help
Statistics Coursework HelpStatistics Coursework Help
Statistics Coursework Help
 
Polynomial functions
Polynomial functionsPolynomial functions
Polynomial functions
 
Introduction to Evidential Neural Networks
Introduction to Evidential Neural NetworksIntroduction to Evidential Neural Networks
Introduction to Evidential Neural Networks
 
Actuarial Pricing Game
Actuarial Pricing GameActuarial Pricing Game
Actuarial Pricing Game
 
Side 2019 #3
Side 2019 #3Side 2019 #3
Side 2019 #3
 
A mid point ellipse drawing algorithm on a hexagonal grid
A mid  point ellipse drawing algorithm on a hexagonal gridA mid  point ellipse drawing algorithm on a hexagonal grid
A mid point ellipse drawing algorithm on a hexagonal grid
 
Es272 ch5a
Es272 ch5aEs272 ch5a
Es272 ch5a
 

Similar to 1

NUMERICA METHODS 1 final touch summary for test 1
NUMERICA METHODS 1 final touch summary for test 1NUMERICA METHODS 1 final touch summary for test 1
NUMERICA METHODS 1 final touch summary for test 1musadoto
 
Monte Carlo Methods 2017 July Talk in Montreal
Monte Carlo Methods 2017 July Talk in MontrealMonte Carlo Methods 2017 July Talk in Montreal
Monte Carlo Methods 2017 July Talk in MontrealFred J. Hickernell
 
Connection between inverse problems and uncertainty quantification problems
Connection between inverse problems and uncertainty quantification problemsConnection between inverse problems and uncertainty quantification problems
Connection between inverse problems and uncertainty quantification problemsAlexander Litvinenko
 
Physics_150612_01
Physics_150612_01Physics_150612_01
Physics_150612_01Art Traynor
 
Pattern learning and recognition on statistical manifolds: An information-geo...
Pattern learning and recognition on statistical manifolds: An information-geo...Pattern learning and recognition on statistical manifolds: An information-geo...
Pattern learning and recognition on statistical manifolds: An information-geo...Frank Nielsen
 
Mm chap08 -_lossy_compression_algorithms
Mm chap08 -_lossy_compression_algorithmsMm chap08 -_lossy_compression_algorithms
Mm chap08 -_lossy_compression_algorithmsEellekwameowusu
 
Optimization Techniques.pdf
Optimization Techniques.pdfOptimization Techniques.pdf
Optimization Techniques.pdfanandsimple
 
Chapter 1 Errors and Approximations.ppt
Chapter 1 Errors  and Approximations.pptChapter 1 Errors  and Approximations.ppt
Chapter 1 Errors and Approximations.pptEyob Adugnaw
 
An Efficient And Safe Framework For Solving Optimization Problems
An Efficient And Safe Framework For Solving Optimization ProblemsAn Efficient And Safe Framework For Solving Optimization Problems
An Efficient And Safe Framework For Solving Optimization ProblemsLisa Muthukumar
 
Error analysis statistics
Error analysis   statisticsError analysis   statistics
Error analysis statisticsTarun Gehlot
 
Lecture: Monte Carlo Methods
Lecture: Monte Carlo MethodsLecture: Monte Carlo Methods
Lecture: Monte Carlo MethodsFrank Kienle
 
Dynamic Programming - Laughlin Lunch and Learn
Dynamic Programming - Laughlin Lunch and LearnDynamic Programming - Laughlin Lunch and Learn
Dynamic Programming - Laughlin Lunch and LearnBillie Rose
 
Fast Identification of Heavy Hitters by Cached and Packed Group Testing
Fast Identification of Heavy Hitters by Cached and Packed Group TestingFast Identification of Heavy Hitters by Cached and Packed Group Testing
Fast Identification of Heavy Hitters by Cached and Packed Group TestingRakuten Group, Inc.
 
Image formation
Image formationImage formation
Image formationpotaters
 

Similar to 1 (20)

1519 differentiation-integration-02
1519 differentiation-integration-021519 differentiation-integration-02
1519 differentiation-integration-02
 
Error analysis
Error analysisError analysis
Error analysis
 
NUMERICA METHODS 1 final touch summary for test 1
NUMERICA METHODS 1 final touch summary for test 1NUMERICA METHODS 1 final touch summary for test 1
NUMERICA METHODS 1 final touch summary for test 1
 
Monte Carlo Methods 2017 July Talk in Montreal
Monte Carlo Methods 2017 July Talk in MontrealMonte Carlo Methods 2017 July Talk in Montreal
Monte Carlo Methods 2017 July Talk in Montreal
 
Connection between inverse problems and uncertainty quantification problems
Connection between inverse problems and uncertainty quantification problemsConnection between inverse problems and uncertainty quantification problems
Connection between inverse problems and uncertainty quantification problems
 
Physics_150612_01
Physics_150612_01Physics_150612_01
Physics_150612_01
 
Pattern learning and recognition on statistical manifolds: An information-geo...
Pattern learning and recognition on statistical manifolds: An information-geo...Pattern learning and recognition on statistical manifolds: An information-geo...
Pattern learning and recognition on statistical manifolds: An information-geo...
 
Mm chap08 -_lossy_compression_algorithms
Mm chap08 -_lossy_compression_algorithmsMm chap08 -_lossy_compression_algorithms
Mm chap08 -_lossy_compression_algorithms
 
Optimization Techniques.pdf
Optimization Techniques.pdfOptimization Techniques.pdf
Optimization Techniques.pdf
 
Chapter 1 Errors and Approximations.ppt
Chapter 1 Errors  and Approximations.pptChapter 1 Errors  and Approximations.ppt
Chapter 1 Errors and Approximations.ppt
 
An Efficient And Safe Framework For Solving Optimization Problems
An Efficient And Safe Framework For Solving Optimization ProblemsAn Efficient And Safe Framework For Solving Optimization Problems
An Efficient And Safe Framework For Solving Optimization Problems
 
Error analysis statistics
Error analysis   statisticsError analysis   statistics
Error analysis statistics
 
Lecture: Monte Carlo Methods
Lecture: Monte Carlo MethodsLecture: Monte Carlo Methods
Lecture: Monte Carlo Methods
 
Es272 ch2
Es272 ch2Es272 ch2
Es272 ch2
 
Regression
RegressionRegression
Regression
 
Dynamic Programming - Laughlin Lunch and Learn
Dynamic Programming - Laughlin Lunch and LearnDynamic Programming - Laughlin Lunch and Learn
Dynamic Programming - Laughlin Lunch and Learn
 
Fast Identification of Heavy Hitters by Cached and Packed Group Testing
Fast Identification of Heavy Hitters by Cached and Packed Group TestingFast Identification of Heavy Hitters by Cached and Packed Group Testing
Fast Identification of Heavy Hitters by Cached and Packed Group Testing
 
Image formation
Image formationImage formation
Image formation
 
QMC Opening Workshop, High Accuracy Algorithms for Interpolating and Integrat...
QMC Opening Workshop, High Accuracy Algorithms for Interpolating and Integrat...QMC Opening Workshop, High Accuracy Algorithms for Interpolating and Integrat...
QMC Opening Workshop, High Accuracy Algorithms for Interpolating and Integrat...
 
2 simple regression
2   simple regression2   simple regression
2 simple regression
 

More from EasyStudy3

More from EasyStudy3 (20)

Week 7
Week 7Week 7
Week 7
 
Chapter 3
Chapter 3Chapter 3
Chapter 3
 
Week 6
Week 6Week 6
Week 6
 
Chapter2 slides-part 2-harish complete
Chapter2 slides-part 2-harish completeChapter2 slides-part 2-harish complete
Chapter2 slides-part 2-harish complete
 
L6
L6L6
L6
 
Chapter 5
Chapter 5Chapter 5
Chapter 5
 
2
22
2
 
Lec#4
Lec#4Lec#4
Lec#4
 
Chapter 12 vectors and the geometry of space merged
Chapter 12 vectors and the geometry of space mergedChapter 12 vectors and the geometry of space merged
Chapter 12 vectors and the geometry of space merged
 
Week 5
Week 5Week 5
Week 5
 
Chpater 6
Chpater 6Chpater 6
Chpater 6
 
Chapter 5
Chapter 5Chapter 5
Chapter 5
 
Lec#3
Lec#3Lec#3
Lec#3
 
Chapter 16 2
Chapter 16 2Chapter 16 2
Chapter 16 2
 
Chapter 5 gen chem
Chapter 5 gen chemChapter 5 gen chem
Chapter 5 gen chem
 
Topic 4 gen chem guobi
Topic 4 gen chem guobiTopic 4 gen chem guobi
Topic 4 gen chem guobi
 
Gen chem topic 3 guobi
Gen chem topic 3  guobiGen chem topic 3  guobi
Gen chem topic 3 guobi
 
Chapter 2
Chapter 2Chapter 2
Chapter 2
 
Gen chem topic 1 guobi
Gen chem topic 1 guobiGen chem topic 1 guobi
Gen chem topic 1 guobi
 
Chapter1 f19 bb(1)
Chapter1 f19 bb(1)Chapter1 f19 bb(1)
Chapter1 f19 bb(1)
 

Recently uploaded

KSHARA STURA .pptx---KSHARA KARMA THERAPY (CAUSTIC THERAPY)————IMP.OF KSHARA ...
KSHARA STURA .pptx---KSHARA KARMA THERAPY (CAUSTIC THERAPY)————IMP.OF KSHARA ...KSHARA STURA .pptx---KSHARA KARMA THERAPY (CAUSTIC THERAPY)————IMP.OF KSHARA ...
KSHARA STURA .pptx---KSHARA KARMA THERAPY (CAUSTIC THERAPY)————IMP.OF KSHARA ...M56BOOKSTORE PRODUCT/SERVICE
 
Call Girls in Dwarka Mor Delhi Contact Us 9654467111
Call Girls in Dwarka Mor Delhi Contact Us 9654467111Call Girls in Dwarka Mor Delhi Contact Us 9654467111
Call Girls in Dwarka Mor Delhi Contact Us 9654467111Sapana Sha
 
CARE OF CHILD IN INCUBATOR..........pptx
CARE OF CHILD IN INCUBATOR..........pptxCARE OF CHILD IN INCUBATOR..........pptx
CARE OF CHILD IN INCUBATOR..........pptxGaneshChakor2
 
Class 11 Legal Studies Ch-1 Concept of State .pdf
Class 11 Legal Studies Ch-1 Concept of State .pdfClass 11 Legal Studies Ch-1 Concept of State .pdf
Class 11 Legal Studies Ch-1 Concept of State .pdfakmcokerachita
 
Enzyme, Pharmaceutical Aids, Miscellaneous Last Part of Chapter no 5th.pdf
Enzyme, Pharmaceutical Aids, Miscellaneous Last Part of Chapter no 5th.pdfEnzyme, Pharmaceutical Aids, Miscellaneous Last Part of Chapter no 5th.pdf
Enzyme, Pharmaceutical Aids, Miscellaneous Last Part of Chapter no 5th.pdfSumit Tiwari
 
Alper Gobel In Media Res Media Component
Alper Gobel In Media Res Media ComponentAlper Gobel In Media Res Media Component
Alper Gobel In Media Res Media ComponentInMediaRes1
 
Accessible design: Minimum effort, maximum impact
Accessible design: Minimum effort, maximum impactAccessible design: Minimum effort, maximum impact
Accessible design: Minimum effort, maximum impactdawncurless
 
Science 7 - LAND and SEA BREEZE and its Characteristics
Science 7 - LAND and SEA BREEZE and its CharacteristicsScience 7 - LAND and SEA BREEZE and its Characteristics
Science 7 - LAND and SEA BREEZE and its CharacteristicsKarinaGenton
 
Employee wellbeing at the workplace.pptx
Employee wellbeing at the workplace.pptxEmployee wellbeing at the workplace.pptx
Employee wellbeing at the workplace.pptxNirmalaLoungPoorunde1
 
Presiding Officer Training module 2024 lok sabha elections
Presiding Officer Training module 2024 lok sabha electionsPresiding Officer Training module 2024 lok sabha elections
Presiding Officer Training module 2024 lok sabha electionsanshu789521
 
Paris 2024 Olympic Geographies - an activity
Paris 2024 Olympic Geographies - an activityParis 2024 Olympic Geographies - an activity
Paris 2024 Olympic Geographies - an activityGeoBlogs
 
The Most Excellent Way | 1 Corinthians 13
The Most Excellent Way | 1 Corinthians 13The Most Excellent Way | 1 Corinthians 13
The Most Excellent Way | 1 Corinthians 13Steve Thomason
 
Software Engineering Methodologies (overview)
Software Engineering Methodologies (overview)Software Engineering Methodologies (overview)
Software Engineering Methodologies (overview)eniolaolutunde
 
Presentation by Andreas Schleicher Tackling the School Absenteeism Crisis 30 ...
Presentation by Andreas Schleicher Tackling the School Absenteeism Crisis 30 ...Presentation by Andreas Schleicher Tackling the School Absenteeism Crisis 30 ...
Presentation by Andreas Schleicher Tackling the School Absenteeism Crisis 30 ...EduSkills OECD
 
Concept of Vouching. B.Com(Hons) /B.Compdf
Concept of Vouching. B.Com(Hons) /B.CompdfConcept of Vouching. B.Com(Hons) /B.Compdf
Concept of Vouching. B.Com(Hons) /B.CompdfUmakantAnnand
 

Recently uploaded (20)

KSHARA STURA .pptx---KSHARA KARMA THERAPY (CAUSTIC THERAPY)————IMP.OF KSHARA ...
KSHARA STURA .pptx---KSHARA KARMA THERAPY (CAUSTIC THERAPY)————IMP.OF KSHARA ...KSHARA STURA .pptx---KSHARA KARMA THERAPY (CAUSTIC THERAPY)————IMP.OF KSHARA ...
KSHARA STURA .pptx---KSHARA KARMA THERAPY (CAUSTIC THERAPY)————IMP.OF KSHARA ...
 
Call Girls in Dwarka Mor Delhi Contact Us 9654467111
Call Girls in Dwarka Mor Delhi Contact Us 9654467111Call Girls in Dwarka Mor Delhi Contact Us 9654467111
Call Girls in Dwarka Mor Delhi Contact Us 9654467111
 
9953330565 Low Rate Call Girls In Rohini Delhi NCR
9953330565 Low Rate Call Girls In Rohini  Delhi NCR9953330565 Low Rate Call Girls In Rohini  Delhi NCR
9953330565 Low Rate Call Girls In Rohini Delhi NCR
 
CARE OF CHILD IN INCUBATOR..........pptx
CARE OF CHILD IN INCUBATOR..........pptxCARE OF CHILD IN INCUBATOR..........pptx
CARE OF CHILD IN INCUBATOR..........pptx
 
Staff of Color (SOC) Retention Efforts DDSD
Staff of Color (SOC) Retention Efforts DDSDStaff of Color (SOC) Retention Efforts DDSD
Staff of Color (SOC) Retention Efforts DDSD
 
Class 11 Legal Studies Ch-1 Concept of State .pdf
Class 11 Legal Studies Ch-1 Concept of State .pdfClass 11 Legal Studies Ch-1 Concept of State .pdf
Class 11 Legal Studies Ch-1 Concept of State .pdf
 
Enzyme, Pharmaceutical Aids, Miscellaneous Last Part of Chapter no 5th.pdf
Enzyme, Pharmaceutical Aids, Miscellaneous Last Part of Chapter no 5th.pdfEnzyme, Pharmaceutical Aids, Miscellaneous Last Part of Chapter no 5th.pdf
Enzyme, Pharmaceutical Aids, Miscellaneous Last Part of Chapter no 5th.pdf
 
Alper Gobel In Media Res Media Component
Alper Gobel In Media Res Media ComponentAlper Gobel In Media Res Media Component
Alper Gobel In Media Res Media Component
 
Accessible design: Minimum effort, maximum impact
Accessible design: Minimum effort, maximum impactAccessible design: Minimum effort, maximum impact
Accessible design: Minimum effort, maximum impact
 
Science 7 - LAND and SEA BREEZE and its Characteristics
Science 7 - LAND and SEA BREEZE and its CharacteristicsScience 7 - LAND and SEA BREEZE and its Characteristics
Science 7 - LAND and SEA BREEZE and its Characteristics
 
Employee wellbeing at the workplace.pptx
Employee wellbeing at the workplace.pptxEmployee wellbeing at the workplace.pptx
Employee wellbeing at the workplace.pptx
 
Presiding Officer Training module 2024 lok sabha elections
Presiding Officer Training module 2024 lok sabha electionsPresiding Officer Training module 2024 lok sabha elections
Presiding Officer Training module 2024 lok sabha elections
 
Model Call Girl in Tilak Nagar Delhi reach out to us at 🔝9953056974🔝
Model Call Girl in Tilak Nagar Delhi reach out to us at 🔝9953056974🔝Model Call Girl in Tilak Nagar Delhi reach out to us at 🔝9953056974🔝
Model Call Girl in Tilak Nagar Delhi reach out to us at 🔝9953056974🔝
 
Paris 2024 Olympic Geographies - an activity
Paris 2024 Olympic Geographies - an activityParis 2024 Olympic Geographies - an activity
Paris 2024 Olympic Geographies - an activity
 
Código Creativo y Arte de Software | Unidad 1
Código Creativo y Arte de Software | Unidad 1Código Creativo y Arte de Software | Unidad 1
Código Creativo y Arte de Software | Unidad 1
 
Model Call Girl in Bikash Puri Delhi reach out to us at 🔝9953056974🔝
Model Call Girl in Bikash Puri  Delhi reach out to us at 🔝9953056974🔝Model Call Girl in Bikash Puri  Delhi reach out to us at 🔝9953056974🔝
Model Call Girl in Bikash Puri Delhi reach out to us at 🔝9953056974🔝
 
The Most Excellent Way | 1 Corinthians 13
The Most Excellent Way | 1 Corinthians 13The Most Excellent Way | 1 Corinthians 13
The Most Excellent Way | 1 Corinthians 13
 
Software Engineering Methodologies (overview)
Software Engineering Methodologies (overview)Software Engineering Methodologies (overview)
Software Engineering Methodologies (overview)
 
Presentation by Andreas Schleicher Tackling the School Absenteeism Crisis 30 ...
Presentation by Andreas Schleicher Tackling the School Absenteeism Crisis 30 ...Presentation by Andreas Schleicher Tackling the School Absenteeism Crisis 30 ...
Presentation by Andreas Schleicher Tackling the School Absenteeism Crisis 30 ...
 
Concept of Vouching. B.Com(Hons) /B.Compdf
Concept of Vouching. B.Com(Hons) /B.CompdfConcept of Vouching. B.Com(Hons) /B.Compdf
Concept of Vouching. B.Com(Hons) /B.Compdf
 

1

  • 1. September 2, 2014 Chapter 2: Roundoff Errors Uri M. Ascher and Chen Greif Department of Computer Science The University of British Columbia {ascher,greif}@cs.ubc.ca Slides for the book A First Course in Numerical Methods (published by SIAM, 2011) http://bookstore.siam.org/cs07/
  • 2. Goals and motivation Goals of this chapter • To describe how numbers are stored in a floating point system; • to understand how standard floating point systems are designed and implemented; • to get a feeling for the almost random nature of rounding error; • to identify different sources of roundoff error growth and explain how to dampen their cumulative effect. Uri Ascher & Chen Greif (UBC Computer Science) A First Course in Numerical Methods September 2, 2014 1 / 21
  • 3. Goals and motivation Roundoff errors • Roundoff error is generally inevitable in numerical algorithms involving real numbers. • People often like to pretend they work with exact real numbers, ignoring roundoff errors, which may allow concentration on other algorithmic aspects. • However, carelessness may lead to disaster! • This chapter provides an overview of roundoff errors, including floating point number representation, rounding error and arithmetic, the IEEE standard, and roundoff error accumulation. Uri Ascher & Chen Greif (UBC Computer Science) A First Course in Numerical Methods September 2, 2014 2 / 21
  • 4. Goals and motivation Outline • Floating point systems • The IEEE standard • Roundoff error accumulation Uri Ascher & Chen Greif (UBC Computer Science) A First Course in Numerical Methods September 2, 2014 3 / 21
  • 5. Floating point systems Real number representation: decimal 8 3 ≃ ( 2 100 + 6 101 + 6 102 + 6 103 ) × 100 = 2.666 × 100 . An instance of the floating point representation fl(x) = ±d0.d1 · · · dt−1 × 10e = ± ( d0 100 + d1 101 + · · · + dt−2 10t−2 + dt−1 10t−1 ) × 10e for t = 4, e = 0. Note that d0 > 0: normalized floating point representation. Uri Ascher & Chen Greif (UBC Computer Science) A First Course in Numerical Methods September 2, 2014 4 / 21
  • 6. Floating point systems Real number representation: binary The decimal system is convenient for humans; but computers prefer binary. • In binary the (normalized) representation of a real number x is x = ±(1.d1d2d3 · · · dt−1dtdt+1 · · · ) × 2e = ±(1 + d1 2 + d2 4 + d3 8 + · · · ) × 2e , with binary digits di = 0 or 1 and exponent e. • Floating point representation: with a fixed number of digits t fl(x) = ±(1. ˜d1 ˜d2 ˜d3 · · · ˜dt−1 ˜dt) × 2e • How to determine digits ˜di? A popular strategy is Rounding: fl(x) = { ± 1.d1d2d3 · · · dt × 2e dt+1 = 0 to nearest even otherwise . Alternatively, Chopping simply sets ˜di = di, i = 1, . . . , t. Uri Ascher & Chen Greif (UBC Computer Science) A First Course in Numerical Methods September 2, 2014 5 / 21
  • 7. Floating point systems General floating point system defined by (β, t, L, U), where: β: base of the number system (for binary, β = 2; for decimal, β = 10); t: precision (number of digits); L: lower bound on exponent e; U: upper bound on exponent e. For each x ∈ R corresponds a normalized floating point representation fl(x) = ± ( d0 β0 + d1 β1 + · · · + dt−1 βt−1 ) × βe , where 0 ≤ di ≤ β − 1, d0 > 0, and L ≤ e ≤ U. Uri Ascher & Chen Greif (UBC Computer Science) A First Course in Numerical Methods September 2, 2014 6 / 21
  • 8. Floating point systems Example (β, t, L, U) = (10, 4, −2, 1). Largest number is 9.999 × 10U = 99.99 10U+1 = 100 Smallest positive number is 1.000 × 10L = 10L = 0.01 Total different fractions: (β − 1) × βt−1 = 9 × 10 × 10 × 10 = 9, 000 Total different exponents: U − L + 1 = 4 Total different positive numbers: 4 × 9, 000 = 36, 000 Total different numbers in system: 72,001 Uri Ascher & Chen Greif (UBC Computer Science) A First Course in Numerical Methods September 2, 2014 7 / 21
  • 9. Floating point systems Error in floating point number representation For the real number x = ± d0.d1d2d3 · · · dt−1dtdt+1 · · · × βe , • Chopping: fl(x) = ± d0.d1d2d3 · · · dt−1 × βe Then absolute error is clearly bounded by β1−t · βe . • Rounding: fl(x) = { ± d0.d1d2d3 · · · dt−1 × βe dt < β/2 ± ( d0.d1d2d3 · · · dt−1 + β1−t ) × βe dt > β/2 , round to even in case of a tie. Then absolute error is bounded by half the above, 1 2 · β1−t · βe . So relative error is bounded by rounding unit η = 1 2 · β1−t . Uri Ascher & Chen Greif (UBC Computer Science) A First Course in Numerical Methods September 2, 2014 8 / 21
  • 10. Floating point systems Floating point arithmetic • Important to use exact rounding: if fl(x) and fl(y) are machine numbers, then fl(fl(x) ± fl(y)) = (fl(x) ± fl(y))(1 + ϵ1), fl(fl(x) × fl(y)) = (fl(x) × fl(y))(1 + ϵ2), fl(fl(x)/fl(y)) = (fl(x)/fl(y))(1 + ϵ3), where |ϵi| ≤ η. • Thus, the relative errors remain small after each such operation. • This is achieved using guard digits (see Example 2.6 in text). Uri Ascher & Chen Greif (UBC Computer Science) A First Course in Numerical Methods September 2, 2014 9 / 21
  • 11. Floating point systems Spacing of floating point numbers Run program Example2 8Figure2 3 −15 −10 −5 0 5 10 15 −1 −0.8 −0.6 −0.4 −0.2 0 0.2 0.4 0.6 0.8 1 Note the uneven distribution, both for large exponents and near 0. Uri Ascher & Chen Greif (UBC Computer Science) A First Course in Numerical Methods September 2, 2014 10 / 21
  • 12. Floating point systems Overflow, underflow, NaN • Overflow: when e > U. (fatal) • Underflow: when e < L. (non-fatal: set to 0 by default) • NaN: Not-a-number. (e.g., 0/0) Uri Ascher & Chen Greif (UBC Computer Science) A First Course in Numerical Methods September 2, 2014 11 / 21
  • 13. Floating point systems Outline • Floating point systems • The IEEE standard • Roundoff error accumulation Uri Ascher & Chen Greif (UBC Computer Science) A First Course in Numerical Methods September 2, 2014 12 / 21
  • 14. The IEEE standard IEEE standard • Used by everyone today. • Binary: use β = 2. • Exact rounding: use guard digits to ensure that relative error in each elementary arithmetic operation is bounded by η. • NaN • Overflow and underflow • Subnormal numbers near 0. • Many other features... Uri Ascher & Chen Greif (UBC Computer Science) A First Course in Numerical Methods September 2, 2014 13 / 21
  • 15. The IEEE standard IEEE standard word Use base β = 2. Recall that with this base, in normalized numbers the first digit is always d0 = 1 and thus it need not be stored. Double precision (64 bit word) s = ± b =11-bit exponent f =52-bit fraction Rounding unit: η = 1 2 · 2−52 ≈ 1.1 × 10−16 Can have also single precision (32 bit word). Then t = 23 and η = 2−24 ≈ 6.0 × 10−8 . Uri Ascher & Chen Greif (UBC Computer Science) A First Course in Numerical Methods September 2, 2014 14 / 21
  • 16. The IEEE standard Outline • Floating point systems • The IEEE standard • Roundoff error accumulation Uri Ascher & Chen Greif (UBC Computer Science) A First Course in Numerical Methods September 2, 2014 15 / 21
  • 17. Roundoff error accumulation Roundoff error accumulation • In general, if En is error after n elementary operations, cannot avoid linear roundoff error accumulation En ≃ c0nE0. • Will not tolerate an exponential error growth such as En ≃ cn 1 E0 for some constant c1 > 1 – an unstable algorithm. • In some situations an individual error contribution is particularly large and occasionally can be made smaller. Uri Ascher & Chen Greif (UBC Computer Science) A First Course in Numerical Methods September 2, 2014 16 / 21
  • 18. Roundoff error accumulation Cancellation error When two nearby numbers are subtracted, the relative error is large. That is, if x ≃ y, then x − y has a large relative error. This occurs in practice consistently and naturally, as we will see. Function evaluation at nearby arguments: • If g(·) is a smooth function then g(t) and g(t + h) are close for h small. • But rounding errors in g(t) and g(t + h) are unrelated, so they can be of opposing signs! • For numerical differentiation, e.g. f′ (x0) ≈ f(x0 + h) − f(x0) h , 0 < h ≪ 1, if the relative rounding error in the representation is bounded by η then in |g(t + h) − g(t)|/h it is bounded by 2η/h. This (tight) bound is much larger than η when h is small. Uri Ascher & Chen Greif (UBC Computer Science) A First Course in Numerical Methods September 2, 2014 17 / 21
  • 19. Roundoff error accumulation Example Compute y = sinh(x) = 1 2 (ex − e−x ). • Naively computing y at an x near 0 may result in a (meaningless) 0. • Instead use Taylor’s expansion ex = 1 + x + x2 2 + x3 6 + . . . to obtain sinh(x) = x + x3 6 + . . . . • If x is near 0, can use x + x3 6 , or even just x, for an effective approximation to sinh(x). So, a good library function would compute sinh(x) by the regular formula (using exponentials) for |x| not very small, and by taking a term or two of the Taylor expansion for |x| very small. Uri Ascher & Chen Greif (UBC Computer Science) A First Course in Numerical Methods September 2, 2014 18 / 21
  • 20. Roundoff error accumulation Illustration Compute y = √ x + 1 − √ x for x = 100, 000 in a 5-digit decimal arithmetic. (So β = 10, t = 5.) • Naively computing √ x + 1 − √ x results in the value 0. • Instead use the identity ( √ x + 1 − √ x)( √ x + 1 + √ x) ( √ x + 1 + √ x) = 1 √ x + 1 + √ x . • In 5-digit decimal arithmetic calculating the right hand side expression yields 1.5811 × 10−3 : correct in the given accuracy. Uri Ascher & Chen Greif (UBC Computer Science) A First Course in Numerical Methods September 2, 2014 19 / 21
  • 21. Roundoff error accumulation The rough appearance of roundoff errors Run program Example2 2Figure2 2.m 0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1 −8 −6 −4 −2 0 2 4 6 x 10 −8 error in sampling exp(−t)(sin(2π t)+2) in single precision t roundofferror Note how the sign of the floating point representation error at nearby arguments t fluctuates as if randomly: as a function of t it is a “non-smooth” error. Uri Ascher & Chen Greif (UBC Computer Science) A First Course in Numerical Methods September 2, 2014 20 / 21
  • 22. Roundoff error accumulation Avoiding overflow • 2-norm computation: for a vector x = (x1, x2, . . . , xn)T with n components, ∥x∥2 = √ x2 1 + x2 2 + · · · + x2 n. The vector x may have many components: accumulating damage may be fatal. • Suppose a ≫ b and we wish to compute c = √ a2 + b2. Then it may be better to compute c = a √ 1 + (b/a)2. • The norm calculation can be scaled in a similar manner. Uri Ascher & Chen Greif (UBC Computer Science) A First Course in Numerical Methods September 2, 2014 21 / 21