R debugging
Created and presented by,
Nilesh B. Borade
What is a bug?
• A software bug is an error, flaw, failure, or fault in a computer program or system
that causes it to produce an incorrect or unexpected result, or to behave in
unintended ways.
• Most bugs arise from mistakes and errors made by people in a program's source
code.
• A program that contains a large number of bugs, and/or bugs that seriously
interfere with its functionality, is said to be buggy or defective.
• It is important to note that debugging is a practice which gets considerably easier
as one’s familiarity with the language increases. In some ways it can be more
“art” than “science”.
• For example, knowing where to look in a 500-line program after it has just halted
execution is sometimes just a “feeling” one develops after much previous
suffering.
• When the line you are stepping through calls another function, say g(), take an
“innocent until proven guilty” approach to g(). Do not call
debug(g) yet. Execute that line and see if g() returns the value you expect. If it
does, then you’ve just avoided the time-consuming process of single-stepping
through g(). If g() returns the wrong value, then now is the time to call debug(g).
Indications that something’s not right
• message: A generic notification/diagnostic message produced by the message()
function; execution of the function continues.
• warning: An indication that something is wrong but not necessarily fatal;
execution of the function continues; generated by the warning() function.
• error: An indication that a fatal problem has occurred; execution stops; produced
by the stop() function.
• condition: A generic concept for indicating that something unexpected can occur;
programmers can create their own conditions.
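The four kinds of signal listed above can be sketched in a few lines of R. This is only an illustration: the function notify() and the class name "custom_note" are hypothetical and not part of the slides.

```r
# Hypothetical helper that signals a message, possibly a warning, possibly an error.
notify <- function(x) {
  message("checking input")                 # message: execution continues
  if (is.na(x)) stop("x is missing")        # error: execution stops here
  if (x < 0) warning("x is negative")       # warning: execution continues
  sqrt(abs(x))
}

# Calling handlers can silence messages and warnings; the function still finishes.
res_warn <- withCallingHandlers(
  notify(-4),
  message = function(m) invokeRestart("muffleMessage"),
  warning = function(w) invokeRestart("muffleWarning")
)

# tryCatch() turns the fatal error into an ordinary value.
res_err <- tryCatch(notify(NA), error = function(e) conditionMessage(e))

# A programmer-defined condition: a list with class "condition" plus a custom class.
note <- structure(
  list(message = "a custom condition", call = NULL),
  class = c("custom_note", "condition")
)
res_custom <- tryCatch(signalCondition(note), custom_note = function(c) "handled")
```

Here res_warn holds the computed value despite the warning, res_err holds the error text, and res_custom shows that a handler can be registered for a condition class of your own.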
The Essence of Debugging: The Principle of Confirmation
Fundamental Principles of Debugging
“Beware of bugs in the above code; I have only proved it correct, not tried it.”
—Donald Knuth, pioneer of computer science
The principle of confirmation is the essence of debugging
Fixing a buggy program is a process of confirming, one by one, that the many
things you believe to be true about the code actually are true. When you find
that one of your assumptions is not true, you have found a clue to the location
(if not the exact nature) of a bug.
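One possible way to apply the principle of confirmation in R is to write each belief about the code as an explicit check with stopifnot(). The function centre() below is a hypothetical example, not taken from the slides.

```r
# Hypothetical function: centre a numeric vector on its mean.
centre <- function(x) {
  # Beliefs about the input, confirmed one by one.
  stopifnot(is.numeric(x), length(x) > 0, !anyNA(x))
  out <- x - mean(x)
  # Belief about the result: centred values sum to (almost) zero.
  stopifnot(abs(sum(out)) < 1e-8)
  out
}

centred <- centre(c(2, 4, 6))

# A violated assumption is reported by name, which is a clue to the bug's location.
violated <- tryCatch(centre(c(1, NA, 3)), error = function(e) conditionMessage(e))
```

When an assumption fails, the error message names the expression that was not TRUE, so the false belief is identified directly.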
Introduction
• As with programs written in any other language, functions written in R can
contain unforeseen problems which lead to failure.
• The purpose of the debugging tools is to help the programmer find these
problems quickly and efficiently.
• The R system has two main ways of reporting a problem in executing a function.
• One is a warning while the other is an error.
• The main difference between the two is that warnings do not halt execution of
the function.
• The purpose of the warning is to tell the user “Something unusual happened
during the execution of this function, but the function was nevertheless able to
execute to completion”.
• One example of getting a warning is when you take the log of a negative number:
> log(-1)
[1] NaN
Warning message:
In log(-1) : NaNs produced
• An error is usually a problem that is fatal and results in a complete halt in
execution of the function. That is, because of some problem in the function, the
function simply cannot execute to completion.
> printmessage = function(x) { # named printmessage so that it does not mask base R's message()
+ if(x > 0)
+ print("Hello") # prints "Hello" if the given value is greater than 0
+ else
+ print("Goodbye") # prints "Goodbye" if the given value is not greater than 0
+ }
> printmessage(4)
[1] "Hello"
> printmessage(-4)
[1] "Goodbye"
> printmessage(log(-1))
Error in if (x > 0) print("Hello") else print("Goodbye") :
missing value where TRUE/FALSE needed
In addition: Warning message:
In log(-1) : NaNs produced
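The error above arises because log(-1) is NaN, so the comparison x > 0 yields NA and if() cannot decide which branch to take. A minimal sketch of one way to guard against this is shown below; the name printmessage2 and the wording of the extra message are assumptions made for illustration.

```r
# Hypothetical variant of the function above: test for a missing value first.
printmessage2 <- function(x) {
  if (is.na(x)) {                  # is.na() is TRUE for both NA and NaN
    print("x is a missing value!")
  } else if (x > 0) {
    print("Hello")
  } else {
    print("Goodbye")
  }
  invisible(x)                     # return the input without printing it again
}

# The warning from log(-1) still occurs; it is suppressed here to keep output short.
out <- suppressWarnings(printmessage2(log(-1)))
```

With the check in place the function runs to completion instead of halting, although the warning from log() is still a hint that the input deserves a look.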
R tools for debugging
• traceback()
• debug()
• browser()
• recover()
• cat()
• print()
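cat() and print() appear in the list above because printing intermediate values is often the quickest check. A small sketch, using a hypothetical function scale01() that is not part of the slides:

```r
# Hypothetical function instrumented with cat() and print() for tracing.
scale01 <- function(x) {
  rng <- range(x)
  cat("scale01: range =", rng, "\n")   # one-line trace of an intermediate value
  print(summary(x))                    # fuller view of the input
  (x - rng[1]) / (rng[2] - rng[1])
}

scaled <- scale01(c(5, 10, 15))
```

The trace lines should be removed once the bug has been found; for anything beyond a quick look, the interactive tools in the list give more control.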
traceback()
• When an R function fails, an error message is usually printed to the screen.
• The first thing you might want to do is print the call stack, i.e. the sequence
of function calls which led to the error.
• This can be done using the traceback() function, which prints the list of
functions that were called before the error occurred.
• The output is not very informative if the error occurred in a top-level function.
• Call traceback() immediately after an error to see in which function
the error occurred.
> log(-1)
[1] NaN
Warning message:
In log(-1) : NaNs produced
> traceback() # a warning is not an error, so in a fresh session no call stack is recorded
No traceback available
> f = function(x) {
+ r = x - g(x)
+ r
+ } # first function f(x)
> g = function(y) {
+ r = y * h(y)
+ r
+ } # second function g(y)
> h = function(z) {
+ r = log(z)
+ if (r < 10)
+ r^2
+ else r^3
+ } # third function h(z)
> f(8) # calling function f() with x = 8
[1] -26.59262
> f(-1)
Error in if (r < 10) r^2 else r^3 : missing value where TRUE/FALSE
needed
In addition: Warning message:
In log(z) : NaNs produced
> traceback() # initiating traceback() function
3: h(y) at #2 # this shows that error occurred in function h(y)
2: g(x) at #2
1: f(-1)
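traceback() is meant for interactive use after the fact. As a rough sketch of an alternative (not something the slides cover), the call stack can also be captured programmatically at the moment of the error with withCallingHandlers() and sys.calls(). The functions f(), g() and h() are the ones defined above; the variable names stack, result and called are assumptions.

```r
f <- function(x) x - g(x)
g <- function(y) y * h(y)
h <- function(z) {
  r <- log(z)
  if (r < 10) r^2 else r^3
}

stack <- NULL
result <- tryCatch(
  withCallingHandlers(
    suppressWarnings(f(-1)),
    # Runs while the failing calls are still on the stack, so it can record them.
    error = function(e) stack <<- sys.calls()
  ),
  error = function(e) conditionMessage(e)   # then recover from the error
)

# First line of each recorded call, outermost first.
called <- vapply(as.list(stack), function(cl) deparse(cl)[1], character(1))
```

The vector called also contains the internal calls made by tryCatch() and the handlers, but the chain f(-1), g(x), h(y) appears in it in the same order that traceback() reports.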
debug()
• We used traceback() to figure out where in the call stack an error
occurred.
• However, traceback() doesn't tell you where within the function the
error occurred.
• Calling debug() on a function foo() flags that function for
“debugging”,
• e.g. debug(foo)
• When foo() is next called in your program, execution will pause and you
can step through foo() line by line.
> SS = function(mu, x) {
+ d = x - mu # first, calculate the differences from mu
+ d2 = d^2 # second, square the differences 'd'
+ ss = sum(d2) # third, sum the squared differences 'd2'
+ ss
+ } # function for sum of squares
> set.seed(100) # set the seed so that the results are reproducible
> x = rnorm(100)# random sample from Normal distribution
> SS(1,x)
[1] 202.5615
> debug(SS) # function SS has been flagged for debugging
> SS(1,x) # call the flagged function
debugging in: SS(1, x)
debug at #1: {
d = x - mu
d2 = d^2
ss = sum(d2)
ss
} # first, the body of the function is printed
Browse[2]>
debug at #2: d = x - mu
Browse[2]>
debug at #3: d2 = d^2
Browse[2]>
debug at #4: ss = sum(d2)
Browse[2]>
debug at #5: ss
Browse[2]>
exiting from: SS(1, x)
[1] 202.5615
> undebug(SS) # Remove debugging flag for SS
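It is easy to forget that a function is still flagged. A short sketch of how the flag can be inspected with isdebugged(), and how debugonce() can flag a function for a single call only, follows. The shortened SS() body and the variable names flag_on and flag_off are assumptions made for illustration.

```r
SS <- function(mu, x) {
  d <- x - mu
  sum(d^2)
}

debug(SS)
flag_on <- isdebugged(SS)    # TRUE while SS is flagged
undebug(SS)
flag_off <- isdebugged(SS)   # FALSE once the flag is removed

# debugonce() pauses on the next call only, so no undebug() is needed afterwards.
# Guarded so that the sketch does not wait for input when run as a script.
if (interactive()) {
  debugonce(SS)
  SS(1, c(1, 2, 3))
}
```

Checking isdebugged() before a long run avoids dropping into the browser unexpectedly.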
browser()
• Inserting a call to browser() in a function will pause the execution
of the function at the point where browser() is called.
• This is similar to using debug(), except that you control where execution
gets paused.
• This use of browser() can be helpful if you have a vague idea as
to where a bug may be in your program.
> SS = function(mu, x) {
+ d = x - mu
+ d2 = d^2
+ browser()
+ ss = sum(d2)
+ ss
+ } # function for sum of squares
> SS(1,x)
Called from: SS(1, x)
Browse[1]>
debug at #5: ss = sum(d2)
Browse[2]> ls() # gives the objects in the function
[1] "d" "d2" "mu" "x"
Browse[2]> mean(x) # function for mean
[1] 0.002912563
Browse[2]>
debug at #6: ss
Browse[2]>
[1] 202.5615
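Because browser() is an ordinary function call, it can be placed behind a condition so that execution pauses only when something looks suspicious. The sketch below assumes a hypothetical variant named SS_checked(); the choice of condition (any missing squared difference) is just an example.

```r
# Hypothetical variant of SS() that pauses only on suspicious intermediate values.
SS_checked <- function(mu, x) {
  d <- x - mu
  d2 <- d^2
  # Pause only in an interactive session and only if a squared difference is missing.
  if (interactive() && anyNA(d2)) browser()
  sum(d2)
}

total <- SS_checked(1, c(1, 2, 3))   # clean input: runs straight through
```

With clean input the function behaves exactly like the original; with an NA in x, an interactive session stops just before the sum is taken.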
recover()
• There may be a situation where you want to suspend execution of a
function in one location, but then browse a previous function call to
hunt down the bug.
• recover() can be used as an error handler, set using options(),
e.g. options(error = recover)
• When a function throws an error, execution is halted at the point of
failure, and you can browse the function calls and examine each
environment to find the source of the problem.
> read.csv("nosuchfile")
Error in file(file, "rt") : cannot open the connection
In addition: Warning message:
In file(file, "rt") :
cannot open file 'nosuchfile': No such file or directory
> options(error=recover)
> read.csv("nosuchfile")
Error in file(file, "rt") : cannot open the connection
In addition: Warning message:
In file(file, "rt") :
cannot open file 'nosuchfile': No such file or directory
Enter a frame number, or 0 to exit
1: read.csv("nosuchfile")
2: read.table(file = file, header = header, sep = sep, quote = quote,
dec = de
3: file(file, "rt")
Selection:
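Setting options(error = recover) stays in effect for the rest of the session, so every later error will open the frame menu. A minimal sketch of switching it on and restoring the previous handler afterwards is given below; the variable names old, handler_set and handler_restored are assumptions.

```r
# options() returns the previous values, which makes them easy to restore.
old <- options(error = recover)
handler_set <- identical(getOption("error"), recover)

# ... interactive debugging session would go here ...

options(old)   # put back whatever error handler was in place before
handler_restored <- identical(getOption("error"), old$error)
```

Restoring the saved value, rather than hard-coding a default, keeps any handler that the environment (for example an IDE) had already installed.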
Final thoughts
• There are three main indications of a problem/condition: message,
warning, and error; only an error is fatal.
• The interactive debugging tools traceback(), debug(), browser(), and recover()
can be used to find problematic code in functions.
• The debugging tools should be used as much as necessary to
minimize the time spent debugging and to maximize the time spent,
as John Chambers wrote, “turning ideas into software”.
• Debugging tools are not a substitute for thinking!
Thank You
