Debugging (in R)

Michael Jones

29 November 2022

Topics

Code someone else has written

Code you write

Can you work differently to avoid errors in the first place?

Other People’s Code

  • Scripts from your colleagues
  • Code you wrote a year ago
  • A package from CRAN

Code not in your mental RAM

Have you tried turning it off and on?

  • Restart your session: Ctrl + Shift + F10 (in RStudio)
  • Work reproducibly (scripts, {targets}, etc.) so that restarting carries no emotional cost

Read the Error

my_func <- fucntion(x) {
  x + 1
}
Error: <text>:1:24: unexpected '{'
1: my_func <- fucntion(x) {
                           ^

Read the Error

data <- iris

sum(df$Petal.Length)
Error in df$Petal.Length: object of type 'closure' is not subsettable
data <- iris

sum(data$Petal.Length)
[1] 563.7
  1. Functions are also called “closures”
  2. There’s a function df()
  3. $ subsets data
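
The three points above can be checked at the console. A minimal sketch, assuming a fresh session in which nothing named `df` has been defined, so the name resolves to the F-distribution density function from {stats}:

```r
# With no user-defined `df`, the name resolves to a function in {stats}
is.function(df)                    # TRUE
typeof(df)                         # "closure"
environmentName(environment(df))   # "stats"

# Applying `$` to a function is what triggers the error;
# capture the message instead of crashing the script
msg <- tryCatch(df$Petal.Length, error = function(e) conditionMessage(e))
msg
```

Choosing a name that does not shadow an existing function (`data` is only slightly better, since `data()` is also a function) makes this class of error easier to spot.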

Read the Error

Tidyverse functions tend to be clear:

iris %>% select(test)
Error in `select()`:
! Can't subset columns that don't exist.
✖ Column `test` doesn't exist.


iris$test
NULL
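
The contrast with base R can be seen without any packages. A small sketch using only the built-in `iris` data, showing which base accessors stay silent on a missing column and which complain:

```r
data <- iris

# `$` and `[[` return NULL for a missing column: no error, no warning
is.null(data$test)        # TRUE
is.null(data[["test"]])   # TRUE

# Matrix-style indexing does raise an error; capture the message to inspect it
msg <- tryCatch(data[, "test"], error = function(e) conditionMessage(e))
msg

# A defensive check you could run before relying on a column
stopifnot("Sepal.Length" %in% names(data))
```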

But {tidyverse} can also be opaque

iris %>% pull(test)
Error:
! object 'test' not found
rlang::last_error()
<error/rlang_error>
Error:
! object 'test' not found
---
Backtrace:
  1. iris %>% pull(test)
  9. base::.handleSimpleError(...)
 10. rlang (local) h(simpleError(msg, call))
 11. handlers[[1L]](cnd)
Run `rlang::last_trace()` to see the full context.

Make a Reproducible Example

  • Distil the issue down to the smallest possible implementation
  • Use a clean environment
  • Explain your steps
  • Make it so someone else can run your code and get the same error
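
As an illustration of the checklist above, here is a sketch of a minimal script for a "wrong answer, no error" problem. It is a made-up example rather than one from the talk; it relies only on a built-in dataset so that anyone can run it in a clean session.

```r
# Problem: summing a column gives 0 instead of an error or a real total

small <- head(iris, 3)   # smallest data that still shows the issue
names(small)             # state which columns actually exist

# The surprising step: `test` is not a column, `$` returns NULL,
# and sum(NULL) is 0
total <- sum(small$test)
total                    # 0

# Record the R version so helpers can match your setup
R.version.string
```

A short script like this is the kind of input that {reprex} is designed to run and render for sharing.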

{reprex}

  • There’s an R package for that!
  • https://reprex.tidyverse.org/
  • Outputs HTML, Markdown, etc., ready to share with people who can help
  • Runs in a clean R process
  • Help someone help you

Reprex Demo

Making the Reprex helps to define the problem

Get help from someone else

  • Other people will have experienced the same issue (probably)
  • Copy the error into Google or Stack Overflow

Go deeper

  • You can view R function implementations at the console
  • Browse the source on GitHub
dplyr::select
function (.data, ...) 
{
    UseMethod("select")
}
<bytecode: 0x557bd7e750f8>
<environment: namespace:dplyr>
dplyr:::select.data.frame
function (.data, ...) 
{
    error_call <- dplyr_error_call()
    loc <- tidyselect_fix_call(tidyselect::eval_select(expr(c(...)), 
        .data), call = error_call)
    loc <- ensure_group_vars(loc, .data, notify = TRUE)
    dplyr_col_select(.data, loc, names(loc))
}
<bytecode: 0x557bd93ee938>
<environment: namespace:dplyr>
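
Base R also has helpers for finding a method when you do not know which namespace holds it. A sketch using the `summary()` generic from base R, chosen here only because it needs no extra packages:

```r
# List the S3 methods registered for a generic
methods("summary")

# Fetch one specific method by generic and class
summary_df <- getS3method("summary", "data.frame")
is.function(summary_df)       # TRUE

# Search attached packages and loaded namespaces for a name
found <- utils::getAnywhere("summary.data.frame")
found$where
```

Printing `summary_df` at the console then shows the implementation, in the same way as the {dplyr} output above.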

Other People’s Code (summary)

  • Restart your R session
  • Read the Error
  • Reprex
  • Describe the problem
  • Stack Overflow/GitHub Issues

Your own Code

  • You’ve just written
  • You wrote a year ago
  • From your colleague

Code at hand that you can change

Unexpected result but no error

Ensure you’re using code you expect:

  • Made a change to a function recently? Reload that function
  • Make sure you’re using the right package, e.g. calling filter() before loading {dplyr} means you’re using stats::filter()
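
To check which function a name currently points at, base R offers `find()` and `conflicts()`. A sketch, assuming a session where {dplyr} has not been attached:

```r
# Where does the name `filter` come from right now?
find("filter")    # includes "package:stats"; {dplyr} is listed first once attached

# stats::filter() is a time-series filter, not a row filter:
# here it computes a centred 3-point moving average
smoothed <- stats::filter(1:5, rep(1 / 3, 3))
as.numeric(smoothed)   # NA 2 3 4 NA

# Being explicit with `::` removes the ambiguity, e.g.
# dplyr::filter(iris, Species == "setosa")   # requires {dplyr}

# Names that are defined in more than one place on the search path
head(conflicts())
```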

Aside: Raising Errors

Halt execution with stop(); warn without halting with warning()

loud <- function(x) {
  if (x == 4) {
    stop("Don't give me 4")
  }
  x
}

loud(4)
Error in loud(4): Don't give me 4
quiet <- function(x) {
  if (x == 4) {
    warning("Don't give me 4")
  }
  x
}

quiet(4)
Warning in quiet(4): Don't give me 4
[1] 4

Aside: Raising Errors

Or using {cli} for richer, clearer messages.

loud_cli <- function(x) {
  if (x == 4) {
    cli::cli_abort(
      message = c("Error in `loud_cli()`",
                  "x" = "This function can't have certain values",
                  "i" = "You gave it 4, don't do that again")
    )
  }
}

loud_cli(4)
Error in `loud_cli()`:
! Error in `loud_cli()`
✖ This function can't have certain values
ℹ You gave it 4, don't do that again

Demo: debug(), debugonce() and friends

My View

  • Start with debugonce()
  • Then reach for browser()
  • Step through the code
  • Only rarely use recover() and trace() (mostly rely on RStudio GUI)
  • Never use print()
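
A sketch of how these tools are switched on and off. `mean_ratio()` is a hypothetical function invented for this example; the interactive call is guarded so that the script also runs non-interactively.

```r
# Hypothetical function to investigate
mean_ratio <- function(x, y) {
  ratio <- x / y
  mean(ratio)
}

if (interactive()) {
  # Enter the debugger for the next call only, then step with n / c / Q
  debugonce(mean_ratio)
  mean_ratio(c(1, 2), c(2, 0))
}

# debug() keeps the flag set until you remove it with undebug()
debug(mean_ratio)
isdebugged(mean_ratio)   # TRUE
undebug(mean_ratio)
isdebugged(mean_ratio)   # FALSE

# browser() does the same job from inside the code: put a call to it
# on the line where you want execution to pause
```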

Just write code correctly the first time

Control your Inputs - Assertions

Stop early if things are wrong

div <- function(x, y) {
  x / y
}

div(100, 0)
[1] Inf
safe_div <- function(x, y) {
  if (y == 0) {
    stop("Don't divide by 0!")
  }
  x / y
}

safe_div(100, 0)
Error in safe_div(100, 0): Don't divide by 0!
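
The same idea can be written with base R's `stopifnot()`. A sketch, assuming R 4.0.0 or later (where argument names are used as the error messages); `strict_div()` is a hypothetical name:

```r
# Same guard as safe_div(), plus type and length checks
strict_div <- function(x, y) {
  stopifnot(
    "x must be numeric" = is.numeric(x),
    "y must be a single number" = is.numeric(y) && length(y) == 1,
    "y must be non-zero" = y != 0
  )
  x / y
}

strict_div(100, 4)   # 25

# Conditions are checked in order; the first failure supplies the message
msg <- tryCatch(strict_div(100, 0), error = function(e) conditionMessage(e))
msg                  # "y must be non-zero"
```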

Use {checkmate}

library(checkmate)
check_non_zero <- function(x) {
  if(x != 0) {
    TRUE
  } else {
    "Must be non-zero"
  }
}

even_safer_div <- function(x, y) {
  checkmate::assert(
    check_count(x), check_count(y),
    check_number(x), check_number(y),
    check_non_zero(y),
    combine = "and"
  )
  x / y
}

Use {checkmate}

even_safer_div(100, c(4, 3))
Error: Assertion on 'y' failed: Must have length 1.
even_safer_div("test", 10)
Error: Assertion on 'x' failed: Must be of type 'count', not 'character'.
even_safer_div(100, 0)
Error: Assertion on 'y' failed: Must be non-zero.
even_safer_div(100, 10)
[1] 10

tryCatch()

If you know what to do with errors, you can handle them automatically:

correct_my_mistake <- function(x) {
  tryCatch(
    loud(x),
    error = function(e) {
      cat(paste0(e, "Correcting to 100"))
      100
    }
  )
}

correct_my_mistake(10)
[1] 10
correct_my_mistake(4)
Error in loud(x): Don't give me 4
Correcting to 100
[1] 100
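
`tryCatch()` can also handle warnings and run a cleanup step. The sketch below is a hypothetical variant, not the talk's code: it reports through `message()` and extracts the text with `conditionMessage()`. `loud()` is redefined so that the block runs on its own.

```r
# Redefined here so the example is self-contained
loud <- function(x) {
  if (x == 4) stop("Don't give me 4")
  x
}

# Hypothetical wrapper: recover from errors, flag warnings, always clean up
careful <- function(x) {
  tryCatch(
    loud(x),
    error = function(e) {
      message("Recovered from: ", conditionMessage(e))
      100
    },
    warning = function(w) {
      message("Warning seen: ", conditionMessage(w))
      NA
    },
    finally = message("Finished handling ", x)
  )
}

careful(10)   # 10
careful(4)    # 100, after two messages
```

Writing to the message stream rather than using `cat()` lets callers silence the output with `suppressMessages()`.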

In a package

  • Use unit tests
  • Beyond the scope of today; a topic for a future RUG session

Restrict your input surface

  • Use an object system like S3 or S4 and take advantage of R’s internal dispatch mechanism
  • Your function doesn’t have to handle unexpected input if it will only ever be called on a certain class of objects.
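
A minimal S3 sketch of this idea. The class `celsius` and the generic `to_fahrenheit()` are hypothetical names invented for illustration:

```r
# Constructor: validate once, when the object is created
new_celsius <- function(x) {
  stopifnot(is.numeric(x))
  structure(list(value = x), class = "celsius")
}

# Generic plus a single method: only "celsius" objects are accepted
to_fahrenheit <- function(x, ...) UseMethod("to_fahrenheit")
to_fahrenheit.celsius <- function(x, ...) x$value * 9 / 5 + 32

to_fahrenheit(new_celsius(100))    # 212

# Anything else is rejected by dispatch, with no hand-written checks
msg <- tryCatch(to_fahrenheit("hot"), error = function(e) conditionMessage(e))
msg
```

Because there is no default method, the method body never sees input it was not designed for.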

Get Fancy

  • Use a different return type that wraps up whether your function worked or not
  • Popular in pure functional languages, rare in R
  • “Maybe” values that are either
    • Just value: signifying success
    • Nothing: signifying failure
  • Then your functions can be total
  • https://www.mpjon.es/2022/08/03/when-your-code-maybe-works/
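
The "Maybe" idea can be sketched in base R with two small classes. All names below (`just()`, `nothing()`, `maybe_div()`, `with_default()`) are hypothetical and written for this illustration; they are not taken from the linked post or from any package.

```r
# Hypothetical constructors for the two cases
just <- function(value) structure(list(value = value), class = "maybe_just")
nothing <- function() structure(list(), class = "maybe_nothing")
is_just <- function(m) inherits(m, "maybe_just")

# A total division: every input yields a value, never an error
maybe_div <- function(x, y) {
  bad_input <- !is.numeric(x) || !is.numeric(y) ||
    length(y) != 1 || is.na(y) || y == 0
  if (bad_input) {
    return(nothing())
  }
  just(x / y)
}

# Unwrap with a fallback for the failure case
with_default <- function(m, default) if (is_just(m)) m$value else default

with_default(maybe_div(100, 10), default = NA)   # 10
with_default(maybe_div(100, 0), default = NA)    # NA
```

The caller has to decide what failure means at the point of unwrapping, which is where the extra safety comes from.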

Don’t write errors in the first place

  • Think about common errors and head them off
  • Handle them if you can with tryCatch
  • Consider designing your programs such that errors are minimised
  • There are risks in being too fancy though

Summary

  • Errors are always unexpected
  • debug() is a very powerful tool
  • If writing packages, think about how people might misuse your code as well as use it.