Debugging (in R)

Michael Jones

29 November 2022

Topics

Code someone else has written

Code you write

Can you work differently to avoid errors in the first place?

Other People’s Code

Scripts from your colleagues
Code you wrote a year ago
A package from CRAN

Code not in your mental RAM

Have you tried turning it off and on?

Restart your R session (in RStudio: Ctrl + Shift + F10)
Work reproducibly (scripts, {targets}, etc.) so that restarting costs you nothing

Read the Error

my_func <- fucntion(x) {
  x + 1
}

Error: <text>:1:24: unexpected '{'
1: my_func <- fucntion(x) {
                           ^

Read the Error

data <- iris

sum(df$Petal.Length)

Error in df$Petal.Length: object of type 'closure' is not subsettable

data <- iris

sum(data$Petal.Length)

[1] 563.7

Functions are also called “closures”
There’s a function df() in {stats}, and no object called df was ever created
$ subsets data, and a function can’t be subsetted
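The three points above can be checked at the console. This is a minimal sketch, assuming a session in which nothing of your own has been assigned to `df`, so the name resolves to the F distribution density function in {stats}. The variable name `msg` is only for illustration.

```r
# With no object of our own called `df`, R finds the function stats::df().
typeof(stats::df)                        # functions are stored as closures
environmentName(environment(stats::df))  # the namespace the function lives in

# `$` cannot subset a function, so the same kind of error appears.
# tryCatch() captures the message instead of stopping the script.
msg <- tryCatch(
  stats::df$Petal.Length,
  error = function(e) conditionMessage(e)
)
msg
```

Choosing object names that do not shadow existing functions (`df`, `data`, `c`, `t`) tends to make this class of error easier to spot.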

Read the Error

Tidyverse functions tend to be clear:

iris %>% select(test)

Error in `select()`:
! Can't subset columns that don't exist.
✖ Column `test` doesn't exist.

iris$test

NULL

But {tidyverse} can also be opaque

iris %>% pull(test)

Error:
! object 'test' not found

rlang::last_error()

<error/rlang_error>
Error:
! object 'test' not found
---
Backtrace:
  1. iris %>% pull(test)
  9. base::.handleSimpleError(...)
 10. rlang (local) h(simpleError(msg, call))
 11. handlers[[1L]](cnd)
Run `rlang::last_trace()` to see the full context.

Make a Reproducible Example

Distil the issue down to the smallest possible implementation
Use a clean environment
Explain your steps
Make it so someone else can run your code and get the same error

{reprex}

There’s an R package for that!
https://reprex.tidyverse.org/
Outputs html, markdown, etc ready to share with people who can help
Runs in a clean R process
Help someone help you
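As a rough sketch of how this might look in practice, the failing snippet from the earlier slide can be passed to `reprex::reprex()` as a braced expression. The guard variable `can_reprex` is an illustrative name, and the guard is an assumption of this example: rendering needs {reprex} installed and is mainly useful with a person at the console.

```r
# Only run when {reprex} is installed and the session is interactive,
# because reprex() renders the code in a separate, clean R process.
can_reprex <- interactive() && requireNamespace("reprex", quietly = TRUE)

if (can_reprex) {
  reprex::reprex({
    data <- iris
    sum(df$Petal.Length)  # the failing line we want help with
  }, venue = "gh")        # "gh" formats the result as GitHub-flavoured markdown
}
```

The rendered output contains both the code and the error it produces, which is what a helper needs to see.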

Reprex Demo

Making the Reprex helps to define the problem

Get help from someone else

Other people will have experienced the same issue (probably)
Paste the error message into Google or Stack Overflow

Go deeper

You can view R function implementations at the console
Browse the source on GitHub

dplyr::select

function (.data, ...) 
{
    UseMethod("select")
}
<bytecode: 0x557bd7e750f8>
<environment: namespace:dplyr>

dplyr:::select.data.frame

function (.data, ...) 
{
    error_call <- dplyr_error_call()
    loc <- tidyselect_fix_call(tidyselect::eval_select(expr(c(...)), 
        .data), call = error_call)
    loc <- ensure_group_vars(loc, .data, notify = TRUE)
    dplyr_col_select(.data, loc, names(loc))
}
<bytecode: 0x557bd93ee938>
<environment: namespace:dplyr>
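The same generic-then-method lookup works without knowing the method name in advance. The sketch below uses only base R and {utils}, with `summary()` standing in for `select()` so that it runs without {dplyr}. The names `available` and `method` are illustrative.

```r
# Printing a generic only shows the dispatch call.
body(summary)

# List the methods registered for the generic ...
available <- utils::methods("summary")
head(as.character(available))

# ... then fetch one by generic and class, even if it is not exported.
method <- utils::getS3method("summary", "data.frame")
head(deparse(method), 5)
```

`getS3method()` avoids the need for `:::` when the method is hidden inside a package namespace.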

Other People’s Code (summary)

Restart your R Environment
Read the Error
Reprex
Describe the problem
Stack Overflow/GitHub Issues

Your own Code

You’ve just written
You wrote a year ago
From your colleague

Code at hand that you can change

Unexpected result but no error

Ensure you’re using code you expect:

Made a change to a function recently? Reload that function
Make sure you’re using the right package. e.g. using filter() before loading {dplyr} means you’re using stats::filter()
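One way to check which version of a function you are about to call is to ask R where it was defined. A small sketch, in which `defined_in()` is a hypothetical helper written for this example rather than an existing function:

```r
# Hypothetical helper: name the environment a function was defined in.
defined_in <- function(fn) environmentName(environment(fn))

defined_in(stats::filter)  # "stats"

# Every attached package that provides the name, in search-path order.
utils::find("filter")

# Names defined in more than one place on the search path.
head(conflicts())
```

Calling `defined_in(filter)` before and after `library(dplyr)` should show the answer change, and writing `dplyr::filter()` explicitly sidesteps the question.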

Aside: Raising Errors

Halt execution with stop(); warn without halting with warning()

loud <- function(x) {
  if (x == 4) {
    stop("Don't give me 4")
  }
  x
}

loud(4)

Error in loud(4): Don't give me 4

quiet <- function(x) {
  if (x == 4) {
    warning("Don't give me 4")
  }
  x
}

quiet(4)

Warning in quiet(4): Don't give me 4

[1] 4

Aside: Raising Errors

Or use {cli} for richer, clearer messages.

loud_cli <- function(x) {
  if (x == 4) {
    cli::cli_abort(
      message = c("Error in `loud_cli()`",
                  "x" = "This function can't have certain values",
                  "i" = "You gave it 4, don't do that again")
    )
  }
}

loud_cli(4)

Error in `loud_cli()`:
! Error in `loud_cli()`
✖ This function can't have certain values
ℹ You gave it 4, don't do that again

Print Debugging

my_func <- function(x) {
  print("got to here")
  y <- x * 100
  print("then got to here")
  z <- x + 50
  print("almost there")
  z * 1000
}

my_func(100)

[1] "got to here"
[1] "then got to here"
[1] "almost there"

[1] 150000

Print Debugging

my_func2 <- function(x) {
  print(paste("x:", x))
  y <- x * 100
  print(paste("y:", y))
  z <- x + 50
  print(paste("z:", z))
  z * 1000
}

my_func2(100)

[1] "x: 100"
[1] "y: 10000"
[1] "z: 150"

[1] 150000

Demo: `debug()`, `debugonce()` and friends

My View

Start with debugonce()
Then reach for browser()
Step through the code
Only rarely use recover() and trace() (mostly rely on RStudio GUI)
Never use print()
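For readers following along without the live demo, here is a minimal sketch of the first steps in that list. `scale_up()` is a made-up function for illustration. The interactive part is guarded so the script also runs non-interactively.

```r
scale_up <- function(x) {
  y <- x * 100
  z <- y + 50
  z * 1000
}

# debug() flags a function so every call opens the browser, until undebug().
debug(scale_up)
flagged <- isdebugged(scale_up)
undebug(scale_up)

# debugonce() flags it for the next call only, so there is nothing to undo.
if (interactive()) {
  debugonce(scale_up)
  scale_up(100)  # at the Browse prompt: n = next line, c = continue, Q = quit
}
```

Placing a `browser()` call inside the function body gives the same prompt at a chosen line rather than at the start of the function.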

Just write code correctly the first time

Control your Inputs - Assertions

Stop early if things are wrong

div <- function(x, y) {
  x / y
}

div(100, 0)

[1] Inf

safe_div <- function(x, y) {
  if (y == 0) {
    stop("Don't divide by 0!")
  }
  x / y
}

safe_div(100, 0)

Error in safe_div(100, 0): Don't divide by 0!
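Base R's `stopifnot()` can express several such checks at once. The sketch below assumes R 4.0 or later, where named arguments to `stopifnot()` are used as the error messages. `strict_div()` and `outcome` are illustrative names.

```r
strict_div <- function(x, y) {
  # Checks run in order and stop at the first one that fails.
  stopifnot(
    "x must be numeric" = is.numeric(x),
    "y must be a single number" = is.numeric(y) && length(y) == 1,
    "y must be non-zero" = all(y != 0)
  )
  x / y
}

strict_div(100, 4)

# Capture the message rather than halting the script.
outcome <- tryCatch(
  strict_div(100, 0),
  error = function(e) conditionMessage(e)
)
outcome
```

Checking the length of `y` before comparing it to zero matters, because `if (y == 0)` on a longer vector fails in recent versions of R with a message that says nothing about division.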

Use {checkmate}

library(checkmate)
check_non_zero <- function(x) {
  if(x != 0) {
    TRUE
  } else {
    "Must be non-zero"
  }
}

even_safer_div <- function(x, y) {
  checkmate::assert(
    check_count(x), check_count(y),
    check_number(x), check_number(y),
    check_non_zero(y),
    combine = "and"
  )
  x / y
}

Use {checkmate}

even_safer_div(100, c(4, 3))

Error: Assertion on 'y' failed: Must have length 1.

even_safer_div("test", 10)

Error: Assertion on 'x' failed: Must be of type 'count', not 'character'.

even_safer_div(100, 0)

Error: Assertion on 'y' failed: Must be non-zero.

even_safer_div(100, 10)

[1] 10

try/catch

If you know what to do with errors, you can handle them automatically:

correct_my_mistake <- function(x) {
  tryCatch(
    loud(x),
    error = function(e) {
      cat(paste0(e, "Correcting to 100"))
      100
    }
  )
}

correct_my_mistake(10)

[1] 10

correct_my_mistake(4)

Error in loud(x): Don't give me 4
Correcting to 100

[1] 100
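`tryCatch()` can handle warnings in the same way, and its `finally` argument runs whether or not anything went wrong. A small sketch with a hypothetical `parse_number()` helper, relying on the warning that `as.numeric()` raises when text cannot be converted:

```r
parse_number <- function(text) {
  tryCatch(
    as.numeric(text),
    # as.numeric() warns on unparseable text; return a typed NA instead.
    warning = function(w) {
      message("Could not parse input: ", conditionMessage(w))
      NA_real_
    },
    # Runs on success and on failure alike.
    finally = message("Finished parsing")
  )
}

parse_number("3.5")
parse_number("abc")
```

Using `conditionMessage()` extracts just the text of the condition, which is often tidier than printing the whole condition object.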

In a package

Use Unit tests
Beyond the scope of today - future RUG session

Restrict your input surface

Use an object system like S3 or S4 and take advantage of R’s internal dispatch mechanism
Your function doesn’t have to handle unexpected input if it will only ever be called on a certain class of objects.
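A minimal S3 sketch of this idea follows. The class `ratio`, the constructor `new_ratio()` and the generic `value()` are all invented for this example. Validation happens once in the constructor, and the method can then assume well-formed input.

```r
# Constructor: the only place that has to validate the raw inputs.
new_ratio <- function(num, den) {
  stopifnot(is.numeric(num), is.numeric(den), length(den) == 1, den != 0)
  structure(list(num = num, den = den), class = "ratio")
}

# Generic plus a single method: only `ratio` objects are accepted.
value <- function(x, ...) UseMethod("value")
value.ratio <- function(x, ...) x$num / x$den

value(new_ratio(100, 4))

# No default method exists, so anything else is rejected by dispatch itself.
rejected <- tryCatch(
  value(100),
  error = function(e) "no method for plain numbers"
)
rejected
```

Leaving out a `value.default()` method is a deliberate choice here: dispatch fails loudly instead of the function guessing what to do.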

Get Fancy

Use a different return type that wraps up whether your function worked or not
Popular in pure functional languages, rare in R
“Maybe” values that are either
- Just value: signifying success
- Nothing: signifying failure
Then your functions can be total
https://www.mpjon.es/2022/08/03/when-your-code-maybe-works/
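To make the idea concrete, here is a hand-rolled sketch in base R. It is not the API of any package described in the linked post. The names `just()`, `nothing()`, `maybe_div()` and `with_default()` are assumptions made for this illustration.

```r
# Two constructors: a wrapped success value, or an empty failure marker.
just <- function(value) {
  structure(list(value = value), class = c("just", "maybe"))
}
nothing <- function() {
  structure(list(), class = c("nothing", "maybe"))
}

# Returns a maybe value for any pair of single numbers, instead of erroring.
maybe_div <- function(x, y) {
  if (y == 0) nothing() else just(x / y)
}

# Unwrap at the edge of the program, supplying a fallback for failures.
with_default <- function(m, default) {
  if (inherits(m, "just")) m$value else default
}

with_default(maybe_div(10, 2), NA_real_)
with_default(maybe_div(1, 0), NA_real_)
```

The caller is forced to decide what a failure means at the point of unwrapping, which is the trade-off against simply raising an error.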

Don’t write errors in the first place

Think about common errors and head them off
Handle them if you can with tryCatch
Consider designing your programs such that errors are minimised
There are risks in being too fancy though

Summary

Errors are always unexpected
debug() is a very powerful tool
If writing packages, think about how people can abuse your code as well as use your code.
