Use hidden advanced arguments for user-friendly functions

2014-12-11

As a user

Imagine that you are starting to learn how to use a specific R package, lets call it foo. You will look at the vignette (if there is one), use help(package = foo), or look at the reference manual (for example, devtools' ref man). Eventually, you will open the help page for the function(s) you are interested in using.

?function_I_want_to_use

In many packages, there is a main use case that is addressed by the package. A common strategy is to export a main function. That function will likely have a long list of arguments. So as a new user, you are suddenly exposed to a complicated help page and you will want to figure out which arguments you need to use.

As a developer

From the developer's side, you want to give users control over several details. Each detail you want the user to control involves one more argument in your function. Sooner rather than later, you will have a long list of arguments. This increases the learning curve for new users of your package, and can potentially scare them away. That is contradictory of another goal you have as a developer: you want to get people to use your package.

Lets say that you are developing the function use_me(). If the details you want the users to control are actually arguments of other functions used inside use_me(), then you can simplify your function by using the ... argument. This argument is very well explained at The three-dots construct in R (Burns, 2013). It is very useful and can greatly simplify your life as a developer. Plus, it reduces the length of your help pages, thus making your package more user friendly.

center

However, if some of the details in use_me() are not arguments to other functions, then the common strategy is to write two functions. One is a low level function with arguments for all the details which might or might not export. Then, you write a second function that is a wrapper for the low level function and pre-specifies values for all the details. See the next minimal example:

# Don't export this function
.use_me <- function(arg1, arg2, verbose = TRUE) {
    if(verbose) message(paste(Sys.time(), 'working'))
    pmax(arg1, arg2)
}

#' @export
use_me <- function(arg1, ...) {
    .use_me(arg1, 0, ...)
}

## Lets see it in action
use_me(1:3)
## 2014-12-11 17:03:32 working
## [1] 1 2 3
use_me(-1:1, verbose = FALSE)
## [1] 0 0 1

In this example, the help page for use_me() is fairly short and friendly. You don't expect users to be interested in changing arg2 much. Surely you could make it so the non-exported function .use_me() sets a default value for arg2.

Another strategy is to specify inside use_me() the default values for all the arguments you want to use while keeping the list of visible arguments short. That is, maintain the user friendliness of your functions while also giving them control over all the details. That is what you can do using dots() from dots (Collado-Torres, 2014). dots() is a very simple function that checks if ... has a specific argument, and if absent, it returns a default value. It can be seen in action below:

library('dots')
use_me_dots <- function(arg1, ...) {
    ## Default hidden arguments
    arg2 <- dots(name = 'arg2', value = 0, ...)
    verbose <- dots('verbose', TRUE, ...)
    
    ## Regular code
    if(verbose) message(paste(Sys.time(), 'working'))
    pmax(arg1, arg2)
}
use_me_dots(1:3)
## 2014-12-11 17:03:32 working
## [1] 1 2 3
use_me_dots(-1:1, verbose = FALSE)
## [1] 0 0 1
use_me_dots(-1:1, verbose = FALSE, arg2 = 5)
## [1] 5 5 5

dots is my solution to the problem of keeping functions user friendly while giving them control over all the details. The idea is that experienced users will be able to find what the advanced arguments are. While they could find them from the code itself, I do recommend describing the advanced arguments in a vignette targeted for these users.

Complications

Now, while ... is great, you might run into problems when use_me() calls two functions that have different arguments and that don't have the ... argument. Such a scenario is illustrated below.

status <- function(arg3, status = TRUE) {
    if(status) print(arg3)
    return(invisible(NULL))
}
use_me_again <- function(arg1, ...) {
    res <- .use_me(arg1, 0, ...)
    status(res, ...)
    return(res)
}

## Seems to work
x <- use_me_again(1)
## 2014-12-11 17:03:32 working
## [1] 1
## But nope, it doesn't
use_me_again(1, verbose = FALSE, status = FALSE)
## Error in .use_me(arg1, 0, ...): unused argument (status = FALSE)

This scenario can happen when you are using functions from other packages. It's happened to me in cases where the main function does have a ... argument but uses several internal functions that don't use it.

In such situations, you might want to use formal_call() from dots. It figures out which are the arguments formally used by the function of interest and drops out un-used arguments from ..., thus avoiding this type of problem.

use_me_fixed <- function(arg1, ...) {
    res <- formalCall(.use_me, arg1 = arg1, arg2 = 0, ...)
    formal_call(status, arg3 = res, ...)
    return(res)
}

## Works now!
use_me_fixed(1, verbose = FALSE, status = FALSE)
## [1] 1

For a more complicated example, see the dots complex example in the vignette.

Conclusions

As a developer, it is possible to keep your functions user friendly while giving experienced users the option to control the fine tuning arguments which you don't expect most users will want to tweak. My solution to this problem is implemented in dots (check it's vignette). I'd love to hear what you think about it! I am specially interested on what users think about the idea of hidden advanced arguments (documented in an advanced users vignette).

I might try to get dots into a repository: probably in Bioconductor since most of the dots code was first implemented for derfinder.

PS I just found a similar function to dots(). It's berryFunctions::owa() and you can find its code here.

References

Citations made with knitcitations (Boettiger, 2014).

[1] C. Boettiger. knitcitations: Citations for knitr markdown files. R package version 1.0.4. 2014. URL: https://github.com/cboettig/knitcitations.

[2] P. Burns. The three-dots construct in R - Burns Statistics. http://www.burns-stat.com/the-three-dots-construct-in-r/. 2013. URL: http://www.burns-stat.com/the-three-dots-construct-in-r/.

[3] L. Collado-Torres. dots: Simplifying function calls. R package version 1.0.0. 2014. URL: https://github.com/lcolladotor/dots.

Reproducibility

## Reproducibility info
library('devtools')
options(width = 120)
session_info()
## Session info-----------------------------------------------------------------------------------------------------------
##  setting  value                                             
##  version  R Under development (unstable) (2014-11-01 r66923)
##  system   x86_64, darwin10.8.0                              
##  ui       X11                                               
##  language (EN)                                              
##  collate  en_US.UTF-8                                       
##  tz       America/New_York
## Packages---------------------------------------------------------------------------------------------------------------
##  package       * version  date       source                                 
##  bibtex          0.3.6    2013-07-29 CRAN (R 3.2.0)                         
##  devtools      * 1.6.1    2014-10-07 CRAN (R 3.2.0)                         
##  digest          0.6.4    2013-12-03 CRAN (R 3.2.0)                         
##  dots          * 1.0.0    2014-11-14 Github (lcolladotor/dots@a933540)      
##  evaluate        0.5.5    2014-04-29 CRAN (R 3.2.0)                         
##  formatR         1.0      2014-08-25 CRAN (R 3.2.0)                         
##  httr            0.5      2014-09-02 CRAN (R 3.2.0)                         
##  knitcitations * 1.0.4    2014-11-03 Github (cboettig/knitcitations@508de74)
##  knitr         * 1.7      2014-10-13 CRAN (R 3.2.0)                         
##  lubridate       1.3.3    2013-12-31 CRAN (R 3.2.0)                         
##  memoise         0.2.1    2014-04-22 CRAN (R 3.2.0)                         
##  plyr            1.8.1    2014-02-26 CRAN (R 3.2.0)                         
##  RColorBrewer  * 1.0.5    2011-06-17 CRAN (R 3.2.0)                         
##  Rcpp            0.11.3   2014-09-29 CRAN (R 3.2.0)                         
##  RCurl           1.95.4.3 2014-07-29 CRAN (R 3.2.0)                         
##  RefManageR      0.8.40   2014-10-29 CRAN (R 3.2.0)                         
##  RJSONIO         1.3.0    2014-07-28 CRAN (R 3.2.0)                         
##  rstudioapi      0.1      2014-03-27 CRAN (R 3.2.0)                         
##  stringr         0.6.2    2012-12-06 CRAN (R 3.2.0)                         
##  XML             3.98.1.1 2013-06-20 CRAN (R 3.2.0)

Want more?

Check other @jhubiostat student blogs at Bmore Biostats as well as topics on #rstats.

I wrote dots a month ago and the post itself today during our bi-weekly blog meeting.

An xpd-tion into R plot margins

2014-11-21

This is a guest post by Prasad Patil that answers the question: how to put a shape in the margin of an R plot?

The help page for R's par() function is a somewhat impenetrable list of abbreviations that allow you to manipulate anything and everything in the plotting device. You may have used this function in the past to create an array of plots (using mfrow or mfcol) or to set margins (mar or mai).

Way down toward the end of the list is the often-overlooked xpd parameter. This value specifies where in the plotting device an object can actually be plotted. The default is xpd = FALSE, which means that plotting is clipped, or restricted, to the plotting region. In other words, if your plot has xlim = c(0, 10) and ylim = c(0, 10) and you try to plot the point (-1, -1), it will not appear anywhere in the device.

xpd takes two other values, TRUE and NA, which limit plotting to the figure and device region, respectively. If you're fuzzy on plotting terms, this tutorial presents those topics well.

Plotting outside the plot

If you want to plot outside of the plotting region, I find that setting xpd = NA easiest since it opens up all external space. We also need to make sure that we keep space outside of the plot so that we have room to place our objects. Let's say we want to put an ugly border above and below our plot:

# Set xpd=NA and expand the top and bottom margins
par(xpd = NA, mar = par()$mar + c(2.5, 0, 1, 0))
plot(1:10)
# Note that the rectangle we make here has corner coordinates outside of
# our plotting device
rect(-5, 11, 12, 14, col="red")
# Random dots in our rectangluar region
points(runif(100, -4.2, 12.8), runif(100, 11.2, 13.6), col = "green", pch = 19, cex = 1.2)
# And another rectangle for below
rect(-5, -1.7, 12, -3.5, col="red")
points(runif(100, -4.2, 12.8), runif(100, -3.3, -1.8), col = "green", pch = 19, cex = 1.2)

center

Here we mentally extend the axes of our plot to determine where to put our margin elements. One can imagine a diagonal for the top rectangle running from (-5,11) to (12,14). Neither of these points appear in the plot itself, but we used the established axes to estimate them and plot outside the plotting region.

Images outside the plot

Now let's say we want to add a logo or other external image in the margin of our plot. We will use R's png library to load a PNG image and rasterImage() to plot it:

## If needed: install.packages("png")
library(png)
img <- readPNG("logo.png")
par(xpd = NA, mar=par()$mar + c(3, 0, 0, 0))
plot(1:10)
rasterImage(img, 0.5, -2.5, 10.5, -1)

center

Here we used the png library and the rasterImage() command to read in and plot the "logo.png" file. Based on the previously-known dimensions of the logo, we can choose which points to use as endpoints for the image. Note that this image may appear stretched or contorted depending on the size of your R plot device, and it will not stay consistent if you resize.