An xpd-tion into R plot margins

2014-11-21

This is a guest post by Prasad Patil that answers the question: how to put a shape in the margin of an R plot?

The help page for R's par() function is a somewhat impenetrable list of abbreviations that allow you to manipulate anything and everything in the plotting device. You may have used this function in the past to create an array of plots (using mfrow or mfcol) or to set margins (mar or mai).

Way down toward the end of the list is the often-overlooked xpd parameter. This value specifies where in the plotting device an object can actually be plotted. The default is xpd = FALSE, which means that plotting is clipped, or restricted, to the plotting region. In other words, if your plot has xlim = c(0, 10) and ylim = c(0, 10) and you try to plot the point (-1, -1), it will not appear anywhere in the device.

xpd takes two other values, TRUE and NA, which limit plotting to the figure and device region, respectively. If you're fuzzy on plotting terms, this tutorial presents those topics well.
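
As a rough illustration of the three settings, the sketch below draws the same horizontal line under each one. It is not part of the original example: the three-panel layout, the `seen` vector, and the use of `pdf(NULL)` as a throwaway device are assumptions made for the demonstration. With xpd = FALSE the line should stop at the plot region, with TRUE it should run to the edges of its own panel (the figure region), and with NA it should cross the whole device.

```r
pdf(NULL)                      # throwaway device; remove this line to draw on screen
par(mfrow = c(1, 3))           # three panels side by side
seen <- logical(0)             # records what par("xpd") reports for each setting
for (setting in list(FALSE, TRUE, NA)) {
  par(xpd = setting)
  plot(1:10, xlim = c(0, 10), ylim = c(0, 10), main = paste("xpd =", setting))
  # abline() is clipped according to xpd, so its extent shows the clipping region
  abline(h = 5, col = "red", lwd = 2)
  seen <- c(seen, par("xpd"))
}
invisible(dev.off())
```

Using several panels is what makes TRUE and NA distinguishable: with a single plot, the figure region and the device region usually coincide.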

Plotting outside the plot

If you want to plot outside of the plotting region, I find setting xpd = NA easiest, since it opens up all of the space in the device. We also need to reserve room outside of the plot region, by enlarging the margins, so that there is somewhere to place our objects. Let's say we want to put an ugly border above and below our plot:

# Set xpd=NA and expand the top and bottom margins
par(xpd = NA, mar = par()$mar + c(2.5, 0, 1, 0))
plot(1:10)
# Note that the rectangle we make here has corner coordinates outside of
# our plotting device
rect(-5, 11, 12, 14, col="red")
# Random dots in our rectangular region
points(runif(100, -4.2, 12.8), runif(100, 11.2, 13.6), col = "green", pch = 19, cex = 1.2)
# And another rectangle for below
rect(-5, -1.7, 12, -3.5, col="red")
points(runif(100, -4.2, 12.8), runif(100, -3.3, -1.8), col = "green", pch = 19, cex = 1.2)

Here we mentally extend the axes of our plot to determine where to put our margin elements. One can imagine a diagonal for the top rectangle running from (-5, 11) to (12, 14). Neither of these points appears in the plot itself, but we used the established axes to estimate them and plot outside the plotting region.
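
If estimating by eye feels fragile, one alternative is to ask R for the coordinates. The sketch below is an assumption-laden variation on the example above, not the original approach: it uses grconvertX() and grconvertY() to express the device edges in axis ("user") coordinates, writes the enlarged margins out as explicit numbers (the default margins plus the increments used earlier), and draws on a throwaway `pdf(NULL)` device. The names `usr`, `dev_x`, and `dev_y` are illustrative.

```r
pdf(NULL)  # throwaway device; remove this line to draw on screen
# Default margins c(5.1, 4.1, 4.1, 2.1) plus c(2.5, 0, 1, 0), written out explicitly
par(xpd = NA, mar = c(7.6, 4.1, 5.1, 2.1))
plot(1:10)

# Plot region limits in user coordinates: c(x1, x2, y1, y2)
usr <- par("usr")
# Device edges in user coordinates ("ndc" = normalized device coordinates, 0 to 1)
dev_x <- grconvertX(c(0, 1), from = "ndc", to = "user")
dev_y <- grconvertY(c(0, 1), from = "ndc", to = "user")

# A band that fills the entire top margin, from the plot's top edge to the device edge
rect(dev_x[1], usr[4], dev_x[2], dev_y[2], col = "red")
invisible(dev.off())
```

The computed values are only valid for the current device size, so they need to be recomputed after a resize.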

Images outside the plot

Now let's say we want to add a logo or other external image in the margin of our plot. We will use the png package to load a PNG image and rasterImage() to plot it:

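## If needed: install.packages("png")
library(png)
# Read the image into an array of pixel values
img <- readPNG("logo.png")
# Set xpd = NA and expand the bottom margin to make room for the image
par(xpd = NA, mar = par()$mar + c(3, 0, 0, 0))
plot(1:10)
# Arguments are: image, xleft, ybottom, xright, ytop (in axis coordinates)
rasterImage(img, 0.5, -2.5, 10.5, -1)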

Here we used readPNG() from the png package to read in the "logo.png" file and rasterImage() to plot it. Because we know the dimensions of the logo in advance, we can choose corner coordinates that suit it. Note that the image may appear stretched or distorted depending on the size of your R plot device, and its proportions will change if you resize the device.
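
One way to reduce the stretching, at least for the device size in use when the plot is drawn, is to derive the image height from its pixel dimensions. The following is a minimal sketch under several assumptions: a random array stands in for readPNG("logo.png") so that the code runs without the file, the margins are written as explicit numbers, `pdf(NULL)` serves as a throwaway device, and the corner values and variable names are illustrative.

```r
pdf(NULL)  # throwaway device; remove this line to draw on screen
# Stand-in for readPNG("logo.png"): 20 rows x 80 columns x 3 colour channels
img <- array(runif(20 * 80 * 3), dim = c(20, 80, 3))

par(xpd = NA, mar = c(8.1, 4.1, 4.1, 2.1))
plot(1:10)

# Inches per axis unit in each direction, from the plot region's size and limits
usr <- par("usr")
pin <- par("pin")
x_in <- pin[1] / (usr[2] - usr[1])
y_in <- pin[2] / (usr[4] - usr[3])

# Fix the left, right and top edges, then derive the bottom edge so that
# height / width on the page matches rows / columns of the image
left <- 3; right <- 8; top <- -1
width_in <- (right - left) * x_in
height_in <- width_in * nrow(img) / ncol(img)
bottom <- top - height_in / y_in

rasterImage(img, left, bottom, right, top)
invisible(dev.off())
```

This does not make the image resize-proof: the ratio is computed once, so the plot has to be redrawn after the device changes size.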

Where do I start using Bioconductor?

2014-10-16

I was recently asked "Where do I get started with Bioconductor?" and thought the answer would make a good short post.

What is BioC?

Briefly, Bioconductor (Gentleman, Carey, Bates, and others, 2004) is an open source project that hosts a wide range of tools for analyzing biological data with R (R Core Team, 2014). These analysis tools are bundled into packages which are designed to answer specific questions or to provide key infrastructure. If this sounds like something you are interested in, visit bioconductor.org.

Obviously, you need to know the basics about R in order to use Bioconductor.

Getting started

bioconductor.org has a section on its front page titled "Get started with Bioconductor". There you will find links that explain how to install it or to explore the available packages.

You have a use case

If you have a particular use case in mind, I recommend browsing the software packages and searching for some key words. For example, you might be interested in high throughput sequencing of RNAs and if you search RNAseq or RNA-seq you can find a good set of packages to start. Alternatively, use the biocViews tree menu to explore specific categories of packages.

Once you find a set of packages whose descriptions appeal to you, explore their vignettes. These are PDF or HTML documents that explain to new users what the package does. They also show how to tie together the different functions in the package, which is a key piece of information. For example, in the RNA-seq search results you will find the DEXSeq package. DEXSeq (Anders, Reyes, and Huber, 2012) has a vignette called Analyzing RNA-seq data for differential exon usage with the "DEXSeq" package, and you can access the PDF vignette from the package's page.
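
Vignettes can also be listed from within R once a package is installed. The helper below is a small sketch: `list_vignettes` is a hypothetical name, not a Bioconductor function, and it simply wraps the standard vignette() call. It returns an empty result when the package is not installed, so it is safe to try on any package name.

```r
# Hypothetical helper: names of the vignettes shipped with an installed package
list_vignettes <- function(pkg) {
  # Packages that are not installed have no vignettes to list
  if (!requireNamespace(pkg, quietly = TRUE)) {
    return(character(0))
  }
  # vignette(package = ) returns an object whose $results matrix has an "Item" column
  info <- vignette(package = pkg)$results
  unname(info[, "Item"])
}

list_vignettes("DEXSeq")  # empty if DEXSeq is not installed
```

For interactive use, browseVignettes("DEXSeq") opens the same list in a web browser, and vignette("name", package = "DEXSeq") opens a single vignette by name.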

Then it's just a matter of exploring other packages, checking the vignettes and learning as you go.

You don't have a use case

If you don't have a specific use case in mind, it might pay off to start by exploring the Bioconductor workflows. These documents explain how to use different packages to accomplish specific types of analyses. They are a great way to learn what you can do with Bioconductor!

Another option is to look at the previous courses. For example, under the 2008 courses you'll find the course R/Bioconductor Curso Intensivo (Spanish), which I taught back in the day. As much as I would like to promote myself, the best starting point is the most recent BioC20XX course: BioC2014. It has slides showcasing some of the newest packages and tutorials on how to use them.

An alternative is to look at some of the Bioconductor publications, which include books about Bioconductor and research papers describing some of the packages.

Once you find a set of packages that catch your eye, go look at their vignettes, just as I explained in the "You have a use case" section.

Help tips

It's not a matter of whether you will need help learning how to use Bioconductor. It's just a matter of when. So don't feel bad about having to ask for help!!

The very first place to look is the Help section at the bottom of bioconductor.org. For example, you can find YouTube videos listed under the community section. There you can also find links to other blog posts explaining how to use Bioconductor. Take a peek at the other sections under Help before using the Bioconductor support site: it's where you can ask very specific questions and interact with the maintainers of the packages you are using.

Finally, if you are interested in new developments, then check the latest newsletter, for example the October 2014 one.

Good luck using Bioconductor!

References

Citations made with knitcitations (Boettiger, 2014).

[1] S. Anders, A. Reyes and W. Huber. “Detecting differential usage of exons from RNA-seq data”. In: Genome Research 22 (2012), pp. 2008-2017. DOI: 10.1101/gr.133744.111.

[2] C. Boettiger. knitcitations: Citations for knitr markdown files. R package version 1.0.2. 2014. URL: https://github.com/cboettig/knitcitations.

[3] R. C. Gentleman, V. J. Carey, D. M. Bates and others. “Bioconductor: Open software development for computational biology and bioinformatics”. In: Genome Biology 5 (2004), p. R80. URL: http://genomebiology.com/2004/5/10/R80.

[4] R Core Team. R: A Language and Environment for Statistical Computing. R Foundation for Statistical Computing. Vienna, Austria, 2014. URL: http://www.R-project.org/.

Want more?

Check other @jhubiostat student blogs at Bmore Biostats as well as topics on #rstats.