Introduction to knitr

The R Markdown (Rmd) format

L. Collado Torres for JHSPH Biostat computing club
http://lcolladotor.github.com/Rmd-intro/

Overview

  • knitr

  • R markdown

  • Using knitr via Rmd

  • Rnw knitr style

knitr

It's a framework for producing reproducible reports

  • Code and text
  • Runs R code and includes the output
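
As a minimal sketch of what "runs R code and includes the output" means, the snippet below knits a tiny document held in a character vector. The document text and the `fence` helper variable are illustrative assumptions, not part of the original slides. `knit(text = ...)` returns the knitted Markdown as a string.

```r
library(knitr)

# Build the three-backtick chunk delimiter programmatically (illustrative helper)
fence <- strrep("`", 3)

# A tiny R Markdown document: a heading, a sentence, and one R chunk
src <- c(
  "# Demo",
  "",
  "Some text, followed by a chunk:",
  "",
  paste0(fence, "{r}"),
  "1 + 1",
  fence
)

# knit() runs the chunk and weaves its output into the Markdown result
out <- knit(text = src, quiet = TRUE)
cat(out)
```

The printed result contains the original text, the echoed code, and the value R computed for it.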

How is it better than Sweave()?

  • Prettier out of the box
    • Code reformatting: tidy
    • Code highlighting
    • Simple copy-paste enabled
  • Better approach to dealing with plots
  • Under active development by @xieyihui

The main difference: Markdown

Markdown (md) advantages

  • It's simple: little overhead
  • The main output is HTML, which opens up options beyond PDF

knitr main outputs

More at the knitr showcase

Active development around knitr

slidify

  • How this presentation was made and published on the web.

knitcitations

  • Useful for creating HTML citations: example

RStudio

  • Everything works out of the box
  • You can use knitr instead of Sweave() (change the option)
  • Easily publish your reports via RPubs

Blogging

R Markdown

Markdown's syntax is simple

  • That's also partly why LaTeX (Rnw) is still superior for PDF output if you want more control.

The only major change in Rmd is the R code chunks

RStudio (desktop) has a great syntax description. They have another great page online. Check it out!

This blog post goes over the basics pretty well.

Rmd basics

Your first Rmd file

  1. In RStudio: File, New, R Markdown
  2. Click on Knit HTML

Your 2nd file

  1. In RStudio: File, New, R Markdown
  2. Click on MD (Markdown quick reference)
  3. Edit the title, text, code, add chunks as you wish
  4. Click on Knit HTML

Rmd R chunks

Basic chunk

```{r}
hola()
```

You can add chunk labels after r
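
A small sketch of a labeled chunk with one option, knitted from a character vector. The label `greeting`, the body of `hola()`, and the `fence` helper are hypothetical names chosen for illustration. The chunk header reads `{r greeting, echo=FALSE}`: the label comes right after `r`, followed by comma-separated options.

```r
library(knitr)

# Three-backtick delimiter built programmatically (illustrative helper)
fence <- strrep("`", 3)

src <- c(
  # Chunk header: label "greeting", then the option echo=FALSE
  paste0(fence, "{r greeting, echo=FALSE}"),
  "hola <- function() 'hola mundo'",  # hypothetical definition of hola()
  "hola()",
  fence
)

# With echo=FALSE the source code is hidden; only the result is kept
out <- knit(text = src, quiet = TRUE)
cat(out)
```

Labels are worth adding because knitr uses them to name figure files and cache entries.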

Options

http://yihui.name/knitr/options

  • Figure
    • fig.width, fig.height: R control
    • out.width, out.height: output control
    • fig.keep: useful for >=2 plots in 1 chunk
    • fig.cap: caption in Rnw only
  • Cache
    • cache: whether to cache a chunk
    • dependson: which other chunks this chunk depends on
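
A hedged sketch of `cache` and `dependson` in action. The chunk labels `draw` and `double`, the temporary cache directory, and the `fence` helper are assumptions made for this example. The document is knitted twice. Because the first chunk is cached, the second run should reuse the stored random draw instead of generating a new one.

```r
library(knitr)

# Three-backtick delimiter built programmatically (illustrative helper)
fence <- strrep("`", 3)

# Assumption: keep cache files in a temporary directory, not the working directory
cache_dir <- paste0(tempfile("knitr-cache-"), "/")
opts_chunk$set(cache = TRUE, cache.path = cache_dir)

src <- c(
  paste0(fence, "{r draw}"),
  "x <- runif(1)",
  "x",
  fence,
  # This chunk declares that it depends on the chunk labeled "draw"
  paste0(fence, "{r double, dependson='draw'}"),
  "x * 2",
  fence
)

first  <- knit(text = src, quiet = TRUE)
second <- knit(text = src, quiet = TRUE)  # second run loads results from the cache

same_output <- identical(first, second)
print(same_output)

# Reset chunk options to knitr's defaults so later code is unaffected
opts_chunk$restore()
```

If the code of `draw` changes, its cache is invalidated, and `dependson` tells knitr to rebuild `double` as well.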

Options continued

  • Results
    • results: similar to the same option in Sweave()
    • message, error, warning: whether to print them or not
  • Global options
opts_chunk$set(fig.width = 5, fig.height = 5, cache = TRUE)
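
To check what the global options are after a call like the one above, `opts_chunk$get()` can be queried directly. This is a small sketch; the variable name `current` is illustrative.

```r
library(knitr)

# Set defaults that apply to every chunk knitted afterwards
opts_chunk$set(fig.width = 5, fig.height = 5, cache = TRUE)

# Read the values back as a named list
current <- opts_chunk$get(c("fig.width", "fig.height", "cache"))
str(current)

# Return to knitr's built-in defaults when done experimenting
opts_chunk$restore()
```

Options given in an individual chunk header still override these global defaults for that chunk.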

fig.height example

set.seed(20130404)
x <- rnorm(100)
hist(x, col = "light blue", freq = FALSE)

[Figure: histogram of x (plot of chunk unnamed-chunk-1)]

set.seed(20130404)
x <- rnorm(100)
hist(x, col = "light blue", freq = FALSE)

[Figure: histogram of x (plot of chunk unnamed-chunk-2)]

So, when do I mostly use Rmd?

What about Rnw?

  • Change the setting in RStudio to weave with knitr
    • Preferences, Sweave, Weave Rnw files using knitr
  • Use knitr options instead of the Sweave options
    • \Sexpr{} works for inline code
  • Enjoy your PDF output!
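
A minimal sketch of the Rnw chunk syntax and inline `\Sexpr{}`, knitted from a character vector rather than a full LaTeX file. The chunk label `setup` and the data are illustrative assumptions. A real Rnw file would also contain a LaTeX preamble and would be compiled to PDF afterwards.

```r
library(knitr)

# An Rnw fragment: chunks open with <<label, options>>= and close with @
src <- c(
  "<<setup, echo=FALSE>>=",
  "x <- c(2, 4, 6)",
  "@",
  "The mean of x is \\Sexpr{mean(x)}."  # inline R code inside LaTeX text
)

# knit() detects the Rnw syntax and replaces \Sexpr{} with the evaluated value
out <- knit(text = src, quiet = TRUE)
cat(out)
```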

Quick example through RStudio

Longer template

Commands to make the presentation

library(slidify)
author("Rmd-intro")
## Edit the text
slidify("index.Rmd")
## Created the GitHub repo 'Rmd-intro'

## Initialized my git repository locally

## Added the github repo as a remote

## These steps are described here
## https://github.com/ramnathv/slidify/issues/99
publish("lcolladotor", "Rmd-intro")
## Slides are live at http://lcolladotor.github.com/Rmd-intro/

Thanks!