4 R/Bioconductor Data Science bootcamps

To share a common set of external references and R knowledge across our Data Science guidance sessions and our own work, we run a series of R and Bioconductor bootcamps. Most of the material these bootcamps are based on is freely available or can be purchased for a small fee (see footnote 1). Thus, we are confident that our LIBD and JHU collaborators, as well as anyone else in the world, will be able to access the material.

You can find the videos of all our bootcamps on YouTube at lcolladotor/playlists. They are also described in this Twitter thread.

4.1 Overview videos

Here are some highlighted overview videos that you might find useful as you get started.

4.1.1 R and RStudio

If you are new to R and RStudio, you might like this overview video and companion notes.

4.1.2 Bioconductor

Similarly, this overview video and companion notes on Bioconductor might be useful too.

4.1.3 ggplot2 R graphics

This video and companion notes on ggplot2 R graphics could also be useful.

4.1.4 Submitting jobs at JHPCE

This video on how to use sgejobs for submitting jobs at JHPCE could be useful too.

For a more advanced video on setting up bash loops and nested qsubs, check this second sgejobs video.

If you need to use the GPU queue at JHPCE, then check this video and companion slides.

4.2 LIBD bootcamps

We have organized accessible bootcamps designed with biologists in mind. If you spend most of your time coding, check the Team bootcamps further below. For all of these bootcamps, you should have the latest versions of R and RStudio Desktop installed. You will typically need a computer with at least 8 GB of RAM.

| Session | Time | Prerequisites | Topic |
|---|---|---|---|
| 1 | 2020-10-05 3-5 pm | R + RStudio | Differential expression analysis (LIBD-style) |
| 2 | 2020-10-06 3-5 pm | R + RStudio | Differential expression analysis (LIBD-style) |
| 3 | 2020-10-07 1-3 pm | R + RStudio | Differential expression analysis (LIBD-style) |

4.3 Team bootcamps

The team bootcamps are primarily for our team members and for other members and collaborators with more advanced R/programming experience. However, the material we cover is within reach of everyone as long as you practice using R/Bioconductor from time to time. The concepts covered are useful to all of us who work with R, but we understand that you might not have as much time to learn these materials. If that’s the case, please feel free to sign up for our Data Science guidance sessions and we’ll help you learn these concepts at your own pace.

The first iteration of these bootcamps was run in September 2020 with the following schedule. For all of them, you should have the latest R and RStudio versions installed on your computer and be familiar with the R programming language. For a more structured working environment, we might use JHPCE’s computational resources while running RStudio on our computers and running code through a Linux terminal (see footnote 2). You will probably need to spend time self-learning and practicing some of the material beyond these videos. If you have just started learning R, then these bootcamps will be quite challenging.

| Session | Time | Prerequisites | Topic |
|---|---|---|---|
| 1 | 2020-09-21 3-5 pm | NA | How to be a modern scientist |
| 2 | 2020-09-22 3-5 pm | R + RStudio | What they forgot to teach you about R |
| 3 | 2020-09-23 1-3 pm | R + RStudio | What they forgot to teach you about R |
| 4 | 2020-09-24 3-5 pm | R + RStudio | The Elements of Data Analytic Style + CBDS |
| 5 | 2020-09-28 3-5 pm | R + RStudio | The Elements of Data Analytic Style + CBDS |
| 6 | 2020-09-29 3-5 pm | Be a part of the DSgs-guides team | DSgs-guide training |
| 7 | 2020-09-30 1-3 pm | RStudio + R functions | Building Tidy Tools |
| 8 | 2020-10-01 3-5 pm | RStudio + R functions | Building Tidy Tools |

4.4 Bootcamp source materials

There are tons of resources that are useful for learning about R, Bioconductor, and data science in general. We have selected some of those resources because:

  • we are familiar with them
  • they are freely available or can be purchased for a small fee
  • we have heard good things about them

4.4.1 Main materials

There are more materials out there than those that we’ve had a chance to learn about, and with time, this list will change. Here is our latest list.

We also highly recommend keeping an eye open for any new work from:

4.4.2 Courses

Our colleagues at Hopkins have also created Coursera courses (MOOCs) which you might be interested in, though they might involve other fees. These are:

Rafael Irizarry and his lab have also generated free teaching materials. Michael I Love, a former postdoc with Rafael, also has teaching materials. Between them they have courses on:

  • Introduction to Data Science (at least 8 college-level courses)
  • Data Analysis for the Life Sciences (at least 4 graduate-level courses)
  • Genomics Data Analysis (at least 3 graduate-level courses)

They have lots of YouTube links organized on their old harvardx website, whose layout is based on Kasper Daniel Hansen’s Bioconductor for Genomic Data Science course.

  1. A lot of the resources listed are available through Leanpub, which is a publishing platform for books and courses. Authors get to set a recommended price, but also a minimum price that can be as low as $0, thus making their materials free to use.

  2. Check our LIBD rstats club videos on how to configure your macOS or Windows computer to work with JHPCE.

© 2011-2023. All thoughts and opinions here are my own. The icon was designed by Mauricio Guzmán and is inspired by Huichol culture; it represents my community building interests.
