To share a common set of external references and R knowledge across our Data Science guidance sessions and our own work, we run a series of R and Bioconductor bootcamps. Most of the material these bootcamps are based on is freely available or can be purchased for a small fee [9]. This way, our LIBD and JHU collaborators, as well as anyone else in the world, can access the material.
The videos of all our bootcamps are on YouTube at lcolladotor/playlists, and they are described in the following Twitter thread.
It's been ~4 weeks. We spent the first 3 learning more about #rstats #bioinformatics & @Bioconductor. If you are interested in our bootcamp sessions, you can find the videos at https://t.co/uBVqZMfgkP and in general at https://t.co/JjZMQZsY1S @LieberInstitute @LIBDrstats ✌️🏽 https://t.co/uujaBguHMi
🇲🇽 Leonardo Collado-Torres (@lcolladotor), October 16, 2020
Here are the videos from YouTube.
Here are some highlighted overview videos that you might find useful as you get started.
If you are new to R and RStudio, you might like this overview video and companion notes.
Similarly, this overview video and companion notes on Bioconductor might be useful too.
This video on how to use sgejobs for submitting jobs at JHPCE could be useful too.
For a more advanced video on setting up bash loops and nested qsubs, check this second video.
If you need to use the GPU queue at JHPCE, then check this video and companion slides.
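For readers who have not yet watched the videos, here is a minimal sketch of the general idea behind combining bash loops with qsub: a nested loop that produces one job submission per combination of inputs. It is a dry run that only prints the commands. The region names, chromosomes, script name (analysis.sh), and memory values are hypothetical placeholders, not taken from the videos, and the resource names available through -l depend on how a given cluster is configured.

```shell
#!/bin/sh
# Dry-run sketch: print one qsub command per (region, chromosome) pair
# instead of submitting it. All names and values below are hypothetical.
set -eu

build_qsub_commands() {
    for region in DLPFC HIPPO; do
        for chr in chr1 chr2; do
            # -N sets the job name, -cwd runs the job from the current
            # directory, -l requests resources (illustrative values only).
            echo "qsub -N ${region}_${chr} -cwd -l mem_free=5G,h_vmem=5G analysis.sh ${region} ${chr}"
        done
    done
}

build_qsub_commands
```

Printing the commands first makes it easy to check the job names and arguments before anything reaches the queue. On a cluster, one option would be to review the printed lines and then pipe them to a shell to submit them.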
We have organized accessible bootcamps that have biologists in mind. If you spend most of your time coding, check the Team bootcamps further below. For all of these bootcamps, you should have the latest R and RStudio Desktop installed. You will typically need a computer with at least 8 GB of RAM.
| # | Date and time | Prerequisites | Topic |
|---|---|---|---|
| 1 | 2020-10-05 3-5 pm | R + RStudio | Differential expression analysis (LIBD-style) |
| 2 | 2020-10-06 3-5 pm | R + RStudio | Differential expression analysis (LIBD-style) |
| 3 | 2020-10-07 1-3 pm | R + RStudio | Differential expression analysis (LIBD-style) |
The team bootcamps are mainly for our team members and for collaborators with more R or programming experience. However, the material we cover is within reach of everyone as long as you practice using R/Bioconductor here and there. The concepts covered are useful to all of us who work with R, but we understand that you might not have as much time to learn these materials. If that’s the case, please feel free to sign up for our Data Science guidance sessions and we’ll help you learn these concepts at your own pace.
The first iteration of these bootcamps was run in September 2020 with the following schedule. For all of them, you should have the latest R and RStudio versions installed on your computer and be familiar with the R programming language. For a more structured working environment, we might use JHPCE’s computational resources while running RStudio on our computers and running code through a Linux terminal [10]. You will probably need to spend time self-learning and practicing some of the material beyond these videos. If you just started learning R, these bootcamps will be quite challenging.
| # | Date and time | Prerequisites | Topic |
|---|---|---|---|
| 1 | 2020-09-21 3-5 pm | NA | How to be a modern scientist |
| 2 | 2020-09-22 3-5 pm | R + RStudio | What they forgot to teach you about R |
| 3 | 2020-09-23 1-3 pm | R + RStudio | What they forgot to teach you about R |
| 4 | 2020-09-24 3-5 pm | R + RStudio | The Elements of Data Analytic Style + CBDS |
| 5 | 2020-09-28 3-5 pm | R + RStudio | The Elements of Data Analytic Style + CBDS |
| 6 | 2020-09-29 3-5 pm | Be a part of the DSgs-guides team | DSgs-guide training |
| 7 | 2020-09-30 1-3 pm | RStudio + R functions | Building Tidy Tools |
| 8 | 2020-10-01 3-5 pm | RStudio + R functions | Building Tidy Tools |
There are tons of resources that are useful for learning about R, Bioconductor, and data science in general. We have selected some of those resources because:
- we are familiar with them
- they are freely available or can be purchased for a small fee
- we have heard good things about them
There are more materials out there than those that we’ve had a chance to learn about, and with time, this list will change. Here is our latest list.
- Introductory level
- Book: R for Data Science by Garrett Grolemund and Hadley Wickham. Useful for learning about the tidyverse, which is supported by RStudio.
- Workshop: What they forgot to teach you about R (2020 version) by Kara Woo, Jenny Bryan, and Jim Hester.
- Workshop: Building Tidy Tools (2020 version) by Charlotte Wickham and Hadley Wickham.
We also highly recommend keeping an eye open for any new work from:
- Alison Presmanes Hill. Just look at her awesome projects website!
- Desirée De Leon. Note that Desirée started as an intern working with Alison, so it’s no surprise that her projects website is excellent.
- Allison Horst and in particular her stats-illustrations which are widely used and are very helpful when teaching statistical and R concepts.
Our colleagues at Hopkins have also created Coursera courses (MOOCs) which you might be interested in, though they might involve other fees. These are:
- Genomic Data Science Specialization by Steven Salzberg, Jeff Leek, James Taylor (1979-2020), Mihaela Pertea, Ben Langmead, Jacob Pritt, Liliana Florea, and Kasper Daniel Hansen. LIBD is an “industry partner” for this specialization.
- Data Science Specialization by Jeff Leek, Roger D. Peng, and Brian Caffo.
- Data Science: Foundations using R Specialization by Jeff Leek, Roger D. Peng, and Brian Caffo.
- Executive Data Science Specialization by Jeff Leek, Roger D. Peng, and Brian Caffo.
- The Unix Workbench by Sean Kross, Jeff Leek, Roger D. Peng, and Brian Caffo.
- Introduction to Data Science (at least 8 college-level courses)
- Data Analysis for the Life Sciences (at least 4 graduate level courses)
- Genomics Data Analysis (at least 3 graduate level courses)
[9] A lot of the resources listed are available through Leanpub, which is a publishing platform for books and courses. Authors get to set a recommended price, but also a minimum price that can be as low as $0, thus making their materials free to use.
[10] Check our LIBD rstats club videos on how to configure your macOS or Windows computer to work with JHPCE.