Introducing R and Biostatistics to first year LCG students (2012 version)
On Friday, November 9th, I’ll be giving a talk to the first-year students from the Undergraduate Program on Genomic Sciences (LCG in Spanish) during their “Seminar 1: Introduction to Bioinformatics” course. I did the same a year ago, as I documented in my post Introducing Biostatistics to first year LCG students.
Well, this time I’ll change things a bit. I’m allowed to require the students to read 2-3 papers before my talk to introduce them to my field. I’ll do so, but in a slightly unusual way: instead of reading, they’ll have to watch a few videos I selected. So, without further ado, here are the three required “papers”:
Here is “paper 1” (~30 minutes). The goal is to introduce you to the basic workings of R and also to great sources of R videos.
First, learn to install R (watch it in full screen).
Or you can watch either of the following two videos:
Next, learn about RStudio and why it’s a great place to start (watch it in HD and full screen).
Now you are ready to learn how to create a variable in R. Use RStudio instead of the R GUI to do so.
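If you want to try it yourself while watching, here is a minimal sketch you can type into the RStudio console. The variable names and values are my own made-up examples, not the ones used in the video.

```r
# Create a variable with the assignment operator `<-`
x <- 5

# Variables can hold text too (made-up example value)
course <- "Seminar 1"

# Use a variable in a calculation and store the result in a new variable
y <- x * 2

# Typing a variable's name (or using print) shows its value
print(y)
```

In RStudio, each new variable also shows up in the workspace/environment pane, which is a handy way to keep track of what you have created.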
Next, learn the very basics of the base R plotting system.
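To give you an idea of how little code a first plot needs, here is a small sketch with made-up numbers (they are only for illustration and do not come from the video).

```r
# Made-up data: the numbers 1 to 10 and their squares
x <- 1:10
y <- x^2

# A basic scatterplot of y against x
plot(x, y)
```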
Now you are ready to learn how to use the combine function, c().
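As a quick sketch of what the combine function does, the lines below build a few vectors with c(). The values are invented for illustration.

```r
# Combine several numbers into one numeric vector
heights <- c(1.60, 1.75, 1.82)

# It works for text as well
bases <- c("A", "C", "G", "T")

# You can also combine an existing vector with new values
more_heights <- c(heights, 1.68)

# Functions such as length() and mean() then work on the whole vector
length(more_heights)
mean(heights)
```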
Next, learn about data.frame objects
and how to add new variables to them.
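Here is a minimal sketch of both steps: creating a data.frame and then adding a new variable (column) to it. The student names and ages are hypothetical.

```r
# Build a small data.frame: each argument becomes a column
students <- data.frame(
  name = c("Ana", "Luis", "Sofia"),
  age = c(18, 19, 18)
)

# Add a new variable with the `$` operator
students$age_in_months <- students$age * 12

# Print the result: three rows and three columns
students
```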
Next up is learning how to find help.
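These are some of the built-in ways of asking R for help, as a short sketch. I picked mean and data.frame as examples; any function name works the same way.

```r
# Open the help page for a function
?mean

# The same thing, written as a function call
help("data.frame")

# Find objects whose name contains a given word
matches <- apropos("mean")
matches

# Run the examples from a help page
example(mean)
```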
Almost there. Now check how to change your current working directory.
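You can also do this from the console. The sketch below uses R's temporary directory as a stand-in for whatever folder you actually want to work in, so it runs on any computer.

```r
# Where am I right now?
old_dir <- getwd()
old_dir

# Move to another directory (tempdir() is just a placeholder here)
setwd(tempdir())
getwd()

# List the files in the new working directory
list.files()

# Go back to where we started
setwd(old_dir)
```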
Finally, learn how to install and load a package in R.
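In code, the two steps look roughly like this. I use the lattice package as the example because it comes up later in this post; the video may use a different one. The CRAN mirror URL is my own choice, and the check with requireNamespace() simply skips the download when the package is already installed.

```r
# Install the package only if it is not available yet (needs internet)
if (!requireNamespace("lattice", quietly = TRUE)) {
  install.packages("lattice", repos = "https://cloud.r-project.org")
}

# Load the package so its functions can be used in this session
library(lattice)
```

You only need to install a package once, but you have to load it with library() in every new R session.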
If you are more curious regarding the origins of R check the next video (not part of “paper 1”).
Next, “paper 2” (~39 minutes). The goal here is to get a feeling of how you can use R to create plots.
First, start with this demonstration of the basic R plotting tools (called “base graphics”). It explains in enough detail how the basic plotting system works and how you can customize the colors, layout, etc. To get used to the tool, I recommend that you follow along with this video in RStudio. Also, you’ll want to watch it in 720p.
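As a small taste of that kind of customization, here is a sketch with simulated data (my own example, not taken from the video) that changes colors, point symbols, titles, and the layout.

```r
# Simulated data, with a fixed seed so you get the same numbers
set.seed(1)
x <- rnorm(50)
y <- x + rnorm(50, sd = 0.5)

# Layout: put two plots side by side (1 row, 2 columns)
par(mfrow = c(1, 2))

# Scatterplot with custom color, point symbol, title and axis labels
plot(x, y, col = "blue", pch = 19,
     main = "Scatterplot", xlab = "x", ylab = "y")

# Histogram with a custom fill color
hist(x, col = "lightgreen", main = "Histogram of x")

# Reset the layout to a single plot
par(mfrow = c(1, 1))
```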
Now check the demo for plotting with the lattice package. This is more advanced, but it should also be more illustrative of the power you have with R. Plus, it shows how we can expand the functionality of R by using packages contributed by the community and freely available for us to use.
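To see what lattice adds, here is a minimal sketch using the iris data set that ships with R. The data set and the variables are my choice for illustration and are not necessarily the ones shown in the demo.

```r
# Load the lattice package
library(lattice)

# One scatterplot panel per species, all built from a single formula:
# y ~ x | grouping variable
p <- xyplot(Sepal.Length ~ Petal.Length | Species,
            data = iris, layout = c(3, 1))

# Lattice plots are objects; printing them draws the figure
print(p)
```

Getting the same three-panel figure with base graphics would take a loop or several separate plot() calls, which is part of why lattice is worth a look.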
Finally, “paper 3” (~28 minutes). This is the first lecture from a course by Brian Caffo in which he goes over the definition and overall motivation behind Biostatistics. It should be much more fun to watch than reading a review paper in the area.
Now, for those motivated to learn more, I recommend some of my own posts summarizing information that can be useful to you.
- JHSPH-Biostat through Coursera and An Online Bioinformatics Curriculum
- Setting up your computer for bioinformatics/biostatistics and a compendium of resources
- Motivation behind using a version control system and Introducing Git while making your academic webpage
- Why aren’t all of our graphs interactive? and Visualizing colors()
- P-values and Statistics philosophy
- The new visualization package for genome data in Bioconductor: ggbio