Where do I start using Bioconductor?

2014-10-16

I was recently asked "where do I get started with Bioconductor?" and thought the answer would make a good short post.

What is BioC?

Briefly, Bioconductor (Gentleman, Carey, Bates, and others, 2004) is an open source project that hosts a wide range of tools for analyzing biological data with R (R Core Team, 2014). These analysis tools are bundled into packages which are designed to answer specific questions or to provide key infrastructure. If this sounds like something you are interested in, visit bioconductor.org.

Obviously, you need to know the basics about R in order to use Bioconductor.

Getting started

bioconductor.org has a section on its front page titled "Get started with Bioconductor". There you will find links that explain how to install it and how to explore the available packages.
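For readers who want to see the installation step spelled out, here is a minimal sketch. It assumes a current R installation, where the BiocManager package from CRAN is the supported installer; when this post was written, the documented route was the biocLite() script instead. The helper name install_bioc is hypothetical and only wraps the two real calls.

```r
# Hypothetical helper (not part of Bioconductor): install one or more
# Bioconductor packages, bootstrapping BiocManager from CRAN if needed.
install_bioc <- function(pkgs) {
  if (!requireNamespace("BiocManager", quietly = TRUE)) {
    install.packages("BiocManager")
  }
  BiocManager::install(pkgs)
}

# Example call (needs an internet connection, so it is left commented out):
# install_bioc("DEXSeq")
```

Wrapping the calls in a function keeps the sketch safe to source without triggering a download; uncomment the last line when you actually want to install something.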

You have a use case

If you have a particular use case in mind, I recommend browsing the software packages and searching for some key words. For example, you might be interested in high-throughput sequencing of RNAs, and if you search for RNAseq or RNA-seq you will find a good set of packages to start with. Alternatively, use the biocViews tree menu to explore specific categories of packages.
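If you would rather search from inside R, something along these lines might work. It is only a sketch: it assumes the BiocManager package is installed, and it matches against package names only, so it is much cruder than the keyword search and biocViews menu on the website. The helper name find_bioc_packages is hypothetical.

```r
# Hypothetical helper: list available package names matching a pattern.
# BiocManager::available() queries the package repositories, so it needs
# an internet connection.
find_bioc_packages <- function(pattern) {
  if (!requireNamespace("BiocManager", quietly = TRUE)) {
    message("BiocManager is not installed; install it from CRAN first.")
    return(character(0))
  }
  BiocManager::available(pattern)
}

# Example call (left commented out because it goes online):
# find_bioc_packages("RNA")
```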

Once you find a set of packages whose descriptions appeal to you, explore their vignettes. These are either PDF or HTML documents that explain to new users what the package does. They also show how to tie together the different functions in the package, which is a key piece of information. For example, in the RNA-seq search results you will find the DEXSeq package. DEXSeq (Anders, Reyes, and Huber, 2012) has a vignette called Analyzing RNA-seq data for differential exon usage with the "DEXSeq" package, and from the package's page you can access the PDF vignette.
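Vignettes can also be reached from within R once a package is installed. Here is a small sketch using the base R functions vignette() and browseVignettes(); the helper name list_vignettes is hypothetical, and DEXSeq is used only because it is the example above (it has to be installed for anything to be listed).

```r
# Hypothetical helper: return the vignette names and titles of an
# installed package, or NULL (with a message) if it is not installed.
list_vignettes <- function(pkg) {
  if (!nzchar(system.file(package = pkg))) {
    message("Package '", pkg, "' is not installed.")
    return(invisible(NULL))
  }
  vignette(package = pkg)$results[, c("Item", "Title"), drop = FALSE]
}

print(list_vignettes("DEXSeq"))

# To open the vignette index of an installed package in your browser:
# browseVignettes("DEXSeq")
```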

Then it's just a matter of exploring other packages, checking the vignettes and learning as you go.

You don't have a use case

If you don't have a specific use case in mind, it might pay off to start by exploring the Bioconductor workflows. These documents explain how to use different packages to accomplish specific types of analyses. They are a great way to learn what you can do with Bioconductor!

Another option is to look at the previous courses. For example, under the 2008 courses you'll find the course R/Bioconductor Curso Intensivo (Spanish), which I taught back in the day. As much as I would like to promote myself, the best starting point is the most recent BioC20XX course: BioC2014. It has slides showcasing some of the newest packages and tutorials on how to use them.

An alternative is to look at some of the Bioconductor publications, which include books about Bioconductor and research papers describing some of the packages.

Once you find a set of packages that catch your eye, go look at their vignettes, just like I explained in the "You have a use case" scenario.

Help tips

It's not a matter of whether you will need help learning how to use Bioconductor. It's just a matter of when. So don't feel bad about having to ask for help!!

The very first place to start is the Help section at the bottom of bioconductor.org. For example, you can find YouTube videos contributed under the community section. There you can also find links to other blog posts explaining how to use Bioconductor. Take a peek at the other sections under Help before using the Bioconductor support site: that is where you can ask very specific questions and interact with the maintainers of the packages you are using.

Finally, if you are interested in new developments, then check the latest newsletter, for example the October 2014 one.

Good luck using Bioconductor!

References

Citations made with knitcitations (Boettiger, 2014).

[1] S. Anders, A. Reyes and W. Huber. “Detecting differential usage of exons from RNA-seq data.” In: Genome Research 22 (2012), p. 4025. DOI: 10.1101/gr.133744.111.

[2] C. Boettiger. knitcitations: Citations for knitr markdown files. R package version 1.0.2. 2014. URL: https://github.com/cboettig/knitcitations.

[3] R. C. Gentleman, V. J. Carey, D. M. Bates and others. “Bioconductor: Open software development for computational biology and bioinformatics.” In: Genome Biology 5 (2004), p. R80. URL: http://genomebiology.com/2004/5/10/R80.

[4] R Core Team. R: A Language and Environment for Statistical Computing. R Foundation for Statistical Computing. Vienna, Austria, 2014. URL: http://www.R-project.org/.

Want more?

Check other @jhubiostat student blogs at Bmore Biostats as well as topics on #rstats.

End of the summer blogger challenge

2014-10-03

This summer a few of us at Bmore Biostats agreed to a summer iron blogger challenge. We either had to blog every week or every two weeks. The penalty for not publishing a post in our chosen timeframe was to donate 5 or 10 USD (respectively) to our charity of choice.

I liked the idea of the blogging challenge and thought it would help keep me motivated to keep my blog active during the summer. However, I totally failed and only posted twice during the 16-week summer. That is, I missed 14 weeks and have to donate 14 * 5 = 70 USD. I think that most of us blogged a lot less than we were hoping to and potentially owe a charity some money. But well, there is no real obligation to donate and I don't expect others to do so. However, I want to do it. Doing so will help me take an iron blogger challenge more seriously next time.

Kiva

As my charity of choice, I ended up picking Kiva. Through them you can loan 25 USD to an individual or group that needs the money to improve their business, home, etc. And in general, you get your money back. Alternatively, you can donate your money from the get-go or keep re-loaning it instead of withdrawing it to your bank account.

It is my first time doing something like this, so I am going to ask for the money back and then re-loan it as I see fit. So, because strictly speaking I am not donating it, I decided to increase my entry to 100 USD. That is, 4 Kiva loans of 25 USD each.

Kiva recommends lending to different individuals, Kiva partners, and countries to diversify your risk. Being born in Mexico, I wanted to specifically help people from there. If you restrict the search to Mexico, currently there are 13 open loans. From them, I chose:

  • Aurea, whose loan is administered by Eblock International. She wants to install a floor in her room. It seems like she has used other Kiva loans to slowly build her house.
  • Sonia, whose loan is administered by Kubo Financiero. She wants to buy recycling materials for her business.
  • Mujeres Mazahuas Group, who are re-stocking their grocery business. The loan is administered by VisionFund Mexico, who have gotten loans for over 6 million since 2009. That's a large chunk of the 16 million that Kiva has sent to Mexico.
  • Gabino, whose loan is administered by Sistema Biobolsa. He wants a biodigester, which will help him in his agriculture business.

Hopefully the loans will be used properly! But as my friend who has experience with micro-banks in Mexico tells me, it is very hard to know that. We'll see what happens.

In the meantime, here is my receipt.

Join Kiva

For whatever it's worth, here is my Kiva invitation link: kiva.org/invitedby/fellgernon.
