L. Collado-Torres

At the Lieber Institute for Brain Development, I am part of the Data Science Team I led by Andrew E Jaffe. My research aims to better understand the roots and signatures of disease (particularly psychiatric disorders) by zooming in across dimensions of gene activity: from studying gene expression at all feature levels (genes to exons to exon-exon junctions and un-annotated regions of expression), to using different gene expression measurement technologies (from bulk RNA-seq and single cell/nuclei RNA-seq to spatial transcriptomics) that provide finer biological resolution and localization of gene expression. I’m interested in both hypothesis-driven projects and building general resources such as recount2 that enable us to contextualize our findings across all of the public human gene expression landscape. I use the R programming language for nearly all my work and like to organize my code in R packages that I share mostly through the Bioconductor project, where I’m part of the Community Advisory Board. From my position at LIBD, I’m able to interact and collaborate with fantastic biologists, data scientists, and researchers at Johns Hopkins University and beyond. Furthermore, I officially help mentor LIBD employees in data science and R tools.

As a quick background, I graduated from the Undergraduate Program on Genomic Sciences from the National Autonomous University of Mexico (UNAM) in 2009 and worked for two years at Winter Genomics analyzing high-throughput sequencing data. I then got a PhD in 2016 from the Department of Biostatistics at Johns Hopkins Bloomberg School of Public Health thanks to a CONACyT scholarship. I worked with Jeff Leek and Andrew Jaffe on developing derfinder and recount. I then worked for ~3.5 years as a Staff Scientist in Andrew Jaffe’s lab on a variety of data analysis projects.

Every day I use R and Bioconductor, and on some days I write R packages. Occasionally I blog about them and other tools. I’m a co-founder of the LIBD rstats club and the CDSB community of R and Bioconductor developers in Mexico and Latin America, as we described on the R Consortium website.

If you want to join my team, please get in touch! ^_^ 💪🏽🇲🇽


  • Genomics
  • R programming
  • Biostatistics
  • Teaching
  • Diversity


  • PhD in Biostatistics, 2016

    Johns Hopkins Bloomberg School of Public Health

  • Bachelor in Genomic Sciences (LCG), 2009

    National Autonomous University of Mexico (UNAM)


that drive me







BrainSEQ™ Consortium

BrainSeq Consortium led by LIBD to understand the genetics and gene expression variability in schizophrenia


Uniform processing of human RNA-seq data to improve usability and power methods development


Annotation-agnostic methods for gene expression data

Recent Publications

and posters


† indicates corresponding author, * indicates equal contribution

Transcriptome-scale spatial gene expression in the human dorsolateral prefrontal cortex

We used the 10x Genomics Visium platform to define the spatial topography of gene expression in the six-layered human dorsolateral …

Favorite talks

From learning to using to teaching to developing R

Keynote to kickoff the CDSB Mexico 2018 R/Bioconductor workshop

Annotation-agnostic differential expression and binding analyses

L. Collado-Torres’s Johns Hopkins Bloomberg School of Public Health, Department of Biostatistics Ph.D. defense talk

Recent & Upcoming Talks

Analyzing BrainSeq Phase II and generating the recount-brain resource

Update on BrainSeq Phase II and recount-brain for the LIBD 2019 staff seminar series

Reproducible RNA-seq analysis with recount2

Junior Research Symbiont Award Presentation for Excellence in Data Sharing at PSB2019

Reproducible RNA-seq analysis with recount2 and recount-brain

Work in progress presentation on recount-brain for the Joint Genomics Meeting

Reproducible RNA-seq analysis with recount2 and recount-brain

Guest lecture for LCG-UNAM spanning some recent research

Recent Posts

Posts with the rstats category can also be found at RBloggers and R Weekly. Also check the LIBD rstats club where I am a contributor. You can also view posts grouped by category or tag.

You just committed a large file and can't push to GitHub

Oh ohh! 😱 What do you do now? The data my colleagues and I work with is typically too big for our personal computers, so we use a high performance computing environment (cluster) and mostly interact with it through the command line terminal. As you might know, I’m a big fan of version control and I use git plus GitHub for sharing our code. That’s why I’ve been encouraging others to use it for a while, and when they do, they come to me if they have any issues.

Research Scientist: an academic career launch pad

After a long start to 2020 including the past four very busy weeks, I’m happy to announce that today, March 16th 2020, I accepted a position as Research Scientist at the Lieber Institute for Brain Development in Baltimore, MD, USA. What will I do as a Research Scientist at LIBD? At LIBD we currently have the following scientific ranks: Research Technician; Research Assistant; Research Associate; Staff Scientist I, II and III; Research Scientist (+ Lead and Senior); and Investigator (+ Lead and Senior). Research Scientists carry out research, do so in a scholarly manner, are tasked with being creative, are encouraged to seek funding, and can have supervisory and mentor roles.

Diving together into the unknown world of spatial transcriptomics

Yesterday was an extremely exciting day for me and my colleagues. We finished a project we had been working on and shared it with the world. Meaning, it’s done and we can relax for a little bit while we wait for feedback from our peers. But this was not just any project, at least not for me. Why, you ask? In general terms, it involved an analysis that you could not search on Google and find the answer for.

Learning from our search history

Origin of the idea: Recently the team I work with has had a few new members and I’ve been thinking lately of ways we could try to help them. The team leader was traveling this week, which gave me the opportunity to come up with a new type of session and test it out. That’s the origin of this learning from our search history idea. We tested it today and I’m quite happy with the results so far, so I thought it would be useful to document what we did and share it with others.

Conference feelings: from newbie to sponsor

In the summer of 2008, nearly 12 years ago, I attended my first R/Bioconductor conference: BioC2008. Just last week I went to my second rstudio::conf(2020), which I greatly enjoyed. After some tweet exchanges today, I started reflecting on my journey and wanted to share my thoughts. Why I like going to conferences: I typically enjoy going to conferences, though I also end up exhausted. Part of it could be the traveling and all that goes with it, but I think that conferences are mostly mentally taxing.

Curriculum vitae

Download my CV or view it on GitHub.


Here you can find the students whom I’ve mentored as their advisor and/or who are currently working with me.



Amy Peterson

MPH 2017-2018

R programming, Biostatistics, Clinical Research


Ashkaun Razmara

MPH 2017-2018

Neuroscience, Reproducibility




  1. Instructor and member of the Organizing Committee for the CDSB Workshop 2019: How to Build and Create Tidy Tools


  1. Keynote speaker and member of the Organizing Committee for the Latin American R/BioConductor Developers Workshop 2018


  1. Biostatistics and Stata instructor at a workshop for Kandahar University Faculty, organized by Johns Hopkins University.
  2. Invited instructor for the Genomeeting 2016 course taught at INMEGEN, Mexico City, Mexico.



  1. Teaching assistant and guest lecturer for Introduction to R for Public Health Researchers.
  2. Teaching assistant for Statistical Methods in Public Health I (140.621).
  3. Lead teaching assistant for Statistical Methods in Public Health II (140.622).
  4. Teaching assistant for the MPH capstone project.


  1. Lead teaching assistant for Statistical Methods in Public Health I (140.621) and II (140.622).
  2. Teaching assistant for the MPH capstone project.


  1. Teaching assistant for Statistical Methods in Public Health I (140.621) and II (140.622).
  2. Teaching assistant for the MPH capstone project. Developed a shiny application that allows students to sign up for a TA session (code) and wrote a report of the number of TA sessions available here.


  1. Teaching assistant for Statistical Methods in Public Health I (140.621), II (140.622), III (140.623), and IV (140.624) courses.



While working at Winter Genomics I taught two courses for students of the Biomedical Sciences PhD Program (PDCB) from the National Autonomous University of Mexico (UNAM).

  1. Analysis of High-Throughput Sequencing data with Bioconductor Aug-Dec 2010.
  2. Introduction to R and Biostatistics (along with two other teachers).


While I was at the Institute of Biotechnology (UNAM) working with the Winter Genomics crew I organized two courses. One was a series of various bioinformatics and biology mini-courses and another one involved members of different academic institutions.

  1. Introduction to R for bench biologists Oct-Nov 2009. This mini-course has quite a bit of material on learning how to make plots with R.
  2. Statistical Methods and Analysis of Genomic Data Jan 2010. This one-week course had lectures about Perl, using a cluster, high-throughput technologies, R and Bioconductor, C, and biology overviews.


I taught three courses during my undergrad stage at the Undergraduate Program on Genomic Sciences (LCG). Each of these courses has its own website organizing the material. These are:

  1. Intensive course on R/Bioconductor Oct-Nov 2008
  2. Principles of Statistics Feb-June 2009
  3. Seminar III: R/Bioconductor Aug-Dec 2009


If you have questions about the R/Bioconductor packages I maintain, please read this post. If you send me an email, I’ll simply refer you to the same post.