L. Collado-Torres

At the Lieber Institute for Brain Development (LIBD), I lead the R/Bioconductor-powered Team Data Science group. My team aims to better understand the roots and signatures of disease (particularly psychiatric disorders) by zooming in across dimensions of gene activity: from studying gene expression at all feature levels (genes, exons, exon-exon junctions, and un-annotated regions of expression) to using different gene expression measurement technologies (bulk RNA-seq, single cell/nuclei RNA-seq, and spatial transcriptomics) that provide finer biological resolution and localization of gene expression. I’m interested in both hypothesis-driven projects and building general resources such as recount2 that enable us to contextualize our findings across the whole public human gene expression landscape. I use the R programming language for nearly all my work and like to organize my code in R packages that I share mostly through the Bioconductor project, where I am a member of the Community Advisory Board. From my position at LIBD, I’m able to interact and collaborate with fantastic biologists, data scientists, and researchers at Johns Hopkins University and beyond. Furthermore, I officially help mentor LIBD employees in data science and R tools.

As a quick background, I graduated from the Undergraduate Program on Genomic Sciences at the National Autonomous University of Mexico (UNAM) in 2009 and worked for two years at Winter Genomics analyzing high-throughput sequencing data. I then earned a PhD in 2016 from the Department of Biostatistics at the Johns Hopkins Bloomberg School of Public Health thanks to a CONACyT scholarship. I worked with Jeff Leek and Andrew Jaffe on developing derfinder and recount. Afterwards, I worked for about 3.5 years as a Staff Scientist in Andrew Jaffe’s lab on a variety of data analysis projects.

Every day I use R and Bioconductor, and on some days I write R packages. Occasionally I blog about them and other tools. I’m a co-founder of the LIBD rstats club and of the CDSB community of R and Bioconductor developers in Mexico and Latin America, as we described on the R Consortium website.

If you want to join my team, please get in touch! ^_^ 💪🏽🇲🇽

Interests

  • Genomics
  • R programming
  • Biostatistics
  • Teaching
  • Diversity

Education

  • PhD in Biostatistics, 2016

    Johns Hopkins Bloomberg School of Public Health

  • Bachelor in Genomic Sciences (LCG), 2009

    National Autonomous University of Mexico (UNAM)

Fields that drive me

R

Genomics

Education

Community

Projects

spatial

Human brain spatial transcriptomics work using Visium from 10x Genomics

BrainSEQ™ Consortium

BrainSeq Consortium led by LIBD to understand the genetics and gene expression variability in schizophrenia

recount2

Uniform processing of human RNA-seq data to improve usability and power methods development

derfinder

Annotation-agnostic methods for gene expression data

Recent Publications and posters

† indicates corresponding author, * indicates equal contribution

Programmatic access to bacterial regulatory networks with regutools

RegulonDB has collected, harmonized and centralized data from hundreds of experiments for nearly two decades and is considered a point …

Recent Posts

Posts with the rstats category can also be found at RBloggers and R Weekly. Also check the LIBD rstats club where I am a contributor. You can also view posts grouped by category or tag.

Curriculum vitae

Download my CV or view it on GitHub.

Mentoring

Here you can find the students whom I’ve mentored as their advisor and/or who are currently working with me.

Alumni

Ashkaun Razmara

MPH 2017-2018

Neuroscience, Reproducibility

Amy Peterson

MPH 2017-2018

R programming, Biostatistics, Clinical Research

Teaching

LIBD

2020

  1. Instructor and member of the Organizing Committee for the CDSB Workshop 2020: Building workflows with RStudio and Bioconductor for single cell RNA-seq analysis
  2. Instructor of the Analyzing scRNA-seq data with Bioconductor course for LCG-EJ-UNAM students.

2019

  1. Instructor and member of the Organizing Committee for the CDSB Workshop 2019: How to Build and Create Tidy Tools

2018

  1. Keynote speaker and member of the Organizing Committee for the Latin American R/BioConductor Developers Workshop 2018

2016

  1. Biostatistics and Stata instructor at a workshop for Kandahar University Faculty, organized by Johns Hopkins University.
  2. Invited instructor for the Genomeeting 2016 course taught at INMEGEN, Mexico City, Mexico.

JHBSPH

2015-2016

  1. Teaching assistant and guest lecturer for Introduction to R for Public Health Researchers.
  2. Teaching assistant for Statistical Methods in Public Health I (140.621).
  3. Lead teaching assistant for Statistical Methods in Public Health II (140.622).
  4. Teaching assistant for the MPH capstone project.

2014-2015

  1. Lead teaching assistant for Statistical Methods in Public Health I (140.621) and II (140.622).
  2. Teaching assistant for the MPH capstone project.

2013-2014

  1. Teaching assistant for Statistical Methods in Public Health I (140.621) and II (140.622).
  2. Teaching assistant for the MPH capstone project. Developed a shiny application that allows students to sign up for a TA session (code) and wrote a report on the number of TA sessions, available here.

2012-2013

  1. Teaching assistant for Statistical Methods in Public Health I (140.621), II (140.622), III (140.623), and IV (140.624) courses.

UNAM

PDCB

While working at Winter Genomics I taught two courses for students of the Biomedical Sciences PhD Program (PDCB) from the National Autonomous University of Mexico (UNAM).

  1. Analysis of High-Throughput Sequencing data with Bioconductor, Aug-Dec 2010.
  2. Introduction to R and Biostatistics (along with two other teachers).

IBT

While I was at the Institute of Biotechnology (UNAM) working with the Winter Genomics crew, I organized two courses. One was a series of various bioinformatics and biology mini-courses, and the other involved members of different academic institutions.

  1. Introduction to R for bench biologists, Oct-Nov 2009. This mini-course has quite a bit of material on learning how to make plots with R.
  2. Statistical Methods and Analysis of Genomic Data, Jan 2010. This one-week course had lectures about Perl, using a cluster, high-throughput technologies, R and Bioconductor, C, and biology overviews.

LCG

I taught three courses as an undergraduate in the Undergraduate Program on Genomic Sciences (LCG). Each of these courses has its own website organizing the material. These are:

  1. Intensive course on R/Bioconductor, Oct-Nov 2008
  2. Principles of Statistics, Feb-June 2009
  3. Seminar III: R/Bioconductor, Aug-Dec 2009

Contact

If you have questions about the R/Bioconductor packages I maintain, please read this post. If you send me an email, I’ll simply refer you to the same post.