Translational Neuroscience Division, Data Science I

JHPCE: lieber_lcolladotor

Lieber Institute for Brain Development

R/Bioconductor-powered Team Data Science

At the Lieber Institute for Brain Development (LIBD), as part of the Translational Neuroscience Division, our group works on understanding the roots and signatures of disease (particularly psychiatric disorders) by zooming in across dimensions of gene activity. We achieve this by studying gene expression at all expression feature levels (genes, exons, exon-exon junctions, and un-annotated regions) and by using different gene expression measurement technologies (bulk RNA-seq, single cell/nucleus RNA-seq, and spatial transcriptomics) that provide finer biological resolution and localization of gene expression. We work closely with collaborators from LIBD as well as from Johns Hopkins University (JHU), University of Cambridge, and other institutions, which reflects the cross-disciplinary approach and diversity in expertise needed to further advance our understanding of high-throughput biology.

To provide a supportive and stimulating research environment at LIBD, our group offers Data Science guidance sessions open to any LIBD staff member and organizes the LIBD rstats club, among other initiatives. Our documentation book website contains more details on onboarding, how to ask for help, bootcamps, writing papers, authorship, configuration files, and much more.

We constantly create new content to share what we are learning or working on, which you might be interested in. In particular, we:

run the LIBD rstats club: rstats club schedule
discuss papers and software in our team meetings: team meeting schedule

Join the team

If you are interested in joining the R/Bioconductor-powered Team Data Science group, please check our open positions at the LIBD career opportunities website. You might also want to check our anonymous team survey results, which highlight some strengths as well as some weaknesses and areas where we can improve.

If we don’t have any open positions, please reach out to Leonardo with your CV, GitHub/GitLab/etc profile with open-source software, and a short description of why you are interested in our team.

Fields that drive us

R

Genomics

Education

Community

Team

Principal Investigator

Leonardo Collado-Torres

Investigator @ LIBD, Assistant Professor, Department of Biostatistics @ JHBSPH

Genomics, R programming, Biostatistics, Teaching, Diversity

LIBD Staff

Louise A. Huuki-Myers

Research Associate 2020-2022, Staff Scientist I, Data Science 2022-ongoing, PhD Student 2024-ongoing

bulk and single cell RNA-seq, spatial transcriptomics, RNAscope data, DEGs, data visualization

Manisha Barse

Research Assistant I, Data Science 2024-ongoing

bulk RNA-seq, scRNA-seq, spatial transcriptomics, building computational pipelines

Nicholas J. Eagles

Research Assistant 2018-2021, Research Associate I 2021-2024, Research Associate II 2024-ongoing

bulk and single cell RNA-seq, spatial transcriptomics, building computational pipelines

Remote Members

Daianna Gonzalez-Padilla

LIBD Summer Intern 2022, Intern 2022-ongoing

Pharmaceutics, Drug discovery, Pharmacogenomics

Gabriel Ramírez-Vilchis

Intern 2025-ongoing

Transcriptomics, Neuroimmunology, R programming

Melissa Mayén-Quiroz

Intern 2023-ongoing

Developmental biology, Transcriptomics - Epigenetics, ML

Alumni

Amy Peterson

JHBSPH MPH 2017-2018

R programming, Biostatistics, Clinical Research

Arta Seyedian

Research Associate 2020-2021

transcriptomics, human evolutionary genomics

Ashkaun Razmara

JHBSPH MPH 2017-2018

Neuroscience, Reproducibility

Bernardo Chombo Álvarez

Intern 2024-2025

Brenda Pardo

LCG-UNAM-EJ 2020-2021

R programming, Regeneration, Stem cells

Cynthia S. Cardinault

Staff Scientist I, Data Science 2023-2025

bulk RNA-seq, workflows, machine learning methodologies

Hedia Tnani

Staff Scientist I, Data Science 2022-2024

bulk and single cell RNA-seq, building computational pipelines

Joshua M. Stolz

Research Associate 2018-2022

RNA-seq, gene-networks, data processing/pipeline building

Renee Garcia-Flores

Contractor 2023 Feb-Aug

R programming, Human diseases, Science communication

Projects

Last updated on Tue, May 9, 2023

deconvolution

Deciphering proportions of cell types on human brain postmortem bulk RNA-seq data using snRNA-seq data as the reference

Last updated on Tue, May 9, 2023

recount3

recount3: summaries and queries for large-scale RNA-seq expression and splicing

Last updated on Tue, May 9, 2023

spatial

Human brain spatial transcriptomics work using Visium from 10x Genomics

Last updated on Tue, May 9, 2023

BrainSEQ™ Consortium

BrainSeq Consortium led by LIBD to understand the genetics and gene expression variability in psychiatric disorders including schizophrenia, bipolar disorder, and major depressive disorder.

Last updated on Tue, May 9, 2023

recount2

Uniform processing of human RNA-seq data to improve usability and power methods development

Last updated on Tue, May 9, 2023

derfinder

Annotation-agnostic methods for gene expression data

Favorite talks

Leonardo Collado-Torres

Last updated on Tue, Jun 20, 2023

Navigating human brain gene expression measurements at different resolutions to study psychiatric disorders

Invited seminar by Mina Ryten

Joshua M. Stolz, Nicholas J. Eagles, Louise A. Huuki-Myers, Leonardo Collado-Torres

Last updated on Tue, May 9, 2023

R/Bioconductor-powered Team Data Science (TLDR 2022)

For an overview of our recent work, check this video and companion slides.

Leonardo Collado-Torres

Last updated on Fri, May 14, 2021

Promoting the next wave of R/Bioconductor developers in Latin America starting in Mexico

Presentation about our work at CDSB for the JSM 2020 session "Show Me the Data: Making Statistics and Data Science More Diverse and Inclusive in 2020", organized by Stephanie Hicks.

Leonardo Collado-Torres

Last updated on Fri, May 14, 2021

Annotation-agnostic differential expression and binding analyses

L. Collado-Torres’s Johns Hopkins Bloomberg School of Public Health, Department of Biostatistics Ph.D. defense talk

All talks

Lessons From Working On The Edge Of Human Brain Spatially-Resolved Transcriptomics

Spatial Biology West Coast US 2024

Leonardo Collado-Torres

Last updated on Wed, Dec 11, 2024

Benchmarking cell type deconvolution methods with human brain data

I enjoyed presenting our deconvolution benchmark study for the #SingleCell genomics webinar series for Latin America organized with @eventsWCS support 📜 https://t.co/hUW5cz3ubm led by @lahuuki & https://t.co/jXHOzyOrf8 Slides by @lahuuki + minor updates https://t.

Leonardo Collado-Torres

Last updated on Sun, Nov 10, 2024

Lessons from spatially-resolved transcriptomics of postmortem human brain data projects

Leonardo Collado-Torres

Last updated on Mon, Aug 12, 2024

2024 Current Topics in Biostatistics guest seminar

Thanks @kenzhou86 & Margaret Taub for inviting me to present at the Current Topics in Biostatistics seminar. It was fun reminiscing about the past & sharing my experience. Got them to laugh a few times ^^

Leonardo Collado-Torres

Last updated on Mon, Feb 26, 2024

Lessons from spatially-resolved transcriptomics of postmortem human brain data projects

Join me and others for free** at #FOGBoston October 4-5 2023 https://t.co/5BGgkBKZuN ⚾️🧦🎉 I'll talk about our work at @LieberInstitute @jhubiostat on spatially-resolved transcriptomics using #Visium #VisiumSPG by @10xGenomics 🎫 https://t.

Leonardo Collado-Torres

Last updated on Mon, Feb 26, 2024

See all

Leonardo’s recent publications

and posters

Quickly discover relevant content by filtering publications.

† indicates corresponding author, * indicates equal contribution

You can also find Leonardo’s publications list at NCBI, ORCiD, and Google Scholar.

An integrated single-nucleus and spatial transcriptomics atlas reveals the molecular landscape of the human hippocampus

Cell types in the hippocampus with unique morphology, physiology and connectivity serve specialized functions associated with cognition …

Jacqueline R. Thompson*, Erik D. Nelson*, Madhavi Tippani*, Anthony D. Ramnauth, Heena R. Divecha, Ryan A. Miller, Nicholas J. Eagles, Elizabeth A. Pattie, Sang Ho Kwon, Svitlana V. Bach, Uma M. Kaipa, Jianing Yao, Christine Hou, Joel E. Kleinman, Leonardo Collado-Torres, Shizhong Han, Kristen R. Maynard, Thomas M. Hyde, Keri Martinowich, Stephanie C. Page†, Stephanie C. Hicks†

An integrated single-nucleus and spatial transcriptomics atlas reveals the molecular landscape of the human hippocampus

In the human brain, the dorsal anterior cingulate cortex (dACC) plays key roles in various components of cognitive control, and is …

Kinnary Shah, Michael S. Totty, Svitlana V. Bach, Madeline R. Valentine, Atharv Chandra, Heena R. Divecha, Ryan A. Miller, Sang Ho Kwon, Anthony D. Ramnauth, Madhavi Tippani, Sanjana Tyagi, Joel E. Kleinman, Leonardo Collado-Torres, Shizhong Han, Thomas M. Hyde, Stephanie C. Page, Kristen R. Maynard, Stephanie C. Hicks†, Keri Martinowich†

BDNF-DT and BDNF-AS-DT: Novel Genes in the BDNF locus

Divergent transcription from bidirectional promoters is frequently observed in eukaryotic genomes, but the biological relevance of …

Svitlana V. Bach*, Giovanna Punzi*, Nuri E. Smith, Sreya Mukherjee, Joo Heon Shin, Qiang Chen, Geo Pertea, Leonardo Collado-Torres, Kristen R. Maynard, Stephanie C. Page, Joel E. Kleinman, Thomas M. Hyde, Daniel R. Weinberger, Keri Martinowich†, Gianluca Ursini†

lute: estimating the cell composition of heterogeneous tissue with varying cell sizes using gene expression

Background: Relative cell type fraction estimates in bulk RNA-sequencing data are important to control for cell composition differences …

Sean K. Maden, Louise A. Huuki-Myers, Sang Ho Kwon, Leonardo Collado-Torres, Kristen R. Maynard, Stephanie C. Hicks

Large-scale transcriptomic analyses of major depressive disorder reveal convergent dysregulation of synaptic pathways in excitatory neurons

Major Depressive Disorder (MDD) is a common, complex disorder that is a leading cause of disability worldwide and a significant risk …

Fernando S. Goes, Leonardo Collado-Torres, Peter P. Zandi, Louise A. Huuki-Myers, Ran Tao, Andrew E. Jaffe, Geo Pertea, Joo Heon Shin, Daniel R. Weinberger, Joel E. Kleinman, Thomas M. Hyde

See all publications

Leonardo’s recent blog posts

Posts with the rstats category can also be found at RBloggers and R Weekly. Also check the LIBD rstats club where Leonardo is a contributor. You can also view posts grouped by category or tag.

Last updated on Fri, Mar 14, 2025 8 min read rstats, LIBD

Initial impressions from testing Posit Connect

Two weeks ago I finished testing Posit Connect as a replacement for shinyapps.io. At the end of the trial period, I presented my conclusions during a LIBD rstats club session. Here you can read the detailed notes and watch the resulting video:

Last updated on Fri, Mar 7, 2025 11 min read UNAM

Teaching omics methods to LCG-UNAM students

Notes from a second-semester introductory class to the world of omics research techniques at LCG-UNAM, my undergrad alma mater. Thank you for the invitation! Valentina Arias Ojeda is a teaching assistant for the class “Omics Research Techniques”, taught by Dr.

Last updated on Wed, Nov 13, 2024 7 min read UNAM

Genomics scientists from and in Mexico community building

I recently gave a remote seminar presentation for the Applications of Genomics 2024-2025 course organized by Esperanza Martínez Romero and Alejandra Zayas Del Moral. The idea was to talk about my career and showcase some recent research work from my team.

Leonardo Collado-Torres

Last updated on Sat, May 25, 2024 13 min read Science, LIBD

HumanPilot: first spatially-resolved transcriptomics study using Visium

Would you prefer a video walkthrough over reading this blog post? Check out this explainer video 🎥 A few years ago now (2021) we published a study we refer to with a very generic name: HumanPilot (Maynard, Collado-Torres, Weber, Uytingco et al.

Leonardo Collado-Torres

Last updated on Fri, May 24, 2024 24 min read Computing, LIBD, rstats

How to reduce the size of a large GitHub repo

This work was done by Leo with Erik Nelson and Ryan Miller. Thank you for your time and input! Oh ohh! My GitHub repository is huge! 😱 What do I do now?

See all posts

Courses taught by Leonardo

LIBD

2026

Instructor of the Introduction to RNA-seq data analysis with Bioconductor course for LCG-UNAM students.

2025

Instructor of the Introduction to RNA-seq data analysis with Bioconductor course for LCG-UNAM students.
Instructor for 140.776 Statistical Computing course at the Johns Hopkins Bloomberg School of Public Health.

2024

Instructor of the Introduction to RNA-seq data analysis with Bioconductor course for LCG-UNAM students.
Invited instructor for this EMBL-EBI virtual course.
Instructor for a portion of the Statistical Analysis of Genome Scale Data course at Cold Spring Harbor Laboratory.
Instructor for 140.776 Statistical Computing course at the Johns Hopkins Bloomberg School of Public Health.

2023

Instructor of the Introduction to RNA-seq data analysis with Bioconductor course for LCG-UNAM students.
Instructor for a portion of the Statistical Analysis of Genome Scale Data course at Cold Spring Harbor Laboratory.
Instructor for 140.776 Statistical Computing course at the Johns Hopkins Bloomberg School of Public Health.
Invited instructor for CDSB 2023.
Invited instructor for this PROINNOVA-funded course for Centro Universitario Los Altos.

2022

Instructor of the Introduction to RNA-seq data analysis with Bioconductor course for LCG-UNAM students.

2021

Organizer and instructor for the CDSB 2021 workshop: analysis of scRNA-seq data with Bioconductor.
Instructor of the Introduction to RNA-seq data analysis with Bioconductor course for LCG-UNAM students.
Instructor of the Getting started with scRNA-seq analyses with Bioconductor 2-hour workshop for the Human Cell Atlas - Latin America workshop.
Instructor of the Interactive exploration of RNA-seq data with iSEE mini course that is part of the mini courses series organized by CDSB, RMB and NNB-CCG (UNAM).

2020

Instructor for the R/Bioconductor Data Science LIBD bootcamps.
Instructor and member of the Organizing Committee for the CDSB Workshop 2020: Building workflows with RStudio and Bioconductor for single cell RNA-seq analysis.
Instructor of the Analyzing scRNA-seq data with Bioconductor for LCG-EJ-UNAM students.

2019

Instructor and member of the Organizing Committee for the CDSB Workshop 2019: How to Build and Create Tidy Tools.

2018

Keynote speaker and member of the Organizing Committee for the Latin American R/BioConductor Developers Workshop 2018.

2016

Biostatistics and Stata instructor at a workshop for Kandahar University Faculty, organized by Johns Hopkins University.
Invited instructor for the Genomeeting 2016 course taught at INMEGEN, Mexico City, Mexico.

JHBSPH

2015-2016

Teaching assistant and guest lecturer for Introduction to R for Public Health Researchers.
Teaching assistant for Statistical Methods in Public Health I (140.621).
Lead teaching assistant for Statistical Methods in Public Health II (140.622).
Teaching assistant for the MPH capstone project.

2014-2015

Lead teaching assistant for Statistical Methods in Public Health I (140.621) and II (140.622).
Teaching assistant for the MPH capstone project.

2013-2014

Teaching assistant for Statistical Methods in Public Health I (140.621) and II (140.622).
Teaching assistant for the MPH capstone project. Developed a shiny application that allows students to sign up for a TA session (code) and wrote a report of the number of TA sessions available here.

2012-2013

Teaching assistant for Statistical Methods in Public Health I (140.621), II (140.622), III (140.623), and IV (140.624) courses.

UNAM

PDCB

While working at Winter Genomics, I taught two courses for students of the Biomedical Sciences PhD Program (PDCB) at the National Autonomous University of Mexico (UNAM).

Analysis of High-Throughput Sequencing data with Bioconductor Aug-Dec 2010.
Introduction to R and Biostatistics (along with two other teachers).

IBT

While I was at the Institute of Biotechnology (UNAM) working with the Winter Genomics crew, I organized two courses. One was part of a series of bioinformatics and biology mini-courses, and the other involved members of different academic institutions.

Introduction to R for bench biologists Oct-Nov 2009. This mini-course has quite a bit of material on learning how to make plots with R.
Statistical Methods and Analysis of Genomic Data Jan 2010. This one-week course had lectures about Perl, using a cluster, high-throughput technologies, R and Bioconductor, C, and biology overviews.

LCG

I taught three courses during my undergrad stage at the Undergraduate Program on Genomic Sciences (LCG). Each of these courses has its own website organizing the material. These are:

Intensive course on R/Bioconductor Oct-Nov 2008
Principles of Statistics Feb-June 2009
Seminar III: R/Bioconductor Aug-Dec 2009

Leonardo’s curriculum vitae

Download Leonardo’s cv or view it on GitHub.

Contact

If you have questions about the R/Bioconductor packages that Leonardo maintains, please read this post. If you send him an email, he’ll simply refer you to the same blog post.

lcolladotor@gmail.com
+1-301-450-2083
855 N. Wolfe, Room 382, Baltimore, MD 21205
Enter the Rangos building, register at the security front desk, contact Mattie Cox by email or via the phone listed on the LIBD website, take the elevator to the third floor, and register at the LIBD front desk with Mattie.
Book an appointment
DM Leonardo
