This code is part of a presentation for the Human Cell Atlas - Latin America workshop on scRNA-seq data analysis. The code is based on Peter Hickey’s WEHI scRNA-seq workshop which itself is based on the Orchestrating Single Cell Analysis (OSCA) with Bioconductor book and paper.
You can find the Slides that accompany this code on Google Slides.
If you don’t have the latest R release (currently 4.0), you will need to install it following these instructions http://bioconductor.org/install/#install-R. You might also be interested in this video where I show how to install multiple R versions on macOS and winOS.
Yesterday at @LIBDrstats I explained how to install multiple R & @Bioconductor versions on macOS and winOS using
— 🇲🇽 Leonardo Collado-Torres (@lcolladotor) May 1, 2021
* CRAN
* RSwitch https://t.co/uJ023n6ZKq by @hrbrmstr
* @rstudio
Useful now that R 4.1 is in “alpha” & BioC 3.13 is almost ready#rstats
📹 https://t.co/FzfXpp6nrU
Once you have R installed, you can then install Bioconductor.
## Install BiocManager which is the utility for installing
## Bioconductor packages
## For installing Bioconductor packages
if (!requireNamespace("BiocManager", quietly = TRUE))
install.packages("BiocManager")
Once you have BiocManager
installed, you should verify that you are indeed using the latest Bioconductor release, which right now is 3.12.
BiocManager::version()
## [1] '3.12'
Next we can install the R/Bioconductor packages we’ll use later in the session.
## Install required packages
BiocManager::install(
c(
'SingleCellExperiment',
# 'usethis',
# 'here',
'scran',
'scater',
'scRNAseq',
# 'org.Mm.eg.db',
'AnnotationHub',
'ExperimentHub',
# 'BiocFileCache',
# 'DropletUtils',
# 'EnsDb.Hsapiens.v86',
# 'TENxPBMCData',
# 'BiocSingular',
# 'batchelor',
'uwot',
'Rtsne',
# 'pheatmap',
# 'fossil',
# 'ggplot2',
# 'cowplot',
# 'RColorBrewer',
# 'plotly',
# 'iSEE',
'pryr',
'sessioninfo'
)
)
Now we can use the code from https://github.com/lcolladotor/osca_LIIGH_UNAM_2020/blob/master/02-data-infrastructure-and-import.R#L57-L208, which I’ve copy-pasted below. The equivalent version with comments in Spanish is available from https://github.com/ComunidadBioInfo/cdsb2020/blob/master/scRNAseq/02-data-infrastructure-and-import.R#L57-L215.
## ----all_code, cache=TRUE--------------------------------------------------------------------------------------------
library('scRNAseq')
## Loading required package: SingleCellExperiment
## Loading required package: SummarizedExperiment
## Loading required package: MatrixGenerics
## Loading required package: matrixStats
##
## Attaching package: 'MatrixGenerics'
## The following objects are masked from 'package:matrixStats':
##
## colAlls, colAnyNAs, colAnys, colAvgsPerRowSet, colCollapse,
## colCounts, colCummaxs, colCummins, colCumprods, colCumsums,
## colDiffs, colIQRDiffs, colIQRs, colLogSumExps, colMadDiffs,
## colMads, colMaxs, colMeans2, colMedians, colMins, colOrderStats,
## colProds, colQuantiles, colRanges, colRanks, colSdDiffs, colSds,
## colSums2, colTabulates, colVarDiffs, colVars, colWeightedMads,
## colWeightedMeans, colWeightedMedians, colWeightedSds,
## colWeightedVars, rowAlls, rowAnyNAs, rowAnys, rowAvgsPerColSet,
## rowCollapse, rowCounts, rowCummaxs, rowCummins, rowCumprods,
## rowCumsums, rowDiffs, rowIQRDiffs, rowIQRs, rowLogSumExps,
## rowMadDiffs, rowMads, rowMaxs, rowMeans2, rowMedians, rowMins,
## rowOrderStats, rowProds, rowQuantiles, rowRanges, rowRanks,
## rowSdDiffs, rowSds, rowSums2, rowTabulates, rowVarDiffs, rowVars,
## rowWeightedMads, rowWeightedMeans, rowWeightedMedians,
## rowWeightedSds, rowWeightedVars
## Loading required package: GenomicRanges
## Loading required package: stats4
## Loading required package: BiocGenerics
## Loading required package: parallel
##
## Attaching package: 'BiocGenerics'
## The following objects are masked from 'package:parallel':
##
## clusterApply, clusterApplyLB, clusterCall, clusterEvalQ,
## clusterExport, clusterMap, parApply, parCapply, parLapply,
## parLapplyLB, parRapply, parSapply, parSapplyLB
## The following objects are masked from 'package:stats':
##
## IQR, mad, sd, var, xtabs
## The following objects are masked from 'package:base':
##
## anyDuplicated, append, as.data.frame, basename, cbind, colnames,
## dirname, do.call, duplicated, eval, evalq, Filter, Find, get, grep,
## grepl, intersect, is.unsorted, lapply, Map, mapply, match, mget,
## order, paste, pmax, pmax.int, pmin, pmin.int, Position, rank,
## rbind, Reduce, rownames, sapply, setdiff, sort, table, tapply,
## union, unique, unsplit, which.max, which.min
## Loading required package: S4Vectors
##
## Attaching package: 'S4Vectors'
## The following object is masked from 'package:base':
##
## expand.grid
## Loading required package: IRanges
## Loading required package: GenomeInfoDb
## Loading required package: Biobase
## Welcome to Bioconductor
##
## Vignettes contain introductory material; view with
## 'browseVignettes()'. To cite Bioconductor, see
## 'citation("Biobase")', and for packages 'citation("pkgname")'.
##
## Attaching package: 'Biobase'
## The following object is masked from 'package:MatrixGenerics':
##
## rowMedians
## The following objects are masked from 'package:matrixStats':
##
## anyMissing, rowMedians
sce.416b <- LunSpikeInData(which = "416b")
## using temporary cache /var/folders/cx/n9s558kx6fb7jf5z_pgszgb80000gn/T//RtmpD4KAVX/BiocFileCache
## snapshotDate(): 2020-10-27
## see ?scRNAseq and browseVignettes('scRNAseq') for documentation
## downloading 1 resources
## retrieving 1 resource
## loading from cache
## see ?scRNAseq and browseVignettes('scRNAseq') for documentation
## downloading 1 resources
## retrieving 1 resource
## loading from cache
## see ?scRNAseq and browseVignettes('scRNAseq') for documentation
## downloading 1 resources
## retrieving 1 resource
## loading from cache
## snapshotDate(): 2020-10-27
## see ?scRNAseq and browseVignettes('scRNAseq') for documentation
## downloading 1 resources
## retrieving 1 resource
## loading from cache
## snapshotDate(): 2020-10-27
## loading from cache
## require("ensembldb")
# Load the SingleCellExperiment package
library('SingleCellExperiment')
# Extract the count matrix from the 416b dataset
counts.416b <- counts(sce.416b)
# Construct a new SCE from the counts matrix
sce <- SingleCellExperiment(assays = list(counts = counts.416b))
# Inspect the object we just created
sce
## class: SingleCellExperiment
## dim: 46604 192
## metadata(0):
## assays(1): counts
## rownames(46604): ENSMUSG00000102693 ENSMUSG00000064842 ...
## ENSMUSG00000095742 CBFB-MYH11-mcherry
## rowData names(0):
## colnames(192): SLX-9555.N701_S502.C89V9ANXX.s_1.r_1
## SLX-9555.N701_S503.C89V9ANXX.s_1.r_1 ...
## SLX-11312.N712_S508.H5H5YBBXX.s_8.r_1
## SLX-11312.N712_S517.H5H5YBBXX.s_8.r_1
## colData names(0):
## reducedDimNames(0):
## altExpNames(0):
## How big is it?
pryr::object_size(sce)
## Registered S3 method overwritten by 'pryr':
## method from
## print.bytes Rcpp
## 40.1 MB
# Access the counts matrix from the assays slot
# WARNING: This will flood RStudio with output!
# 1. The general method
assay(sce, "counts")[1:6, 1:3]
## SLX-9555.N701_S502.C89V9ANXX.s_1.r_1
## ENSMUSG00000102693 0
## ENSMUSG00000064842 0
## ENSMUSG00000051951 0
## ENSMUSG00000102851 0
## ENSMUSG00000103377 0
## ENSMUSG00000104017 0
## SLX-9555.N701_S503.C89V9ANXX.s_1.r_1
## ENSMUSG00000102693 0
## ENSMUSG00000064842 0
## ENSMUSG00000051951 0
## ENSMUSG00000102851 0
## ENSMUSG00000103377 0
## ENSMUSG00000104017 0
## SLX-9555.N701_S504.C89V9ANXX.s_1.r_1
## ENSMUSG00000102693 0
## ENSMUSG00000064842 0
## ENSMUSG00000051951 0
## ENSMUSG00000102851 0
## ENSMUSG00000103377 0
## ENSMUSG00000104017 0
# 2. The special method for the assay named "counts"
counts(sce)[1:6, 1:3]
## SLX-9555.N701_S502.C89V9ANXX.s_1.r_1
## ENSMUSG00000102693 0
## ENSMUSG00000064842 0
## ENSMUSG00000051951 0
## ENSMUSG00000102851 0
## ENSMUSG00000103377 0
## ENSMUSG00000104017 0
## SLX-9555.N701_S503.C89V9ANXX.s_1.r_1
## ENSMUSG00000102693 0
## ENSMUSG00000064842 0
## ENSMUSG00000051951 0
## ENSMUSG00000102851 0
## ENSMUSG00000103377 0
## ENSMUSG00000104017 0
## SLX-9555.N701_S504.C89V9ANXX.s_1.r_1
## ENSMUSG00000102693 0
## ENSMUSG00000064842 0
## ENSMUSG00000051951 0
## ENSMUSG00000102851 0
## ENSMUSG00000103377 0
## ENSMUSG00000104017 0
sce <- scater::logNormCounts(sce)
# Inspect the object we just updated
sce
## class: SingleCellExperiment
## dim: 46604 192
## metadata(0):
## assays(2): counts logcounts
## rownames(46604): ENSMUSG00000102693 ENSMUSG00000064842 ...
## ENSMUSG00000095742 CBFB-MYH11-mcherry
## rowData names(0):
## colnames(192): SLX-9555.N701_S502.C89V9ANXX.s_1.r_1
## SLX-9555.N701_S503.C89V9ANXX.s_1.r_1 ...
## SLX-11312.N712_S508.H5H5YBBXX.s_8.r_1
## SLX-11312.N712_S517.H5H5YBBXX.s_8.r_1
## colData names(1): sizeFactor
## reducedDimNames(0):
## altExpNames(0):
## How big is it?
pryr::object_size(sce)
## 112 MB
# 1. The general method
assay(sce, "logcounts")[1:6, 1:3]
## SLX-9555.N701_S502.C89V9ANXX.s_1.r_1
## ENSMUSG00000102693 0
## ENSMUSG00000064842 0
## ENSMUSG00000051951 0
## ENSMUSG00000102851 0
## ENSMUSG00000103377 0
## ENSMUSG00000104017 0
## SLX-9555.N701_S503.C89V9ANXX.s_1.r_1
## ENSMUSG00000102693 0
## ENSMUSG00000064842 0
## ENSMUSG00000051951 0
## ENSMUSG00000102851 0
## ENSMUSG00000103377 0
## ENSMUSG00000104017 0
## SLX-9555.N701_S504.C89V9ANXX.s_1.r_1
## ENSMUSG00000102693 0
## ENSMUSG00000064842 0
## ENSMUSG00000051951 0
## ENSMUSG00000102851 0
## ENSMUSG00000103377 0
## ENSMUSG00000104017 0
# 2. The special method for the assay named "logcounts"
logcounts(sce)[1:6, 1:3]
## SLX-9555.N701_S502.C89V9ANXX.s_1.r_1
## ENSMUSG00000102693 0
## ENSMUSG00000064842 0
## ENSMUSG00000051951 0
## ENSMUSG00000102851 0
## ENSMUSG00000103377 0
## ENSMUSG00000104017 0
## SLX-9555.N701_S503.C89V9ANXX.s_1.r_1
## ENSMUSG00000102693 0
## ENSMUSG00000064842 0
## ENSMUSG00000051951 0
## ENSMUSG00000102851 0
## ENSMUSG00000103377 0
## ENSMUSG00000104017 0
## SLX-9555.N701_S504.C89V9ANXX.s_1.r_1
## ENSMUSG00000102693 0
## ENSMUSG00000064842 0
## ENSMUSG00000051951 0
## ENSMUSG00000102851 0
## ENSMUSG00000103377 0
## ENSMUSG00000104017 0
# assign a new entry to assays slot
assay(sce, "counts_100") <- assay(sce, "counts") + 100
# List the assays in the object
assays(sce)
## List of length 3
## names(3): counts logcounts counts_100
assayNames(sce)
## [1] "counts" "logcounts" "counts_100"
## How big is it?
pryr::object_size(sce)
## 183 MB
# Extract the sample metadata from the 416b dataset
colData.416b <- colData(sce.416b)
# Add some of the sample metadata to our SCE
colData(sce) <- colData.416b[, c("phenotype", "block")]
# Inspect the object we just updated
sce
## class: SingleCellExperiment
## dim: 46604 192
## metadata(0):
## assays(3): counts logcounts counts_100
## rownames(46604): ENSMUSG00000102693 ENSMUSG00000064842 ...
## ENSMUSG00000095742 CBFB-MYH11-mcherry
## rowData names(0):
## colnames(192): SLX-9555.N701_S502.C89V9ANXX.s_1.r_1
## SLX-9555.N701_S503.C89V9ANXX.s_1.r_1 ...
## SLX-11312.N712_S508.H5H5YBBXX.s_8.r_1
## SLX-11312.N712_S517.H5H5YBBXX.s_8.r_1
## colData names(2): phenotype block
## reducedDimNames(0):
## altExpNames(0):
# Access the sample metadata from our SCE
colData(sce)
## DataFrame with 192 rows and 2 columns
## phenotype block
## <character> <integer>
## SLX-9555.N701_S502.C89V9ANXX.s_1.r_1 wild type phenotype 20160113
## SLX-9555.N701_S503.C89V9ANXX.s_1.r_1 wild type phenotype 20160113
## SLX-9555.N701_S504.C89V9ANXX.s_1.r_1 wild type phenotype 20160113
## SLX-9555.N701_S505.C89V9ANXX.s_1.r_1 induced CBFB-MYH11 o.. 20160113
## SLX-9555.N701_S506.C89V9ANXX.s_1.r_1 induced CBFB-MYH11 o.. 20160113
## ... ... ...
## SLX-11312.N712_S505.H5H5YBBXX.s_8.r_1 induced CBFB-MYH11 o.. 20160325
## SLX-11312.N712_S506.H5H5YBBXX.s_8.r_1 induced CBFB-MYH11 o.. 20160325
## SLX-11312.N712_S507.H5H5YBBXX.s_8.r_1 induced CBFB-MYH11 o.. 20160325
## SLX-11312.N712_S508.H5H5YBBXX.s_8.r_1 induced CBFB-MYH11 o.. 20160325
## SLX-11312.N712_S517.H5H5YBBXX.s_8.r_1 wild type phenotype 20160325
# Access a specific column of sample metadata from our SCE
table(sce$block)
##
## 20160113 20160325
## 96 96
# Example of function that adds extra fields to colData
sce <- scater::addPerCellQC(sce.416b)
# Access the sample metadata from our updated SCE
colData(sce)
## DataFrame with 192 rows and 18 columns
## Source Name cell line
## <character> <character>
## SLX-9555.N701_S502.C89V9ANXX.s_1.r_1 SLX-9555.N701_S502.C.. 416B
## SLX-9555.N701_S503.C89V9ANXX.s_1.r_1 SLX-9555.N701_S503.C.. 416B
## SLX-9555.N701_S504.C89V9ANXX.s_1.r_1 SLX-9555.N701_S504.C.. 416B
## SLX-9555.N701_S505.C89V9ANXX.s_1.r_1 SLX-9555.N701_S505.C.. 416B
## SLX-9555.N701_S506.C89V9ANXX.s_1.r_1 SLX-9555.N701_S506.C.. 416B
## ... ... ...
## SLX-11312.N712_S505.H5H5YBBXX.s_8.r_1 SLX-11312.N712_S505... 416B
## SLX-11312.N712_S506.H5H5YBBXX.s_8.r_1 SLX-11312.N712_S506... 416B
## SLX-11312.N712_S507.H5H5YBBXX.s_8.r_1 SLX-11312.N712_S507... 416B
## SLX-11312.N712_S508.H5H5YBBXX.s_8.r_1 SLX-11312.N712_S508... 416B
## SLX-11312.N712_S517.H5H5YBBXX.s_8.r_1 SLX-11312.N712_S517... 416B
## cell type
## <character>
## SLX-9555.N701_S502.C89V9ANXX.s_1.r_1 embryonic stem cell
## SLX-9555.N701_S503.C89V9ANXX.s_1.r_1 embryonic stem cell
## SLX-9555.N701_S504.C89V9ANXX.s_1.r_1 embryonic stem cell
## SLX-9555.N701_S505.C89V9ANXX.s_1.r_1 embryonic stem cell
## SLX-9555.N701_S506.C89V9ANXX.s_1.r_1 embryonic stem cell
## ... ...
## SLX-11312.N712_S505.H5H5YBBXX.s_8.r_1 embryonic stem cell
## SLX-11312.N712_S506.H5H5YBBXX.s_8.r_1 embryonic stem cell
## SLX-11312.N712_S507.H5H5YBBXX.s_8.r_1 embryonic stem cell
## SLX-11312.N712_S508.H5H5YBBXX.s_8.r_1 embryonic stem cell
## SLX-11312.N712_S517.H5H5YBBXX.s_8.r_1 embryonic stem cell
## single cell well quality
## <character>
## SLX-9555.N701_S502.C89V9ANXX.s_1.r_1 OK
## SLX-9555.N701_S503.C89V9ANXX.s_1.r_1 OK
## SLX-9555.N701_S504.C89V9ANXX.s_1.r_1 OK
## SLX-9555.N701_S505.C89V9ANXX.s_1.r_1 OK
## SLX-9555.N701_S506.C89V9ANXX.s_1.r_1 OK
## ... ...
## SLX-11312.N712_S505.H5H5YBBXX.s_8.r_1 OK
## SLX-11312.N712_S506.H5H5YBBXX.s_8.r_1 OK
## SLX-11312.N712_S507.H5H5YBBXX.s_8.r_1 OK
## SLX-11312.N712_S508.H5H5YBBXX.s_8.r_1 OK
## SLX-11312.N712_S517.H5H5YBBXX.s_8.r_1 OK
## genotype
## <character>
## SLX-9555.N701_S502.C89V9ANXX.s_1.r_1 Doxycycline-inducibl..
## SLX-9555.N701_S503.C89V9ANXX.s_1.r_1 Doxycycline-inducibl..
## SLX-9555.N701_S504.C89V9ANXX.s_1.r_1 Doxycycline-inducibl..
## SLX-9555.N701_S505.C89V9ANXX.s_1.r_1 Doxycycline-inducibl..
## SLX-9555.N701_S506.C89V9ANXX.s_1.r_1 Doxycycline-inducibl..
## ... ...
## SLX-11312.N712_S505.H5H5YBBXX.s_8.r_1 Doxycycline-inducibl..
## SLX-11312.N712_S506.H5H5YBBXX.s_8.r_1 Doxycycline-inducibl..
## SLX-11312.N712_S507.H5H5YBBXX.s_8.r_1 Doxycycline-inducibl..
## SLX-11312.N712_S508.H5H5YBBXX.s_8.r_1 Doxycycline-inducibl..
## SLX-11312.N712_S517.H5H5YBBXX.s_8.r_1 Doxycycline-inducibl..
## phenotype strain
## <character> <character>
## SLX-9555.N701_S502.C89V9ANXX.s_1.r_1 wild type phenotype B6D2F1-J
## SLX-9555.N701_S503.C89V9ANXX.s_1.r_1 wild type phenotype B6D2F1-J
## SLX-9555.N701_S504.C89V9ANXX.s_1.r_1 wild type phenotype B6D2F1-J
## SLX-9555.N701_S505.C89V9ANXX.s_1.r_1 induced CBFB-MYH11 o.. B6D2F1-J
## SLX-9555.N701_S506.C89V9ANXX.s_1.r_1 induced CBFB-MYH11 o.. B6D2F1-J
## ... ... ...
## SLX-11312.N712_S505.H5H5YBBXX.s_8.r_1 induced CBFB-MYH11 o.. B6D2F1-J
## SLX-11312.N712_S506.H5H5YBBXX.s_8.r_1 induced CBFB-MYH11 o.. B6D2F1-J
## SLX-11312.N712_S507.H5H5YBBXX.s_8.r_1 induced CBFB-MYH11 o.. B6D2F1-J
## SLX-11312.N712_S508.H5H5YBBXX.s_8.r_1 induced CBFB-MYH11 o.. B6D2F1-J
## SLX-11312.N712_S517.H5H5YBBXX.s_8.r_1 wild type phenotype B6D2F1-J
## spike-in addition block sum
## <character> <integer> <numeric>
## SLX-9555.N701_S502.C89V9ANXX.s_1.r_1 ERCC+SIRV 20160113 865936
## SLX-9555.N701_S503.C89V9ANXX.s_1.r_1 ERCC+SIRV 20160113 1076277
## SLX-9555.N701_S504.C89V9ANXX.s_1.r_1 ERCC+SIRV 20160113 1180138
## SLX-9555.N701_S505.C89V9ANXX.s_1.r_1 ERCC+SIRV 20160113 1342593
## SLX-9555.N701_S506.C89V9ANXX.s_1.r_1 ERCC+SIRV 20160113 1668311
## ... ... ... ...
## SLX-11312.N712_S505.H5H5YBBXX.s_8.r_1 Premixed 20160325 776622
## SLX-11312.N712_S506.H5H5YBBXX.s_8.r_1 Premixed 20160325 1299950
## SLX-11312.N712_S507.H5H5YBBXX.s_8.r_1 Premixed 20160325 1800696
## SLX-11312.N712_S508.H5H5YBBXX.s_8.r_1 Premixed 20160325 46731
## SLX-11312.N712_S517.H5H5YBBXX.s_8.r_1 Premixed 20160325 1866692
## detected altexps_ERCC_sum
## <numeric> <numeric>
## SLX-9555.N701_S502.C89V9ANXX.s_1.r_1 7618 65278
## SLX-9555.N701_S503.C89V9ANXX.s_1.r_1 7521 74748
## SLX-9555.N701_S504.C89V9ANXX.s_1.r_1 8306 60878
## SLX-9555.N701_S505.C89V9ANXX.s_1.r_1 8143 60073
## SLX-9555.N701_S506.C89V9ANXX.s_1.r_1 7154 136810
## ... ... ...
## SLX-11312.N712_S505.H5H5YBBXX.s_8.r_1 8174 61575
## SLX-11312.N712_S506.H5H5YBBXX.s_8.r_1 8956 94982
## SLX-11312.N712_S507.H5H5YBBXX.s_8.r_1 9530 113707
## SLX-11312.N712_S508.H5H5YBBXX.s_8.r_1 6649 7580
## SLX-11312.N712_S517.H5H5YBBXX.s_8.r_1 10964 48664
## altexps_ERCC_detected
## <numeric>
## SLX-9555.N701_S502.C89V9ANXX.s_1.r_1 39
## SLX-9555.N701_S503.C89V9ANXX.s_1.r_1 40
## SLX-9555.N701_S504.C89V9ANXX.s_1.r_1 42
## SLX-9555.N701_S505.C89V9ANXX.s_1.r_1 42
## SLX-9555.N701_S506.C89V9ANXX.s_1.r_1 44
## ... ...
## SLX-11312.N712_S505.H5H5YBBXX.s_8.r_1 39
## SLX-11312.N712_S506.H5H5YBBXX.s_8.r_1 41
## SLX-11312.N712_S507.H5H5YBBXX.s_8.r_1 40
## SLX-11312.N712_S508.H5H5YBBXX.s_8.r_1 44
## SLX-11312.N712_S517.H5H5YBBXX.s_8.r_1 39
## altexps_ERCC_percent altexps_SIRV_sum
## <numeric> <numeric>
## SLX-9555.N701_S502.C89V9ANXX.s_1.r_1 6.80658 27828
## SLX-9555.N701_S503.C89V9ANXX.s_1.r_1 6.28030 39173
## SLX-9555.N701_S504.C89V9ANXX.s_1.r_1 4.78949 30058
## SLX-9555.N701_S505.C89V9ANXX.s_1.r_1 4.18567 32542
## SLX-9555.N701_S506.C89V9ANXX.s_1.r_1 7.28887 71850
## ... ... ...
## SLX-11312.N712_S505.H5H5YBBXX.s_8.r_1 7.17620 19848
## SLX-11312.N712_S506.H5H5YBBXX.s_8.r_1 6.65764 31729
## SLX-11312.N712_S507.H5H5YBBXX.s_8.r_1 5.81467 41116
## SLX-11312.N712_S508.H5H5YBBXX.s_8.r_1 13.48898 1883
## SLX-11312.N712_S517.H5H5YBBXX.s_8.r_1 2.51930 16289
## altexps_SIRV_detected
## <numeric>
## SLX-9555.N701_S502.C89V9ANXX.s_1.r_1 7
## SLX-9555.N701_S503.C89V9ANXX.s_1.r_1 7
## SLX-9555.N701_S504.C89V9ANXX.s_1.r_1 7
## SLX-9555.N701_S505.C89V9ANXX.s_1.r_1 7
## SLX-9555.N701_S506.C89V9ANXX.s_1.r_1 7
## ... ...
## SLX-11312.N712_S505.H5H5YBBXX.s_8.r_1 7
## SLX-11312.N712_S506.H5H5YBBXX.s_8.r_1 7
## SLX-11312.N712_S507.H5H5YBBXX.s_8.r_1 7
## SLX-11312.N712_S508.H5H5YBBXX.s_8.r_1 7
## SLX-11312.N712_S517.H5H5YBBXX.s_8.r_1 7
## altexps_SIRV_percent total
## <numeric> <numeric>
## SLX-9555.N701_S502.C89V9ANXX.s_1.r_1 2.90165 959042
## SLX-9555.N701_S503.C89V9ANXX.s_1.r_1 3.29130 1190198
## SLX-9555.N701_S504.C89V9ANXX.s_1.r_1 2.36477 1271074
## SLX-9555.N701_S505.C89V9ANXX.s_1.r_1 2.26741 1435208
## SLX-9555.N701_S506.C89V9ANXX.s_1.r_1 3.82798 1876971
## ... ... ...
## SLX-11312.N712_S505.H5H5YBBXX.s_8.r_1 2.313165 858045
## SLX-11312.N712_S506.H5H5YBBXX.s_8.r_1 2.224004 1426661
## SLX-11312.N712_S507.H5H5YBBXX.s_8.r_1 2.102562 1955519
## SLX-11312.N712_S508.H5H5YBBXX.s_8.r_1 3.350892 56194
## SLX-11312.N712_S517.H5H5YBBXX.s_8.r_1 0.843271 1931645
# Inspect the object we just updated
sce
## class: SingleCellExperiment
## dim: 46604 192
## metadata(0):
## assays(1): counts
## rownames(46604): ENSMUSG00000102693 ENSMUSG00000064842 ...
## ENSMUSG00000095742 CBFB-MYH11-mcherry
## rowData names(1): Length
## colnames(192): SLX-9555.N701_S502.C89V9ANXX.s_1.r_1
## SLX-9555.N701_S503.C89V9ANXX.s_1.r_1 ...
## SLX-11312.N712_S508.H5H5YBBXX.s_8.r_1
## SLX-11312.N712_S517.H5H5YBBXX.s_8.r_1
## colData names(18): Source Name cell line ... altexps_SIRV_percent total
## reducedDimNames(0):
## altExpNames(2): ERCC SIRV
## How big is it?
pryr::object_size(sce)
## 41.4 MB
## Add the lognorm counts again
sce <- scater::logNormCounts(sce)
## How big is it?
pryr::object_size(sce)
## 113 MB
# E.g., subset data to just wild type cells
# Remember, cells are columns of the SCE
sce[, sce$phenotype == "wild type phenotype"]
## class: SingleCellExperiment
## dim: 46604 96
## metadata(0):
## assays(2): counts logcounts
## rownames(46604): ENSMUSG00000102693 ENSMUSG00000064842 ...
## ENSMUSG00000095742 CBFB-MYH11-mcherry
## rowData names(1): Length
## colnames(96): SLX-9555.N701_S502.C89V9ANXX.s_1.r_1
## SLX-9555.N701_S503.C89V9ANXX.s_1.r_1 ...
## SLX-11312.N712_S504.H5H5YBBXX.s_8.r_1
## SLX-11312.N712_S517.H5H5YBBXX.s_8.r_1
## colData names(19): Source Name cell line ... total sizeFactor
## reducedDimNames(0):
## altExpNames(2): ERCC SIRV
# Access the feature metadata from our SCE
# It's currently empty!
rowData(sce)
## DataFrame with 46604 rows and 1 column
## Length
## <integer>
## ENSMUSG00000102693 1070
## ENSMUSG00000064842 110
## ENSMUSG00000051951 6094
## ENSMUSG00000102851 480
## ENSMUSG00000103377 2819
## ... ...
## ENSMUSG00000094621 121
## ENSMUSG00000098647 99
## ENSMUSG00000096730 3077
## ENSMUSG00000095742 243
## CBFB-MYH11-mcherry 2998
# Example of function that adds extra fields to rowData
sce <- scater::addPerFeatureQC(sce)
# Access the feature metadata from our updated SCE
rowData(sce)
## DataFrame with 46604 rows and 3 columns
## Length mean detected
## <integer> <numeric> <numeric>
## ENSMUSG00000102693 1070 0.0000000 0.000000
## ENSMUSG00000064842 110 0.0000000 0.000000
## ENSMUSG00000051951 6094 0.0000000 0.000000
## ENSMUSG00000102851 480 0.0000000 0.000000
## ENSMUSG00000103377 2819 0.0104167 0.520833
## ... ... ... ...
## ENSMUSG00000094621 121 0.0 0
## ENSMUSG00000098647 99 0.0 0
## ENSMUSG00000096730 3077 0.0 0
## ENSMUSG00000095742 243 0.0 0
## CBFB-MYH11-mcherry 2998 50375.7 100
## How big is it?
pryr::object_size(sce)
## 114 MB
# Download the relevant Ensembl annotation database
# using AnnotationHub resources
library('AnnotationHub')
## Loading required package: BiocFileCache
## Loading required package: dbplyr
##
## Attaching package: 'AnnotationHub'
## The following object is masked from 'package:Biobase':
##
## cache
ah <- AnnotationHub()
## snapshotDate(): 2020-10-27
query(ah, c("Mus musculus", "Ensembl", "v97"))
## AnnotationHub with 1 record
## # snapshotDate(): 2020-10-27
## # names(): AH73905
## # $dataprovider: Ensembl
## # $species: Mus musculus
## # $rdataclass: EnsDb
## # $rdatadateadded: 2019-05-02
## # $title: Ensembl 97 EnsDb for Mus musculus
## # $description: Gene and protein annotations for Mus musculus based on Ensem...
## # $taxonomyid: 10090
## # $genome: GRCm38
## # $sourcetype: ensembl
## # $sourceurl: http://www.ensembl.org
## # $sourcesize: NA
## # $tags: c("97", "AHEnsDbs", "Annotation", "EnsDb", "Ensembl", "Gene",
## # "Protein", "Transcript")
## # retrieve record with 'object[["AH73905"]]'
# Annotate each gene with its chromosome location
ensdb <- ah[["AH73905"]]
## loading from cache
chromosome <- mapIds(ensdb,
keys = rownames(sce),
keytype = "GENEID",
column = "SEQNAME")
## Warning: Unable to map 563 of 46604 requested IDs.
rowData(sce)$chromosome <- chromosome
# Access the feature metadata from our updated SCE
rowData(sce)
## DataFrame with 46604 rows and 4 columns
## Length mean detected chromosome
## <integer> <numeric> <numeric> <character>
## ENSMUSG00000102693 1070 0.0000000 0.000000 1
## ENSMUSG00000064842 110 0.0000000 0.000000 1
## ENSMUSG00000051951 6094 0.0000000 0.000000 1
## ENSMUSG00000102851 480 0.0000000 0.000000 1
## ENSMUSG00000103377 2819 0.0104167 0.520833 1
## ... ... ... ... ...
## ENSMUSG00000094621 121 0.0 0 GL456372.1
## ENSMUSG00000098647 99 0.0 0 GL456381.1
## ENSMUSG00000096730 3077 0.0 0 JH584292.1
## ENSMUSG00000095742 243 0.0 0 JH584295.1
## CBFB-MYH11-mcherry 2998 50375.7 100 NA
## How big is it?
pryr::object_size(sce)
## 114 MB
# E.g., subset data to just genes on chromosome 3
# NOTE: which() needed to cope with NA chromosome names
sce[which(rowData(sce)$chromosome == "3"), ]
## class: SingleCellExperiment
## dim: 2876 192
## metadata(0):
## assays(2): counts logcounts
## rownames(2876): ENSMUSG00000098982 ENSMUSG00000098307 ...
## ENSMUSG00000105990 ENSMUSG00000075903
## rowData names(4): Length mean detected chromosome
## colnames(192): SLX-9555.N701_S502.C89V9ANXX.s_1.r_1
## SLX-9555.N701_S503.C89V9ANXX.s_1.r_1 ...
## SLX-11312.N712_S508.H5H5YBBXX.s_8.r_1
## SLX-11312.N712_S517.H5H5YBBXX.s_8.r_1
## colData names(19): Source Name cell line ... total sizeFactor
## reducedDimNames(0):
## altExpNames(2): ERCC SIRV
# Access the metadata from our SCE
# It's currently empty!
metadata(sce)
## list()
# The metadata slot is Vegas - anything goes
metadata(sce) <- list(favourite_genes = c("Shh", "Nck1", "Diablo"),
analyst = c("Pete"))
# Access the metadata from our updated SCE
metadata(sce)
## $favourite_genes
## [1] "Shh" "Nck1" "Diablo"
##
## $analyst
## [1] "Pete"
# E.g., add the PCA of logcounts
# NOTE: We'll learn more about PCA later
sce <- scater::runPCA(sce)
# Inspect the object we just updated
sce
## class: SingleCellExperiment
## dim: 46604 192
## metadata(2): favourite_genes analyst
## assays(2): counts logcounts
## rownames(46604): ENSMUSG00000102693 ENSMUSG00000064842 ...
## ENSMUSG00000095742 CBFB-MYH11-mcherry
## rowData names(4): Length mean detected chromosome
## colnames(192): SLX-9555.N701_S502.C89V9ANXX.s_1.r_1
## SLX-9555.N701_S503.C89V9ANXX.s_1.r_1 ...
## SLX-11312.N712_S508.H5H5YBBXX.s_8.r_1
## SLX-11312.N712_S517.H5H5YBBXX.s_8.r_1
## colData names(19): Source Name cell line ... total sizeFactor
## reducedDimNames(1): PCA
## altExpNames(2): ERCC SIRV
# Access the PCA matrix from the reducedDims slot
reducedDim(sce, "PCA")[1:6, 1:3]
## PC1 PC2 PC3
## SLX-9555.N701_S502.C89V9ANXX.s_1.r_1 18.717668 27.598132 -5.939654
## SLX-9555.N701_S503.C89V9ANXX.s_1.r_1 2.480705 27.564583 -4.916567
## SLX-9555.N701_S504.C89V9ANXX.s_1.r_1 42.034018 7.552435 -12.126964
## SLX-9555.N701_S505.C89V9ANXX.s_1.r_1 -8.494303 -31.833727 -15.760853
## SLX-9555.N701_S506.C89V9ANXX.s_1.r_1 -49.737390 -4.226795 -6.123169
## SLX-9555.N701_S507.C89V9ANXX.s_1.r_1 -44.528081 3.215503 -10.384939
# E.g., add a t-SNE representation of logcounts
# NOTE: We'll learn more about t-SNE later
sce <- scater::runTSNE(sce)
# Inspect the object we just updated
sce
## class: SingleCellExperiment
## dim: 46604 192
## metadata(2): favourite_genes analyst
## assays(2): counts logcounts
## rownames(46604): ENSMUSG00000102693 ENSMUSG00000064842 ...
## ENSMUSG00000095742 CBFB-MYH11-mcherry
## rowData names(4): Length mean detected chromosome
## colnames(192): SLX-9555.N701_S502.C89V9ANXX.s_1.r_1
## SLX-9555.N701_S503.C89V9ANXX.s_1.r_1 ...
## SLX-11312.N712_S508.H5H5YBBXX.s_8.r_1
## SLX-11312.N712_S517.H5H5YBBXX.s_8.r_1
## colData names(19): Source Name cell line ... total sizeFactor
## reducedDimNames(2): PCA TSNE
## altExpNames(2): ERCC SIRV
# Access the t-SNE matrix from the reducedDims slot
head(reducedDim(sce, "TSNE"))
## [,1] [,2]
## SLX-9555.N701_S502.C89V9ANXX.s_1.r_1 6.117248 2.3252320
## SLX-9555.N701_S503.C89V9ANXX.s_1.r_1 3.223016 -0.7115595
## SLX-9555.N701_S504.C89V9ANXX.s_1.r_1 6.331325 4.9120344
## SLX-9555.N701_S505.C89V9ANXX.s_1.r_1 -1.653247 4.0296996
## SLX-9555.N701_S506.C89V9ANXX.s_1.r_1 -4.892326 -7.0317003
## SLX-9555.N701_S507.C89V9ANXX.s_1.r_1 -3.711729 -7.6197516
# E.g., add a 'manual' UMAP representation of logcounts
# NOTE: We'll learn more about UMAP later and a
# simpler way to compute it.
u <- uwot::umap(t(logcounts(sce)), n_components = 2)
# Add the UMAP matrix to the reducedDims slot
# Access the UMAP matrix from the reducedDims slot
reducedDim(sce, "UMAP") <- u
# List the dimensionality reduction results stored in # the object
reducedDims(sce)
## List of length 3
## names(3): PCA TSNE UMAP
Below I copy-pasted the code from https://github.com/lcolladotor/osca_LIIGH_UNAM_2020/blob/master/03-quality-control.R#L57-L226. The equivalent version with comments in Spanish is available from https://github.com/ComunidadBioInfo/cdsb2020/blob/master/scRNAseq/03-quality-control.R#L57-L233.
## ----all_code, cache=TRUE--------------------------------------------------------------------------------------------
## Data
library('scRNAseq')
sce.416b <- LunSpikeInData(which = "416b")
## snapshotDate(): 2020-10-27
## see ?scRNAseq and browseVignettes('scRNAseq') for documentation
## loading from cache
## see ?scRNAseq and browseVignettes('scRNAseq') for documentation
## loading from cache
## see ?scRNAseq and browseVignettes('scRNAseq') for documentation
## loading from cache
## snapshotDate(): 2020-10-27
## see ?scRNAseq and browseVignettes('scRNAseq') for documentation
## loading from cache
## snapshotDate(): 2020-10-27
## loading from cache
sce.416b$block <- factor(sce.416b$block)
# Download the relevant Ensembl annotation database
# using AnnotationHub resources
library('AnnotationHub')
ah <- AnnotationHub()
## snapshotDate(): 2020-10-27
query(ah, c("Mus musculus", "Ensembl", "v97"))
## AnnotationHub with 1 record
## # snapshotDate(): 2020-10-27
## # names(): AH73905
## # $dataprovider: Ensembl
## # $species: Mus musculus
## # $rdataclass: EnsDb
## # $rdatadateadded: 2019-05-02
## # $title: Ensembl 97 EnsDb for Mus musculus
## # $description: Gene and protein annotations for Mus musculus based on Ensem...
## # $taxonomyid: 10090
## # $genome: GRCm38
## # $sourcetype: ensembl
## # $sourceurl: http://www.ensembl.org
## # $sourcesize: NA
## # $tags: c("97", "AHEnsDbs", "Annotation", "EnsDb", "Ensembl", "Gene",
## # "Protein", "Transcript")
## # retrieve record with 'object[["AH73905"]]'
# Annotate each gene with its chromosome location
ens.mm.v97 <- ah[["AH73905"]]
## loading from cache
location <- mapIds(
ens.mm.v97,
keys = rownames(sce.416b),
keytype = "GENEID",
column = "SEQNAME"
)
## Warning: Unable to map 563 of 46604 requested IDs.
# Identify the mitochondrial genes
is.mito <- which(location == "MT")
library('scater')
## Loading required package: ggplot2
sce.416b <- addPerCellQC(sce.416b,
subsets = list(Mito = is.mito))
## ----qc_metrics, cache=TRUE, dependson='all_code'--------------------------------------------------------------------
plotColData(sce.416b, x = "block", y = "detected")
plotColData(sce.416b, x = "block", y = "detected") +
scale_y_log10()
plotColData(sce.416b,
x = "block",
y = "detected",
other_fields = "phenotype") +
scale_y_log10() +
facet_wrap( ~ phenotype)
## ----all_code_part2, cache = TRUE, dependson='all_code'--------------------------------------------------------------
# Example thresholds
qc.lib <- sce.416b$sum < 100000
qc.nexprs <- sce.416b$detected < 5000
qc.spike <- sce.416b$altexps_ERCC_percent > 10
qc.mito <- sce.416b$subsets_Mito_percent > 10
discard <- qc.lib | qc.nexprs | qc.spike | qc.mito
# Summarize the number of cells removed for each reason
DataFrame(
LibSize = sum(qc.lib),
NExprs = sum(qc.nexprs),
SpikeProp = sum(qc.spike),
MitoProp = sum(qc.mito),
Total = sum(discard)
)
## DataFrame with 1 row and 5 columns
## LibSize NExprs SpikeProp MitoProp Total
## <integer> <integer> <integer> <integer> <integer>
## 1 3 0 19 14 33
qc.lib2 <- isOutlier(sce.416b$sum, log = TRUE, type = "lower")
qc.nexprs2 <- isOutlier(sce.416b$detected, log = TRUE,
type = "lower")
qc.spike2 <- isOutlier(sce.416b$altexps_ERCC_percent,
type = "higher")
qc.mito2 <- isOutlier(sce.416b$subsets_Mito_percent,
type = "higher")
discard2 <- qc.lib2 | qc.nexprs2 | qc.spike2 | qc.mito2
# Extract the thresholds
attr(qc.lib2, "thresholds")
## lower higher
## 434082.9 Inf
attr(qc.nexprs2, "thresholds")
## lower higher
## 5231.468 Inf
# Summarize the number of cells removed for each reason.
DataFrame(
LibSize = sum(qc.lib2),
NExprs = sum(qc.nexprs2),
SpikeProp = sum(qc.spike2),
MitoProp = sum(qc.mito2),
Total = sum(discard2)
)
## DataFrame with 1 row and 5 columns
## LibSize NExprs SpikeProp MitoProp Total
## <integer> <integer> <integer> <integer> <integer>
## 1 4 0 1 2 6
## More checks
plotColData(sce.416b,
x = "block",
y = "detected",
other_fields = "phenotype") +
scale_y_log10() +
facet_wrap( ~ phenotype)
batch <- paste0(sce.416b$phenotype, "-", sce.416b$block)
qc.lib3 <- isOutlier(sce.416b$sum,
log = TRUE,
type = "lower",
batch = batch)
qc.nexprs3 <- isOutlier(sce.416b$detected,
log = TRUE,
type = "lower",
batch = batch)
qc.spike3 <- isOutlier(sce.416b$altexps_ERCC_percent,
type = "higher",
batch = batch)
qc.mito3 <- isOutlier(sce.416b$subsets_Mito_percent,
type = "higher",
batch = batch)
discard3 <- qc.lib3 | qc.nexprs3 | qc.spike3 | qc.mito3
# Extract the thresholds
attr(qc.lib3, "thresholds")
## induced CBFB-MYH11 oncogene expression-20160113
## lower 461073.1
## higher Inf
## induced CBFB-MYH11 oncogene expression-20160325
## lower 399133.7
## higher Inf
## wild type phenotype-20160113 wild type phenotype-20160325
## lower 599794.9 370316.5
## higher Inf Inf
attr(qc.nexprs3, "thresholds")
## induced CBFB-MYH11 oncogene expression-20160113
## lower 5399.24
## higher Inf
## induced CBFB-MYH11 oncogene expression-20160325
## lower 6519.74
## higher Inf
## wild type phenotype-20160113 wild type phenotype-20160325
## lower 7215.887 7586.402
## higher Inf Inf
# Summarize the number of cells removed for each reason
DataFrame(
LibSize = sum(qc.lib3),
NExprs = sum(qc.nexprs3),
SpikeProp = sum(qc.spike3),
MitoProp = sum(qc.mito3),
Total = sum(discard3)
)
## DataFrame with 1 row and 5 columns
## LibSize NExprs SpikeProp MitoProp Total
## <integer> <integer> <integer> <integer> <integer>
## 1 5 4 6 2 9
## ----use_case, cache=TRUE, dependson= c('all_code', 'all_code_part2')------------------------------------------------
sce.grun <- GrunPancreasData()
## snapshotDate(): 2020-10-27
## see ?scRNAseq and browseVignettes('scRNAseq') for documentation
## downloading 1 resources
## retrieving 1 resource
## loading from cache
## snapshotDate(): 2020-10-27
## see ?scRNAseq and browseVignettes('scRNAseq') for documentation
## loading from cache
sce.grun <- addPerCellQC(sce.grun)
plotColData(sce.grun, x = "donor", y = "altexps_ERCC_percent")
## Warning: Removed 10 rows containing non-finite values (stat_ydensity).
## Warning: Removed 10 rows containing missing values (position_quasirandom).
discard.ercc <- isOutlier(sce.grun$altexps_ERCC_percent,
type = "higher",
batch = sce.grun$donor)
## Warning in .get_med_and_mad(metric, batch = batch, subset = subset,
## share.medians = share.medians, : missing values ignored during outlier detection
discard.ercc2 <- isOutlier(
sce.grun$altexps_ERCC_percent,
type = "higher",
batch = sce.grun$donor,
subset = sce.grun$donor %in% c("D17", "D2", "D7")
)
## Warning in .get_med_and_mad(metric, batch = batch, subset = subset,
## share.medians = share.medians, : missing values ignored during outlier detection
plotColData(
sce.grun,
x = "donor",
y = "altexps_ERCC_percent",
colour_by = data.frame(discard = discard.ercc)
)
## Warning: Removed 10 rows containing non-finite values (stat_ydensity).
## Warning: Removed 10 rows containing missing values (position_quasirandom).
plotColData(
sce.grun,
x = "donor",
y = "altexps_ERCC_percent",
colour_by = data.frame(discard = discard.ercc2)
)
## Warning: Removed 10 rows containing non-finite values (stat_ydensity).
## Warning: Removed 10 rows containing missing values (position_quasirandom).
# Add info about which cells are outliers
sce.416b$discard <- discard2
# Look at this plot for each QC metric
plotColData(
sce.416b,
x = "block",
y = "sum",
colour_by = "discard",
other_fields = "phenotype"
) +
facet_wrap( ~ phenotype) +
scale_y_log10()
# Another useful diagnostic plot
plotColData(
sce.416b,
x = "sum",
y = "subsets_Mito_percent",
colour_by = "discard",
other_fields = c("block", "phenotype")
) +
facet_grid(block ~ phenotype)
Below is the R session information used for making this document.
## Reproducibility information
print('Reproducibility information:')
## [1] "Reproducibility information:"
Sys.time()
## [1] "2021-05-02 15:02:20 EDT"
proc.time()
## user system elapsed
## 112.925 9.019 176.469
options(width = 120)
sessioninfo::session_info()
## ─ Session info ───────────────────────────────────────────────────────────────────────────────────────────────────────
## setting value
## version R version 4.0.5 Patched (2021-03-31 r80247)
## os macOS Big Sur 10.16
## system x86_64, darwin17.0
## ui X11
## language (EN)
## collate en_US.UTF-8
## ctype en_US.UTF-8
## tz America/New_York
## date 2021-05-02
##
## ─ Packages ───────────────────────────────────────────────────────────────────────────────────────────────────────────
## package * version date lib source
## AnnotationDbi * 1.52.0 2020-10-27 [1] Bioconductor
## AnnotationFilter * 1.14.0 2020-10-27 [1] Bioconductor
## AnnotationHub * 2.22.1 2021-04-16 [1] Bioconductor
## askpass 1.1 2019-01-13 [1] CRAN (R 4.0.2)
## assertthat 0.2.1 2019-03-21 [1] CRAN (R 4.0.2)
## beachmat 2.6.4 2020-12-20 [1] Bioconductor
## beeswarm 0.3.1 2021-03-07 [1] CRAN (R 4.0.2)
## Biobase * 2.50.0 2020-10-27 [1] Bioconductor
## BiocFileCache * 1.14.0 2020-10-27 [1] Bioconductor
## BiocGenerics * 0.36.1 2021-04-16 [1] Bioconductor
## BiocManager 1.30.12 2021-03-28 [1] CRAN (R 4.0.2)
## BiocNeighbors 1.8.2 2020-12-07 [1] Bioconductor
## BiocParallel 1.24.1 2020-11-06 [1] Bioconductor
## BiocSingular 1.6.0 2020-10-27 [1] Bioconductor
## BiocStyle * 2.18.1 2020-11-24 [1] Bioconductor
## BiocVersion 3.12.0 2020-05-14 [1] Bioconductor
## biomaRt 2.46.3 2021-02-09 [1] Bioconductor
## Biostrings 2.58.0 2020-10-27 [1] Bioconductor
## bit 4.0.4 2020-08-04 [1] CRAN (R 4.0.2)
## bit64 4.0.5 2020-08-30 [1] CRAN (R 4.0.2)
## bitops 1.0-7 2021-04-24 [1] CRAN (R 4.0.2)
## blob 1.2.1 2020-01-20 [1] CRAN (R 4.0.2)
## bookdown 0.22 2021-04-22 [1] CRAN (R 4.0.2)
## bslib 0.2.4 2021-01-25 [1] CRAN (R 4.0.3)
## cachem 1.0.4 2021-02-13 [1] CRAN (R 4.0.2)
## cli 2.5.0 2021-04-26 [1] CRAN (R 4.0.5)
## codetools 0.2-18 2020-11-04 [1] CRAN (R 4.0.5)
## colorout 1.2-2 2020-11-03 [1] Github (jalvesaq/colorout@726d681)
## colorspace 2.0-0 2020-11-11 [1] CRAN (R 4.0.2)
## cowplot 1.1.1 2020-12-30 [1] CRAN (R 4.0.2)
## crayon 1.4.1 2021-02-08 [1] CRAN (R 4.0.2)
## curl 4.3.1 2021-04-30 [1] CRAN (R 4.0.2)
## DBI 1.1.1 2021-01-15 [1] CRAN (R 4.0.2)
## dbplyr * 2.1.1 2021-04-06 [1] CRAN (R 4.0.2)
## DelayedArray 0.16.3 2021-03-24 [1] Bioconductor
## DelayedMatrixStats 1.12.3 2021-02-03 [1] Bioconductor
## digest 0.6.27 2020-10-24 [1] CRAN (R 4.0.2)
## dplyr 1.0.5 2021-03-05 [1] CRAN (R 4.0.2)
## ellipsis 0.3.2 2021-04-29 [1] CRAN (R 4.0.2)
## ensembldb * 2.14.1 2021-04-19 [1] Bioconductor
## evaluate 0.14 2019-05-28 [1] CRAN (R 4.0.1)
## ExperimentHub 1.16.1 2021-04-16 [1] Bioconductor
## fansi 0.4.2 2021-01-15 [1] CRAN (R 4.0.2)
## farver 2.1.0 2021-02-28 [1] CRAN (R 4.0.2)
## fastmap 1.1.0 2021-01-25 [1] CRAN (R 4.0.2)
## FNN 1.1.3 2019-02-15 [1] CRAN (R 4.0.2)
## generics 0.1.0 2020-10-31 [1] CRAN (R 4.0.2)
## GenomeInfoDb * 1.26.7 2021-04-08 [1] Bioconductor
## GenomeInfoDbData 1.2.4 2020-11-03 [1] Bioconductor
## GenomicAlignments 1.26.0 2020-10-27 [1] Bioconductor
## GenomicFeatures * 1.42.3 2021-04-04 [1] Bioconductor
## GenomicRanges * 1.42.0 2020-10-27 [1] Bioconductor
## ggbeeswarm 0.6.0 2017-08-07 [1] CRAN (R 4.0.2)
## ggplot2 * 3.3.3 2020-12-30 [1] CRAN (R 4.0.2)
## glue 1.4.2 2020-08-27 [1] CRAN (R 4.0.2)
## gridExtra 2.3 2017-09-09 [1] CRAN (R 4.0.2)
## gtable 0.3.0 2019-03-25 [1] CRAN (R 4.0.2)
## highr 0.9 2021-04-16 [1] CRAN (R 4.0.2)
## hms 1.0.0 2021-01-13 [1] CRAN (R 4.0.2)
## htmltools 0.5.1.1 2021-01-22 [1] CRAN (R 4.0.2)
## httpuv 1.6.0 2021-04-23 [1] CRAN (R 4.0.2)
## httr 1.4.2 2020-07-20 [1] CRAN (R 4.0.2)
## interactiveDisplayBase 1.28.0 2020-10-27 [1] Bioconductor
## IRanges * 2.24.1 2020-12-12 [1] Bioconductor
## irlba 2.3.3 2019-02-05 [1] CRAN (R 4.0.2)
## jquerylib 0.1.4 2021-04-26 [1] CRAN (R 4.0.5)
## jsonlite 1.7.2 2020-12-09 [1] CRAN (R 4.0.2)
## knitr 1.33 2021-04-24 [1] CRAN (R 4.0.2)
## labeling 0.4.2 2020-10-20 [1] CRAN (R 4.0.2)
## later 1.2.0 2021-04-23 [1] CRAN (R 4.0.2)
## lattice 0.20-44 2021-05-02 [1] CRAN (R 4.0.5)
## lazyeval 0.2.2 2019-03-15 [1] CRAN (R 4.0.2)
## lifecycle 1.0.0 2021-02-15 [1] CRAN (R 4.0.2)
## magrittr 2.0.1 2020-11-17 [1] CRAN (R 4.0.2)
## Matrix 1.3-2 2021-01-06 [1] CRAN (R 4.0.5)
## MatrixGenerics * 1.2.1 2021-01-30 [1] Bioconductor
## matrixStats * 0.58.0 2021-01-29 [1] CRAN (R 4.0.2)
## memoise 2.0.0 2021-01-26 [1] CRAN (R 4.0.3)
## mime 0.10 2021-02-13 [1] CRAN (R 4.0.2)
## munsell 0.5.0 2018-06-12 [1] CRAN (R 4.0.2)
## openssl 1.4.4 2021-04-30 [1] CRAN (R 4.0.2)
## pillar 1.6.0 2021-04-13 [1] CRAN (R 4.0.2)
## pkgconfig 2.0.3 2019-09-22 [1] CRAN (R 4.0.2)
## prettyunits 1.1.1 2020-01-24 [1] CRAN (R 4.0.2)
## progress 1.2.2 2019-05-16 [1] CRAN (R 4.0.2)
## promises 1.2.0.1 2021-02-11 [1] CRAN (R 4.0.2)
## ProtGenerics 1.22.0 2020-10-27 [1] Bioconductor
## pryr 0.1.4 2018-02-18 [1] CRAN (R 4.0.2)
## purrr 0.3.4 2020-04-17 [1] CRAN (R 4.0.2)
## R6 2.5.0 2020-10-28 [1] CRAN (R 4.0.2)
## rappdirs 0.3.3 2021-01-31 [1] CRAN (R 4.0.2)
## Rcpp 1.0.6 2021-01-15 [1] CRAN (R 4.0.2)
## RCurl 1.98-1.3 2021-03-16 [1] CRAN (R 4.0.2)
## rlang 0.4.11 2021-04-30 [1] CRAN (R 4.0.2)
## rmarkdown 2.7 2021-02-19 [1] CRAN (R 4.0.4)
## Rsamtools 2.6.0 2020-10-27 [1] Bioconductor
## RSpectra 0.16-0 2019-12-01 [1] CRAN (R 4.0.2)
## RSQLite 2.2.7 2021-04-22 [1] CRAN (R 4.0.2)
## rsvd 1.0.5 2021-04-16 [1] CRAN (R 4.0.2)
## rtracklayer 1.50.0 2020-10-27 [1] Bioconductor
## Rtsne 0.15 2018-11-10 [1] CRAN (R 4.0.2)
## S4Vectors * 0.28.1 2020-12-09 [1] Bioconductor
## sass 0.3.1 2021-01-24 [1] CRAN (R 4.0.2)
## scales 1.1.1 2020-05-11 [1] CRAN (R 4.0.2)
## scater * 1.18.6 2021-02-26 [1] Bioconductor
## scRNAseq * 2.4.0 2020-11-09 [1] Bioconductor
## scuttle 1.0.4 2020-12-17 [1] Bioconductor
## sessioninfo 1.1.1 2018-11-05 [1] CRAN (R 4.0.2)
## shiny 1.6.0 2021-01-25 [1] CRAN (R 4.0.3)
## SingleCellExperiment * 1.12.0 2020-10-27 [1] Bioconductor
## sparseMatrixStats 1.2.1 2021-02-02 [1] Bioconductor
## stringi 1.5.3 2020-09-09 [1] CRAN (R 4.0.2)
## stringr 1.4.0 2019-02-10 [1] CRAN (R 4.0.2)
## SummarizedExperiment * 1.20.0 2020-10-27 [1] Bioconductor
## tibble 3.1.1 2021-04-18 [1] CRAN (R 4.0.2)
## tidyselect 1.1.1 2021-04-30 [1] CRAN (R 4.0.2)
## utf8 1.2.1 2021-03-12 [1] CRAN (R 4.0.2)
## uwot 0.1.10 2020-12-15 [1] CRAN (R 4.0.2)
## vctrs 0.3.8 2021-04-29 [1] CRAN (R 4.0.2)
## vipor 0.4.5 2017-03-22 [1] CRAN (R 4.0.2)
## viridis 0.6.0 2021-04-15 [1] CRAN (R 4.0.2)
## viridisLite 0.4.0 2021-04-13 [1] CRAN (R 4.0.2)
## withr 2.4.2 2021-04-18 [1] CRAN (R 4.0.2)
## xfun 0.22 2021-03-11 [1] CRAN (R 4.0.2)
## XML 3.99-0.6 2021-03-16 [1] CRAN (R 4.0.2)
## xml2 1.3.2 2020-04-23 [1] CRAN (R 4.0.2)
## xtable 1.8-4 2019-04-21 [1] CRAN (R 4.0.2)
## XVector 0.30.0 2020-10-28 [1] Bioconductor
## yaml 2.2.1 2020-02-01 [1] CRAN (R 4.0.2)
## zlibbioc 1.36.0 2020-10-28 [1] Bioconductor
##
## [1] /Library/Frameworks/R.framework/Versions/4.0/Resources/library
Code for making the R file.
knitr::knit("2021-05-03_code_HCA_LA.Rmd", tangle = TRUE)