5 iSEE introduction
Instructor: Leo
5.1 Toy RSE data
## Let's build a simple SummarizedExperiment object following information
## from the documentation
library("SummarizedExperiment")
## ?SummarizedExperiment
## Adapted from the official documentation:
## First we create the data pieces that we'll use to build our
## SummarizedExperiment object. In this case, we'll have 200 genes
## measured in 6 samples.
nrows <- 200
ncols <- 6
## Let's make up some count numbers at random
set.seed(20210223)
counts <- matrix(runif(nrows * ncols, 1, 1e4), nrows)
## Then some basic information for our genes
rowRanges <- GRanges(
    rep(c("chr1", "chr2"), c(50, 150)),
    IRanges(floor(runif(200, 1e5, 1e6)), width = 100),
    strand = sample(c("+", "-"), 200, TRUE),
    feature_id = sprintf("ID%03d", 1:200)
)
names(rowRanges) <- paste0("gene_", seq_len(length(rowRanges)))
## Next, we create some information about samples
colData <- DataFrame(
    Treatment = rep(c("ChIP", "Input"), 3),
    row.names = LETTERS[1:6]
)
## Finally we put all these pieces together in a single R object
rse <- SummarizedExperiment(
    assays = SimpleList(counts = counts),
    rowRanges = rowRanges,
    colData = colData
)
## Overview
rse
#> class: RangedSummarizedExperiment
#> dim: 200 6
#> metadata(0):
#> assays(1): counts
#> rownames(200): gene_1 gene_2 ... gene_199 gene_200
#> rowData names(1): feature_id
#> colnames(6): A B ... E F
#> colData names(1): Treatment
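Before moving to interactive tools, it can help to see how the pieces of a SummarizedExperiment are pulled back out in code. The block below is a minimal sketch: it rebuilds a smaller version of the toy object (20 genes instead of 200, under the hypothetical names `toy_counts`, `toy_ranges`, `toy_colData` and `toy_rse`) so that it runs on its own, then uses the standard accessors `assay()`, `colData()` and `rowRanges()` and shows matrix-style subsetting.

```r
library("SummarizedExperiment")

## Rebuild a smaller version of the toy object so this block runs on its own
## (20 genes in 6 samples; all toy_* names are made up for this sketch)
set.seed(20210223)
toy_counts <- matrix(runif(20 * 6, 1, 1e4), nrow = 20)
toy_ranges <- GRanges(
    rep(c("chr1", "chr2"), c(5, 15)),
    IRanges(floor(runif(20, 1e5, 1e6)), width = 100),
    strand = sample(c("+", "-"), 20, TRUE),
    feature_id = sprintf("ID%03d", 1:20)
)
names(toy_ranges) <- paste0("gene_", seq_len(length(toy_ranges)))
toy_colData <- DataFrame(
    Treatment = rep(c("ChIP", "Input"), 3),
    row.names = LETTERS[1:6]
)
toy_rse <- SummarizedExperiment(
    assays = SimpleList(counts = toy_counts),
    rowRanges = toy_ranges,
    colData = toy_colData
)

## The count matrix, with genes as rows and samples as columns
head(assay(toy_rse, "counts"), n = 3)

## Sample information and gene information
colData(toy_rse)
rowRanges(toy_rse)

## Subsetting works like a matrix: rows are genes, columns are samples,
## and the assay, rowRanges and colData stay in sync
chip_only <- toy_rse[, toy_rse$Treatment == "ChIP"]
dim(chip_only)

chr1_only <- toy_rse[as.character(seqnames(rowRanges(toy_rse))) == "chr1", ]
dim(chr1_only)
```

The same accessors work on the full `rse` object built above; only the dimensions change.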
5.2 iSEE
How can you make plots from SummarizedExperiment objects without having to write any code? The answer is with iSEE 🎨
- http://bioconductor.org/packages/iSEE
- http://bioconductor.org/packages/release/bioc/vignettes/iSEE/inst/doc/basic.html
## Let's explore the `rse` object interactively
library("iSEE")
iSEE::iSEE(rse)
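Calling `iSEE()` with only the object opens the app with its default set of panels. If you want the app to open on specific panels, one option is the `initial` argument. The block below is a sketch under a few assumptions: it uses a small stand-in object (`toy_se`, a made-up name) so it runs on its own, and the panel choices (`FeatureAssayPlot()` and `ColumnDataTable()`) are just an illustration, not a recommended layout.

```r
library("SummarizedExperiment")
library("iSEE")

## Small stand-in object (20 genes, 6 samples) so this block runs on its own
set.seed(20210223)
toy_se <- SummarizedExperiment(
    assays = SimpleList(
        counts = matrix(
            runif(20 * 6, 1, 1e4),
            nrow = 20,
            dimnames = list(paste0("gene_", 1:20), LETTERS[1:6])
        )
    ),
    colData = DataFrame(
        Treatment = rep(c("ChIP", "Input"), 3),
        row.names = LETTERS[1:6]
    )
)

## Choose which panels the app opens with (illustrative choice):
## one gene's counts split by Treatment, next to the sample table
initial_panels <- list(
    FeatureAssayPlot(
        Assay = "counts",
        YAxisFeatureName = "gene_1",
        XAxis = "Column data",
        XAxisColumnData = "Treatment",
        PanelWidth = 6L
    ),
    ColumnDataTable(PanelWidth = 6L)
)

## iSEE() returns a shiny app object; it is only launched in an
## interactive R session
app <- iSEE(toy_se, initial = initial_panels)
if (interactive()) {
    shiny::runApp(app)
}
```

Inside the running app you can still add, remove or reconfigure panels by hand, so the `initial` list is only a starting point.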
5.3 Exercise with data from spatialLIBD
- We’ll download a SingleCellExperiment object, which is similar to SummarizedExperiment as it extends it.
## Let's get some data using spatialLIBD
sce_layer <- spatialLIBD::fetch_data("sce_layer")
#> adding rname 'https://www.dropbox.com/s/bg8xwysh2vnjwvg/Human_DLPFC_Visium_processedData_sce_scran_sce_layer_spatialLIBD.Rdata?dl=1'
#> 2023-07-11 22:07:04.530933 loading file /github/home/.cache/R/BiocFileCache/48f15efe0b7_Human_DLPFC_Visium_processedData_sce_scran_sce_layer_spatialLIBD.Rdata%3Fdl%3D1
sce_layer
#> class: SingleCellExperiment
#> dim: 22331 76
#> metadata(0):
#> assays(2): counts logcounts
#> rownames(22331): ENSG00000243485 ENSG00000238009 ... ENSG00000278384 ENSG00000271254
#> rowData names(10): source type ... is_top_hvg is_top_hvg_sce_layer
#> colnames(76): 151507_Layer1 151507_Layer2 ... 151676_Layer6 151676_WM
#> colData names(13): sample_name layer_guess ... layer_guess_reordered_short spatialLIBD
#> reducedDimNames(6): PCA TSNE_perplexity5 ... UMAP_neighbors15 PCAsub
#> mainExpName: NULL
#> altExpNames(0):
## We can check how big the object is with lobstr
lobstr::obj_size(sce_layer)
#> 33.99 MB
- Just like with our rse object, we can use iSEE::iSEE() to explore the data.
iSEE::iSEE(sce_layer)
Exercise 1: Create a plot and download a PDF that reproduces as closely as possible the plot on the right side of the following slide.
Exercise 2: Explore with a heatmap the expression of the genes MOBP, MBP, and PCP4. If we use clustering (group genes based on similar expression patterns), which two genes are most similar to each other?
Exercise 3: In which dorsolateral prefrontal cortex (DLPFC) layers (L1, L2, …, L6 grey matter layers, and WM for white matter) do we see the highest expression for the genes MOBP and MBP?
This list of ENSEMBL IDs will be useful:
ENSG00000168314
ENSG00000183036
ENSG00000197971
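The row names of `sce_layer` are ENSEMBL IDs, so these three IDs can be used directly to select rows, both in code and in the iSEE panels. The block below is a rough sketch of that row selection. To run without the download it uses a stand-in object (`toy`, a made-up name) filled with random numbers, and `subset_by_ids()` is a hypothetical helper written for this example, not part of any package.

```r
library("SummarizedExperiment")

ids <- c("ENSG00000168314", "ENSG00000183036", "ENSG00000197971")

## Stand-in object with random values (NOT real expression data): it only
## mimics the fact that row names are ENSEMBL IDs
set.seed(1)
all_ids <- c(ids, "ENSG00000243485")
toy <- SummarizedExperiment(
    assays = SimpleList(
        logcounts = matrix(
            rnorm(4 * 5),
            nrow = 4,
            dimnames = list(all_ids, paste0("sample_", 1:5))
        )
    )
)

## Hypothetical helper: keep only the requested IDs that are present
## in the row names, in the order they were requested
subset_by_ids <- function(se, ids) {
    found <- ids[ids %in% rownames(se)]
    se[found, ]
}

picked <- subset_by_ids(toy, ids)
rownames(picked)
dim(assay(picked, "logcounts"))
```

With the real data you would call `subset_by_ids(sce_layer, ids)` instead. To match each ID to its gene symbol, check `colnames(rowData(sce_layer))` for a column holding gene names; the exact column name is not shown in the output above, so treat any guess (for example `gene_name`) as an assumption to verify.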
5.4 Community
iSEE authors:
- Kévin Rue-Albrecht https://twitter.com/KevinRUE67
- Federico Marini https://twitter.com/FedeBioinfo
- Charlotte Soneson https://twitter.com/CSoneson
- Aaron Lun https://twitter.com/realAaronLun
- Another example exploring data with SummarizedExperiment and iSEE:
Today we explored RNA-seq data from @StefanoBerto83 et al who made it easy to re-use. Thank you! ^^ @lcolladotor used #shiny + #ggpubr as well as #iSEE
— LIBD rstats club (@LIBDrstats) February 12, 2021
📔 https://t.co/iUQHE0xqRc
🗞️ https://t.co/qhAdXbhY9c #rstats @Bioconductor https://t.co/OXTukByhoo
Are you making a heatmap to 👀 gene expression? Have you wondered whether to center & scale the data?
— 🇲🇽 Leonardo Collado-Torres (@lcolladotor) December 1, 2022
I made this 5 min video to help answer these ❓
Shoutout to #iSEE by @KevinRUE67 @FedeBioinfo @CSoneson et al #rstats @Bioconductor @LieberInstitute https://t.co/KwQHLODTQV