5 iSEE introduction
Instructor: Leo
5.1 Toy RSE data
## Let's build a simple SummarizedExperiment object following the information
## from the documentation
library("SummarizedExperiment")
## ?SummarizedExperiment
## Adapted from the official documentation:
## First we create the data pieces that we'll use to build our
## SummarizedExperiment object. In this case, we'll have 200 genes
## measured in 6 samples.
nrows <- 200
ncols <- 6
## Let's make up some count numbers at random
set.seed(20210223)
counts <- matrix(runif(nrows * ncols, 1, 1e4), nrows)
## Then some basic information for our genes
rowRanges <- GRanges(
    rep(c("chr1", "chr2"), c(50, 150)),
    IRanges(floor(runif(200, 1e5, 1e6)), width = 100),
    strand = sample(c("+", "-"), 200, TRUE),
    feature_id = sprintf("ID%03d", 1:200)
)
names(rowRanges) <- paste0("gene_", seq_len(length(rowRanges)))
## Next, we create some information about samples
colData <- DataFrame(
    Treatment = rep(c("ChIP", "Input"), 3),
    row.names = LETTERS[1:6]
)
## Finally we put all these pieces together in a single R object
rse <- SummarizedExperiment(
    assays = SimpleList(counts = counts),
    rowRanges = rowRanges,
    colData = colData
)
## Overview
rse
#> class: RangedSummarizedExperiment
#> dim: 200 6
#> metadata(0):
#> assays(1): counts
#> rownames(200): gene_1 gene_2 ... gene_199 gene_200
#> rowData names(1): feature_id
#> colnames(6): A B ... E F
#> colData names(1): Treatment
5.2 iSEE
How can you make plots from SummarizedExperiment objects without having to
write any code? The answer is with iSEE 🎨
- http://bioconductor.org/packages/iSEE
- http://bioconductor.org/packages/release/bioc/vignettes/iSEE/inst/doc/basic.html
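The app can also be built with a chosen set of starting panels instead of the default layout. The following is a minimal sketch under stated assumptions: it builds its own small toy object (`se`, 20 made-up genes) so that it is self-contained, and the panel settings are illustrative choices, not a recommended configuration. It relies on the `initial` argument of `iSEE()` and the `FeatureAssayPlot()` and `ColumnDataTable()` panel constructors from the iSEE package.

```r
library("SummarizedExperiment")
library("iSEE")

## Small made-up object: 20 genes measured in 6 samples (illustration only)
set.seed(20210223)
toy_counts <- matrix(
    runif(20 * 6, 1, 1e4),
    nrow = 20,
    dimnames = list(paste0("gene_", 1:20), LETTERS[1:6])
)
se <- SummarizedExperiment(
    assays = SimpleList(counts = toy_counts),
    colData = DataFrame(
        Treatment = rep(c("ChIP", "Input"), 3),
        row.names = LETTERS[1:6]
    )
)

## Panels to show at startup: one plot of a gene's counts split by
## Treatment, and one table with the sample information
initial <- list(
    FeatureAssayPlot(
        Assay = "counts",
        YAxisFeatureName = "gene_1",
        XAxis = "Column data",
        XAxisColumnData = "Treatment",
        PanelWidth = 6L
    ),
    ColumnDataTable(PanelWidth = 6L)
)

## Build the app object; it only opens in the browser when run
app <- iSEE(se, initial = initial)
if (interactive()) {
    shiny::runApp(app)
}
```

Every setting chosen here can still be changed by clicking in the app, so this only fixes the starting point of the exploration.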
## Let's explore the `rse` object interactively
library("iSEE")
iSEE::iSEE(rse)
5.3 Exercise with data from spatialLIBD
- We’ll download a `SingleCellExperiment` object, which is similar to `SummarizedExperiment` because it extends it.
## Let's get some data using spatialLIBD
sce_layer <- spatialLIBD::fetch_data("sce_layer")
#> adding rname 'https://www.dropbox.com/s/bg8xwysh2vnjwvg/Human_DLPFC_Visium_processedData_sce_scran_sce_layer_spatialLIBD.Rdata?dl=1'
#> 2023-07-11 22:07:04.530933 loading file /github/home/.cache/R/BiocFileCache/48f15efe0b7_Human_DLPFC_Visium_processedData_sce_scran_sce_layer_spatialLIBD.Rdata%3Fdl%3D1
sce_layer
#> class: SingleCellExperiment
#> dim: 22331 76
#> metadata(0):
#> assays(2): counts logcounts
#> rownames(22331): ENSG00000243485 ENSG00000238009 ... ENSG00000278384 ENSG00000271254
#> rowData names(10): source type ... is_top_hvg is_top_hvg_sce_layer
#> colnames(76): 151507_Layer1 151507_Layer2 ... 151676_Layer6 151676_WM
#> colData names(13): sample_name layer_guess ... layer_guess_reordered_short spatialLIBD
#> reducedDimNames(6): PCA TSNE_perplexity5 ... UMAP_neighbors15 PCAsub
#> mainExpName: NULL
#> altExpNames(0):
## We can check how big the object is with lobstr
lobstr::obj_size(sce_layer)
#> 33.99 MB
- Just like with our `rse` object, we can use `iSEE::iSEE()` to explore the data.
iSEE::iSEE(sce_layer)
Exercise 1:
Create a plot and download a PDF that reproduces as closely as possible the plot on the right side of the following slide.
Exercise 2:
Explore with a heatmap the expression of the genes MOBP, MBP, and PCP4. If we use clustering (group genes based on similar expression patterns), which two genes are most similar to each other?
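To make "clustering" concrete, here is a minimal sketch in base R on made-up numbers. The genes `geneA`, `geneB`, and `geneC` and their values are hypothetical and unrelated to the exercise, and `first_merge()` is a helper written only for this illustration. It uses Euclidean distance with complete linkage, which is one common choice and not necessarily what the heatmap panel in iSEE uses.

```r
## Hypothetical expression values: 3 made-up genes (rows), 6 samples (columns)
toy <- rbind(
    geneA = c(1, 2, 3, 4, 5, 6),
    geneB = c(10, 21, 29, 41, 50, 62),
    geneC = c(6, 5, 4, 3, 2, 1)
)
colnames(toy) <- paste0("sample_", 1:6)

## Illustration-only helper: return the two genes that are joined in the
## first step of hierarchical clustering (the most similar pair)
first_merge <- function(mat) {
    hc <- hclust(dist(mat), method = "complete")
    ## Negative entries in hc$merge index the original rows of `mat`
    rownames(mat)[abs(hc$merge[1, ])]
}

## Clustering on the raw values compares absolute expression levels
pair_raw <- first_merge(toy)

## Centering and scaling each gene compares patterns across samples
toy_scaled <- t(scale(t(toy)))
pair_scaled <- first_merge(toy_scaled)

pair_raw
pair_scaled
```

On the raw values, geneA pairs with geneC because their magnitudes are similar. After centering and scaling, geneA pairs with geneB because both rise across the samples. Whether to center and scale is therefore worth deciding before reading a clustered heatmap.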
Exercise 3:
In which dorsolateral prefrontal cortex (DLPFC) layers (L1, L2, …, L6 grey matter layers, and WM for white matter) do we see the highest expression for the genes MOBP and MBP?
This list of ENSEMBL IDs will be useful:
ENSG00000168314
ENSG00000183036
ENSG00000197971
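One way to see which gene symbol corresponds to each of these IDs is to look them up in the row information of `sce_layer`. This is a sketch under assumptions: it needs an internet connection the first time it runs, since `spatialLIBD::fetch_data()` downloads the data, and it assumes that `rowData(sce_layer)` contains columns named `gene_id` and `gene_name`. Those columns are not visible in the truncated output above, so the code only keeps the ones that are actually present.

```r
library("SummarizedExperiment")

## Download (or load from the local cache) the layer-level data
sce_layer <- spatialLIBD::fetch_data("sce_layer")

## The ENSEMBL IDs listed above
ids <- c("ENSG00000168314", "ENSG00000183036", "ENSG00000197971")

## Assumed column names; keep only those that exist in this object
wanted <- c("gene_id", "gene_name")
cols <- intersect(wanted, colnames(rowData(sce_layer)))

## The row names of sce_layer are ENSEMBL IDs, so we can subset by ID
gene_info <- rowData(sce_layer)[ids, cols, drop = FALSE]
gene_info
```

The same IDs can be typed or pasted into the iSEE panels wherever a feature name is requested, because the row names of `sce_layer` are ENSEMBL IDs.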
5.4 Community
iSEE authors:
- Kévin Rue-Albrecht https://twitter.com/KevinRUE67
- Federico Marini https://twitter.com/FedeBioinfo
- Charlotte Soneson https://twitter.com/CSoneson
- Aaron Lun https://twitter.com/realAaronLun
- Another example exploring data with `SummarizedExperiment` and `iSEE`:
Today we explored RNA-seq data from @StefanoBerto83 et al who made it easy to re-use. Thank you! ^^ @lcolladotor used #shiny + #ggpubr as well as #iSEE
— LIBD rstats club (@LIBDrstats) February 12, 2021
📔 https://t.co/iUQHE0xqRc
🗞️ https://t.co/qhAdXbhY9c #rstats @Bioconductor https://t.co/OXTukByhoo
Are you making a heatmap to 👀 gene expression? Have you wondered whether to center & scale the data?
— 🇲🇽 Leonardo Collado-Torres (@lcolladotor) December 1, 2022
I made this 5 min video to help answer these ❓
Shoutout to #iSEE by @KevinRUE67 @FedeBioinfo @CSoneson et al #rstats @Bioconductor @LieberInstitute https://t.co/KwQHLODTQV