R/makeGenomicState.R
makeGenomicState.Rd
This function summarizes the annotation contained in a TxDb at each given base of the genome based on annotated transcripts. It groups contiguous base pairs classified as the same type into regions.
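The grouping step described above can be pictured with a toy example in base R. This is only a sketch of the idea, not the package's actual implementation: the per-base labels are made-up values, and `per_base`, `runs` and `regions` are hypothetical names used for illustration.

```r
## Toy per-base annotation for a 12 bp stretch (made-up labels, not real data)
per_base <- c(
    rep("intergenic", 3), rep("exon", 4), rep("intron", 2), rep("exon", 3)
)

## Run-length encode: one run per stretch of identical, contiguous labels
runs <- rle(per_base)

## Turn run lengths into 1-based, inclusive start/end coordinates
ends <- cumsum(runs$lengths)
starts <- ends - runs$lengths + 1L

## One row per region, labelled with the type shared by all of its bases
regions <- data.frame(
    start = starts,
    end = ends,
    theRegion = runs$values,
    stringsAsFactors = FALSE
)
print(regions)
```

Each row of `regions` corresponds to one contiguous block of bases with the same label, which is the role the `theRegion` metadata column plays in the returned object.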
A TxDb object with chromosome lengths (check seqlengths(txdb)). If you are using a TxDb object created from a GFF/GTF file, you may find the Bioconductor support thread at https://support.bioconductor.org/p/93235/ useful.
The names of the chromosomes to use as denoted in the txdb object. Check isActiveSeq.
Arguments passed to extendedMapSeqlevels.
A GRangesList object with two elements: fullGenome and codingGenome. Both have metadata information for the type of region (theRegion), transcript IDs (tx_id), transcript name (tx_name), and gene ID (gene_id). fullGenome classifies each region as either exon, intron or intergenic. codingGenome classifies the regions as promoter, exon, intron, 5UTR, 3UTR or intergenic.
## Load the example database from the GenomicFeatures vignette
library("GenomicFeatures")
#> Loading required package: AnnotationDbi
#> Loading required package: Biobase
#> Welcome to Bioconductor
#>
#> Vignettes contain introductory material; view with
#> 'browseVignettes()'. To cite Bioconductor, see
#> 'citation("Biobase")', and for packages 'citation("pkgname")'.
samplefile <- system.file("extdata", "hg19_knownGene_sample.sqlite",
    package = "GenomicFeatures"
)
txdb <- loadDb(samplefile)
## Generate genomic state object, only for chr6
sampleGenomicState <- makeGenomicState(txdb, chrs = "chr6")
#> 'select()' returned 1:1 mapping between keys and columns
#
if (FALSE) {
    ## Create the GenomicState object for Hsapiens.UCSC.hg19.knownGene
    txdb <- TxDb.Hsapiens.UCSC.hg19.knownGene::TxDb.Hsapiens.UCSC.hg19.knownGene

    ## Creating this GenomicState object takes around 8 min for all chrs and
    ## around 30 secs for chr21
    GenomicState.Hsapiens.UCSC.hg19.knownGene.chr21 <-
        makeGenomicState(txdb = txdb, chrs = "chr21")

    ## For convenience, this object is already included in derfinder
    library("testthat")
    expect_that(
        GenomicState.Hsapiens.UCSC.hg19.knownGene.chr21,
        is_equivalent_to(genomicState)
    )

    ## Hsapiens ENSEMBL GRCh37
    library("GenomicFeatures")
    ## This can take several minutes, depending on your internet speed
    xx <- makeTxDbPackageFromBiomart(
        version = "0.99", maintainer = "Your Name",
        author = "Your Name"
    )
    txdb <- loadDb(file.path(
        "TxDb.Hsapiens.BioMart.ensembl.GRCh37.p11", "inst",
        "extdata", "TxDb.Hsapiens.BioMart.ensembl.GRCh37.p11.sqlite"
    ))

    ## Creating this GenomicState object takes around 13 min
    GenomicState.Hsapiens.ensembl.GRCh37.p11 <- makeGenomicState(
        txdb = txdb,
        chrs = c(1:22, "X", "Y")
    )

    ## Save for later use
    save(GenomicState.Hsapiens.ensembl.GRCh37.p11,
        file = "GenomicState.Hsapiens.ensembl.GRCh37.p11.Rdata"
    )
}