This function merges the results from running analyzeChr on several
chromosomes and assigns genomic states using annotateRegions. It
re-calculates the p-values and q-values using the pooled areas from the null
regions from all chromosomes. Once the results have been merged,
derfinderReport::generateReport
can be used to generate an HTML
report of the results. The derfinderReport
package is available at
https://github.com/lcolladotor/derfinderReport.
The chromosomes of the files to be merged.
The main data directory path, which can be useful if analyzeChr is used for several parameters and the results are saved in different directories.
A vector of length two specifiying the cutoffs used to determine significance. The first element is used to determine significance for the P-values and FWER adjusted P-values, while the second element is used for the Q-values (FDR adjusted P-values) similar to calculatePvalues.
A GRanges object created with makeGenomicState.
It can be either the genomicState$fullGenome
or
genomicState$codingGenome
component.
Determines the mininum overlap needed when annotating regions with annotateRegions.
If TRUE
the output from preprocessCoverage is
merged.
Arguments passed to other methods and/or advanced arguments. Advanced arguments:
If TRUE
basic status updates will be printed along
the way.
The options used in analyzeChr. By default
NULL
and will be inferred from the output files.
The actual F-statistic cutoff used. This can be
obtained from the logs or from the output of analyzeChr. If
NULL
then this function will attempt to re-calculate it.
Passed to annotateRegions and extendedMapSeqlevels.
Seven Rdata files.
Full F-statistics from all chromosomes in a list of Rle objects.
Timing information from all chromosomes.
A DataFrame with the null region information: statistic, width, chromosome and permutation identifier. It's ordered by the statistics
GRanges object with regions found and with full
annotation from matchGenes. Note that the column
strand
from matchGenes is renamed to
annoStrand
to comply with GRanges specifications.
A list with the pre-processed coverage data from all chromosomes.
A list as constructed in annotateRegions with the assigned genomic states.
A list with the options used when merging the
results. Used in derfinderReport::generateReport
.
If you want to calculate the FWER adjusted P-values, supply
optionsStats
which is produced by analyzeChr.
## The output will be saved in the 'generateReport-example' directory
dir.create("generateReport-example", showWarnings = FALSE, recursive = TRUE)
## For convenience, the derfinder output has been pre-computed
file.copy(system.file(file.path("extdata", "chr21"),
package = "derfinder",
mustWork = TRUE
), "generateReport-example", recursive = TRUE)
#> [1] TRUE
## Merge the results from the different chromosomes. In this case, there's
## only one: chr21
mergeResults(
chrs = "21", prefix = "generateReport-example",
genomicState = genomicState$fullGenome
)
#> extendedMapSeqlevels: sequence names mapped from NCBI to UCSC for species homo_sapiens
#> 2023-05-07 06:01:26.968032 mergeResults: Saving options used
#> 2023-05-07 06:01:26.968683 Loading chromosome chr21
#> Neither 'cutoffFstatUsed' nor 'optionsStats' were supplied, so the FWER calculation step will be skipped.
#> 2023-05-07 06:01:27.012941 mergeResults: Saving fullNullSummary
#> 2023-05-07 06:01:27.013651 mergeResults: Re-calculating the p-values
#> 2023-05-07 06:01:27.07809 mergeResults: Saving fullRegions
#> 2023-05-07 06:01:27.079335 mergeResults: assigning genomic states
#> 2023-05-07 06:01:27.187405 annotateRegions: counting
#> 2023-05-07 06:01:27.258744 annotateRegions: annotating
#> 2023-05-07 06:01:27.285194 mergeResults: Saving fullAnnotatedRegions
#> 2023-05-07 06:01:27.286116 mergeResults: Saving fullFstats
#> 2023-05-07 06:01:27.286791 mergeResults: Saving fullTime
if (FALSE) {
## You can then explore the wallclock time spent on each step
load(file.path("generateReport-example", "fullRegions.Rdata"))
## Process the time info
time <- lapply(fullTime, function(x) data.frame(diff(x)))
time <- do.call(rbind, time)
colnames(time) <- "sec"
time$sec <- as.integer(round(time$sec))
time$min <- time$sec / 60
time$chr <- paste0("chr", gsub("\\..*", "", rownames(time)))
time$step <- gsub(".*\\.", "", rownames(time))
rownames(time) <- seq_len(nrow(time))
## Make plot
library("ggplot2")
ggplot(time, aes(x = step, y = min, colour = chr)) +
geom_point() +
labs(title = "Wallclock time by step") +
scale_colour_discrete(limits = chrs) +
scale_x_discrete(limits = names(fullTime[[1]])[-1]) +
ylab("Time (min)") +
xlab("Step")
}