NEWS.md
BUG FIXES
* Fixed a bug in `GenomicState::gencode_genomic_state()` that ultimately was due to `makeGenomicState()` and the change of R's default to `data.frame(stringsAsFactors = FALSE)` instead of `TRUE`.
SIGNIFICANT USER-VISIBLE CHANGES
* `makeGenomicState()` now restores the `GenomicFeatures::isActiveSeq()` on the `txdb` object before finishing to avoid issues, like running `regionReport::renderReport()` on two different sets of regions (different chrs). There's a new unit test for this.
BUG FIXES
SIGNIFICANT USER-VISIBLE CHANGES
SIGNIFICANT USER-VISIBLE CHANGES
BUG FIXES
* Fixed a bug in `findRegions()` that affected the end positions of the regions when `maxRegionGap` was supplied with a value greater than the default of 0 and the data was filtered (so `position` was not all `TRUE` in the `findRegions()` call). To check this scenario there is now a new unit test under `tests/testthat/test-maxRegionGap.R`.
* Fixed the same bug in `getRegionCoverage()` and thus for `regionMatrix()` for the same type of situations (filtered data with a non-zero `maxRegionGap`). The coverage values are OK; only the end positions of the regions returned by `findRegions()` were incorrect, and those need to be re-computed with the fixed version.
* Switched to `bumphunter::loessByCluster()` instead of `bumphunter::runmedByCluster()` given some issues with the second one.
BUG FIXES
* The `railMatrix()` and `loadCoverage()` helper functions had an issue when the input set of regions was duplicated. This could be reproduced with:

```r
# Remote BigWig file used to trigger the problem
sampleFile <- c('SRR387777' = 'http://duffel.rail.bio/recount/SRP009615/bw/SRR387777.bw')
# Two identical regions on chrY (the duplicated input)
regs <- GenomicRanges::GRanges('chrY', IRanges::IRanges(start = c(1, 1), width = 10), strand = '-')
names(regs) <- c(1:2)
result <- rtracklayer::import(sampleFile, selection = regs, as = 'RleList')
```

This error affected recount and other reverse dependencies that use derfinder for processing BigWig files.
BUG FIXES
* The `railMatrix()` and `loadCoverage()` helper functions now attempt to import a BigWig file 3 times before giving up, which will make these functions more robust to occasional web access issues. Based on http://bioconductor.org/developers/how-to/web-query/ and https://github.com/leekgroup/recount/commit/8da982b309e2d19638166f263057d9f85bb64e3f
NEW FEATURES
BUG FIXES
SIGNIFICANT USER-VISIBLE CHANGES
* The `outfile` argument was renamed to `log` when invoking `BiocParallel::SnowParam()`. Thus `define_cluster()` now has an `mc.log` argument instead of `mc.outfile`.
NEW FEATURES
* `regionMatrix()` in response to https://support.bioconductor.org/p/103591
BUG FIXES
* Fixed a bug that appeared when running `analyzeChr()` and indirectly running `preprocessCoverage()`. See https://support.bioconductor.org/p/99400/ for details.
SIGNIFICANT USER-VISIBLE CHANGES
* `BiocStyle::html_document` that was recently released.
BUG FIXES
* `regionMatrix()` will now pass the hidden arguments `species` and `currentStyle` to `getRegionCoverage()` so they can be used by `extendedMapSeqlevels()`. Related to https://support.bioconductor.org/p/95721/.
BUG FIXES
* `extendedMapSeqlevels()`. Related to https://support.bioconductor.org/p/95521/.
* `filterData()` based on https://github.com/lcolladotor/derfinder/issues/38
BUG FIXES
* Updated `define_cluster()` to match recent changes in BiocParallel and fixed an `if` clause in `regionMatrix()` that could lead to warnings in some situations.
SIGNIFICANT USER-VISIBLE CHANGES
* `regionMatrix()` now has explicit arguments `totalMapped` and `targetSize` so that users will almost always normalize by library size when using this function (if they see the help page) or in the steps prior to using `regionMatrix()`.
BUG FIXES
* `mc.cores` and `mc.cores.load` in `fullCoverage()` thanks to feedback from Emily E Burke (https://github.com/emilyburke).
SIGNIFICANT USER-VISIBLE CHANGES
* `advancedArg()`.
NEW FEATURES
* Added `getTotalMapped()` for calculating the total number of mapped reads for a BAM file or the area under the curve (AUC) for a BigWig file. This information can then be used with `fullCoverage()`, `filterData()` and other functions. Note that if you use `totalMapped` in `fullCoverage()` you should not use `totalMapped` again in `filterData()`.
BUG FIXES
SIGNIFICANT USER-VISIBLE CHANGES
SIGNIFICANT USER-VISIBLE CHANGES
BUG FIXES
* Improved `railMatrix()`'s flexibility for defining the cluster used for loading the BigWig files. You can now use `BPPARAM.railChr`, which will take priority over `file.cores`. Also, if `file.cores = 1L`, then the default will be to use `SerialParam()`, which was the implementation available prior to 1.5.11.
SIGNIFICANT USER-VISIBLE CHANGES
* `coverageToExon()`, `regionMatrix()` and `railMatrix()` can take an `L` argument of length equal to the number of samples in case not all samples have the same read length.
* `railMatrix()` has a new argument called `file.cores` for controlling how many cores are used for loading the BigWig files. In theory this allows using `railMatrix()` with `BPPARAM.custom` equal to a `BiocParallel::BatchJobsParam()` to submit 1 job per chromosome, then `file.cores` determines the number of cores for reading the files. This is a highly experimental feature.
SIGNIFICANT USER-VISIBLE CHANGES
SIGNIFICANT USER-VISIBLE CHANGES
NEW FEATURES
* Added `railMatrix()`, which generates similar output to `regionMatrix()` but is much faster and less memory intensive. It achieves this by extracting the required information from BigWig files.
SIGNIFICANT USER-VISIBLE CHANGES
* `mc.outfile` argument for specifying the `outfile` argument in `SnowParam()`. See more details at https://stat.ethz.ch/pipermail/bioc-devel/2015-May/007531.html
SIGNIFICANT USER-VISIBLE CHANGES
SIGNIFICANT USER-VISIBLE CHANGES
SIGNIFICANT USER-VISIBLE CHANGES
BUG FIXES
* `BiocParallel::SnowParam()` no longer has an `outfile` argument.
SIGNIFICANT USER-VISIBLE CHANGES
* `analyzeChr()` now uses `annotateTranscripts()` and `matchGenes()` from bumphunter version 1.7.3 (or greater). As announced at https://support.bioconductor.org/p/63568/ these changes in bumphunter allow straightforward use of non-human annotation. In `analyzeChr()` a different organism can be used by changing the `txdb` argument; finer control can be achieved through `...`, for example by specifying the `annotationPackage` argument used in `annotateTranscripts()`.
BUG FIXES
* `makeGenomicState()` incorrectly labeled regions as intragenic. The correct name is intergenic.
BUG FIXES
* Fixed a bug in `calculatePvalues()`! Internally, `maxRegionGap` was set to 300 instead of 0 in one step by default, so the process of mapping regions to genomic coordinates was broken. If you have results prior to this fix you can try using https://gist.github.com/bf85e2c7d5d1f8197707 to fix the results as much as possible. With that fix the regions will be correct, but the p-values will be approximated with the available information from the null regions. Truly fixing the p-values can only be done by re-running derfinder.
NEW FEATURES
* Added `extendedMapSeqlevels()` for using `GenomeInfoDb` when there is information regarding the species and naming style of interest; otherwise sequence names are left unchanged. If used with `verbose = TRUE`, a message is printed whenever `GenomeInfoDb` could not be used or if some information had to be guessed.
BUG FIXES
NEW FEATURES
* `loadCoverage()` and `fullCoverage()` now support `BamFile` and `BigWigFile` objects.
BUG FIXES
* Fixed a bug in `loadCoverage()` when the input was a `BamFileList`. Implemented tests based on the bug. Bug reported at https://support.bioconductor.org/p/62073
SIGNIFICANT USER-VISIBLE CHANGES
* `mergeResults()` can now calculate FWER adjusted p-values when provided with `optionsStats`. Updated `analyzeChr()` to supply the required information.
NEW FEATURES
* Added `advancedArg()` and its alias `advanced_arg()`, which links to the docs for the advanced arguments by opening a browser window with the relevant information from GitHub.
* `getRegionCoverage()` and `coverageToExon()` now have the `files` argument, which is used only when `fullCov` is `NULL`. Both functions will attempt to extract the coverage data from the raw files for the regions of interest in that case. Special care has to be taken in order to guarantee that the coverage is the same, as some reads might be discarded if the region is too narrow. See the advanced argument `protectWhich` in `loadCoverage()` for more information. Also, if `totalMapped` and `targetSize` were used prior to filtering, they should be used again.
* `loadCoverage()` has new advanced arguments that help when reading a specific region (or regions) of the genome.
SIGNIFICANT USER-VISIBLE CHANGES
* `loadCoverage()` and `fullCoverage()` argument `dirs` has been renamed to `files` for greater consistency with what it represents.
SIGNIFICANT USER-VISIBLE CHANGES
* `regionMatrix()` now returns the output of `getRegionCoverage()` so you don't have to run it twice if you are interested in using `derfinderPlot::plotRegionCoverage()`.
* `regionMatrix()$regions` now guesses the seqlengths.
BUG FIXES
* Fixed `.advanced_argument()` to work in nested functions.
* Fixed a bug in `getRegionCoverage()` where `fullCov$position` was provided but it was `NULL`.
* Fixed the `regionMatrix(totalMapped, targetSize)` case, which would previously lead to an error in the `getRegionCoverage()` step.
SIGNIFICANT USER-VISIBLE CHANGES
SIGNIFICANT USER-VISIBLE CHANGES
* Added `coerceGR()`, `createBwSample()` and `createBw()` for exporting output from `fullCoverage()` into BigWig files.
SIGNIFICANT USER-VISIBLE CHANGES
* `analyze_chr()` is the new alias for `analyzeChr()`.
* `makeBamList()` has been renamed to `rawFiles()` since it can be used to identify a list of BigWig files instead of BAM files.
SIGNIFICANT USER-VISIBLE CHANGES
* `loadCoverage()` and `fullCoverage()` now have a `tilewidth` argument. When specified, `GenomicFiles` is used to read the coverage in chunks. In theory, this can lead to lower memory usage at the expense of time.
SIGNIFICANT USER-VISIBLE CHANGES
* `preprocessCoverage()` now has a `toMatrix` argument which is only used when `lowMemDir` is not `NULL`. It controls whether to save the chunks as `DataFrame` objects or `dgCMatrix` objects. The idea is that it can save time by transforming the data just once instead of doing so at each permutation.
SIGNIFICANT USER-VISIBLE CHANGES
* `fstats.apply()` has been moved to its own package: `derfinderHelper`. This will speed up the run time when using `BiocParallel::SnowParam()` as `derfinderHelper` takes much less time to load than derfinder.
* `plotCluster()`, `plotOverview()` and `plotRegionCoverage()` were all moved to their own new package: `derfinderPlot`. This will make maintenance easier as the dependency `ggbio` is still under active development.
SIGNIFICANT USER-VISIBLE CHANGES
* `fstats.apply()`. Note that improving `.transformSparseMatrix()` would speed up the `Matrix` method.
* Use `system.time(library(derfinder))` to see how long the overhead is. It only pays off to use more cores if the calculations are taking longer than the overhead.
SIGNIFICANT USER-VISIBLE CHANGES
* `fullCoverage()` has several new arguments and is now a full parallel implementation of `loadCoverage()`. These changes were introduced because `fullCoverage()` no longer blows up in memory since version 0.0.62, and thus the new recommended use case is to call `fullCoverage()` instead of running one job with `loadCoverage()` per chromosome or using an `lapply()` loop.
NEW FEATURES
* `fstats.apply()` now has `method` and `scalefac` arguments. The `method` argument controls which of the 3 implementations to use. The old method is now called `regular`. The new method `Rle` calculates the F-statistics without de-compressing the data, which is good for memory but gets considerably slower as the number of samples increases. The default method is `Matrix`, which uses the Matrix package and is both faster (given that the coercion doesn't take long) and less memory intensive than the `regular` method.
SIGNIFICANT USER-VISIBLE CHANGES
* `analyzeChr()`, `calculatePvalues()` and `calculateStats()` now have arguments `method` and `scalefac` to match the changes in `fstats.apply()`.
NEW FEATURES
* derfinder now uses `BiocParallel::bplapply()` instead of `parallel::mclapply()`. When `mc.cores` is greater than 1, `BiocParallel::SnowParam()` is used to construct the cluster. Otherwise, `BiocParallel::SerialParam()` is used. This change reduces memory load when using the functions that have the `mc.cores` argument greater than 1.
* Functions `analyzeChr()`, `calculatePvalues()`, `calculateStats()`, `coverageToExon()`, `fullCoverage()`, `getRegionCoverage()`, `regionMatrix()` all have a new argument `mc.output`. This is passed to `BiocParallel::SnowParam(outfile)`.
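
The backend choice described above can be sketched roughly as follows. This is only an illustration assuming the BiocParallel package is installed; `choose_bpparam()` is a hypothetical helper name and is not derfinder's actual internal code.

```r
library(BiocParallel)

# Hypothetical helper (not part of derfinder): pick the parallel backend
# from the number of requested cores, mirroring the rule described above.
choose_bpparam <- function(mc.cores = 1L) {
    if (mc.cores > 1L) {
        # More than one core: build a SNOW cluster with that many workers
        SnowParam(workers = mc.cores)
    } else {
        # Single core: run serially, with no cluster setup overhead
        SerialParam()
    }
}

# bplapply() is the BiocParallel counterpart of lapply()/mclapply()
squares <- bplapply(1:4, function(i) i^2, BPPARAM = choose_bpparam(1L))
```

Passing `choose_bpparam(2L)` instead would run the same call on a two-worker cluster, at the cost of the cluster start-up time.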
SIGNIFICANT USER-VISIBLE CHANGES
* `fullCoverage()` without problems and should no longer encounter errors due to longer vectors not being implemented.
* `fullCoverage()` now use much less memory and do not blow up as you increase `mc.cores`. Note however that the memory does increase, but now it's close to linear.
* `mc.cores` greater than 1, but that is due to the small setup overhead of `BiocParallel::SnowParam()`, which is minimal compared to the overall speed gains with real data sets.
SIGNIFICANT USER-VISIBLE CHANGES
* `filterData()` and `loadCoverage()` now have arguments `totalMapped` and `targetSize`.
* `getRegionCoverage()` and `regionMatrix()` can now work with list output from `loadCoverage()` with a non-NULL cutoff.
* `regionMatrix()` now has an argument `runFilter` so it can be used with previous output from `loadCoverage()`/`filterData()` with `returnMean=TRUE`.
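
As a rough illustration of what `totalMapped` and `targetSize` are for, the sketch below rescales raw coverage to a common library size. It assumes the normalization is a simple rescaling by `targetSize / totalMapped`; `normalize_coverage()` is a hypothetical helper, the numbers are made up, and this is not derfinder's actual implementation.

```r
# Hypothetical helper (not part of derfinder): rescale per-base coverage
# from a library with `totalMapped` reads to a library of `targetSize` reads.
normalize_coverage <- function(coverage, totalMapped, targetSize) {
    coverage * targetSize / totalMapped
}

# Made-up per-base coverage for one sample with 40 million mapped reads
raw <- c(0, 10, 25, 40)

# Express the coverage as if the library had 80 million mapped reads
normalized <- normalize_coverage(raw, totalMapped = 40e6, targetSize = 80e6)
```

Under this assumption, samples with different sequencing depths become comparable because each is expressed relative to the same target library size.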
SIGNIFICANT USER-VISIBLE CHANGES
* `analyzeChr()`, `annotateRegions()`, `calculatePvalues()`, `coverageToExon()`, `findRegions()`, `fullCoverage()`, `getRegionCoverage()`, `makeGenomicState()`, `mergeResults()`, `plotCluster()`, `plotOverview()` now all have `chrsStyle` as an argument to specify the chromosome naming convention used. Defaults to UCSC.
* `makeGenomicState()` no longer has the `addChrPrefix` argument. It has been replaced by `chrsStyle` to use `GenomeInfoDb` to set the naming style.
* `chrnums` has been renamed to `chrs` in `fullCoverage()` and `mergeResults()`.
* `chrnum` has been renamed to `chr` in `analyzeChr()`.
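
To make the naming-convention change concrete, here is a minimal base-R sketch of what moving to UCSC-style names means for simple chromosome labels. `to_ucsc()` is a hypothetical helper written only for this illustration; derfinder delegates the real mapping to `GenomeInfoDb`, which handles many more cases (for example mitochondrial and unplaced sequences).

```r
# Hypothetical helper (not part of derfinder or GenomeInfoDb): add the
# "chr" prefix used by the UCSC naming style when it is missing.
to_ucsc <- function(chrs) {
    chrs <- as.character(chrs)
    ifelse(startsWith(chrs, "chr"), chrs, paste0("chr", chrs))
}

ucsc_names <- to_ucsc(c("1", "X", "chr2"))
```

Names that already carry the prefix are left untouched, so the helper can be applied repeatedly without changing the result.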
NEW FEATURES
* `loadCoverage()` and `fullCoverage()` can now import data from BigWig files.
SIGNIFICANT USER-VISIBLE CHANGES
* `regionMatrix()` now relies on `getRegionCoverage()` instead of `coverageToExon()`, making it faster and less memory intensive.
NEW FEATURES
* Added `regionMatrix()` for filtering coverage data and using the resulting regions to construct a count matrix. Uses several derfinder functions.
SIGNIFICANT USER-VISIBLE CHANGES
* Made `coverageToExon()` more robust for different names in `fullCov`.
* `filterData()` and `loadCoverage()` have new arguments `filter`, `returnMean`, and `returnCoverage` which allow speeding up `regionMatrix()`. `preprocessCoverage()` was changed accordingly.
* `getRegionCoverage()` now internally uses UCSC names.
BUG FIXES
* `coverageToExon()`.
SIGNIFICANT USER-VISIBLE CHANGES
* Added a `NEWS` file with curated information from the git commit history.
SIGNIFICANT USER-VISIBLE CHANGES
* Added an example for `mergeResults()`. Now all functions have examples.
SIGNIFICANT USER-VISIBLE CHANGES
* `annotateRegions()`, `getRegionCoverage()`, and `coverageToExon()`.
SIGNIFICANT USER-VISIBLE CHANGES
* `loadCoverage()` now allows specifying which strand you want to load. More at https://github.com/lcolladotor/derfinder/issues/16
SIGNIFICANT USER-VISIBLE CHANGES
* `getRegionCoverage()` now has a depth-adjustment argument.
BUG FIXES
SIGNIFICANT USER-VISIBLE CHANGES
* Removed `Rcpp` and `RcppArmadillo` from the F-stats calculation. More at https://github.com/lcolladotor/derfinder/pull/17
* `?genomeDataRaw`, `?genomeFstats`, `?genomeRegions`
BUG FIXES
* `verbose` for `getRegionCoverage()`.
* `plotRegionCoverage()` now matches the latest `getRegionCoverage()` output.
SIGNIFICANT USER-VISIBLE CHANGES
* Updated `getRegionCoverage()` with a new method for subsetting the coverage matrices, allowing for coverage estimates from overlapping regions. Now also uses `mclapply()`.
BUG FIXES
* Updated `NAMESPACE` to match current bioc-devel (2.14) as suggested by Tim Triche.
SIGNIFICANT USER-VISIBLE CHANGES
BUG FIXES
* Fixed `analyzeChr()` to correctly handle the new `lowMemDir` argument.
SIGNIFICANT USER-VISIBLE CHANGES
* Added the `lowMemDir` argument to `preprocessCoverage()`, `calculateStats()`, `calculatePvalues()`, `fstats.apply()`, and `analyzeChr()`. Reduces peak memory usage at the expense of some input-output.
SIGNIFICANT USER-VISIBLE CHANGES
* `mergeResults()` will not merge pre-processed data by default.
* `coverageToExon()` now uses `mclapply()` when possible.
SIGNIFICANT USER-VISIBLE CHANGES
SIGNIFICANT USER-VISIBLE CHANGES
* `preprocessCoverage()` now uses `Reduce()` instead of `.rowMeans()`.
NEW FEATURES
* Added `collapseFullCoverage()`.
SIGNIFICANT USER-VISIBLE CHANGES
* `sampleDepth()` has been greatly changed. It is now based on Hector Corrada's ideas implemented in metagenomeSeq.
BUG FIXES
SIGNIFICANT USER-VISIBLE CHANGES
* Added the `bai` argument to `fullCoverage()`.
* `loadCoverage()` can now work with a pre-defined `BamFileList` object.
SIGNIFICANT USER-VISIBLE CHANGES
* Changed `center` in `sampleDepth()` to `FALSE`.
* Added the `runAnnotation` argument to `analyzeChr()`.
* Added the `bai` argument to `loadCoverage()`.
* Added the `adjustF` argument to all stats functions. Useful for cases when the RSS of the alternative model is very small.
BUG FIXES
* Fixed `plotRegionCoverage()` and `plotCluster()` for unexpected cases.
NEW FEATURES
* Added `sampleDepth()`.
SIGNIFICANT USER-VISIBLE CHANGES
* `generateReport()` has been moved to its own new package called `derfinderReport`. It is available at https://github.com/lcolladotor/derfinderReport
* `analyzeChr()` have been updated now that `sampleDepth()` was added.
SIGNIFICANT USER-VISIBLE CHANGES
* Renamed from `derfinder2` to `derfinder` to comply with Bioconductor guidelines.
SIGNIFICANT USER-VISIBLE CHANGES
* `makeModels()` now deals with cases when `mod` and `mod0` are not full rank.
* `plotCluster()` no longer depends on an active Internet connection for `hg19 = TRUE`.
BUG FIXES
* Fixed bugs in `plotRegionCoverage()`, `plotCluster()` and in `generateReport()`.
BUG FIXES
* Fixed `plotRegionCoverage()` for cases when `annotateRegions(minoverlap=x)` led to no overlaps being found between a region and the annotation.
BUG FIXES
* Fixed `calculatePvalues()` when no null regions or only some were found.
* `colsubset` on `analyzeChr()`.
* Fixed the case where `testvars` in `makeModels()` had unused levels.
* Fixed the case where `qvalue::qvalue()` fails due to incorrect estimation of `pi0`.
SIGNIFICANT USER-VISIBLE CHANGES
* `generateReport()` now weights the mean by the number of samples in each group. Also removed the mean coverage vs area section. `generateReport()` now also has a `nBestClusters` argument.
* `plotCluster()` now uses scales and has a `forceLarge` argument.
BUG FIXES
* `plotCluster()` now includes code that was used for visualizing the bug.
NEW FEATURES
SIGNIFICANT USER-VISIBLE CHANGES
* Renamed `plotRegion()` to `plotCluster()`; it also no longer shows the exons track as it is redundant information.
* `mergeResults()` now also runs `annotateRegions()`.
* `generateReport()` now uses `plotRegionCoverage()` and includes MA-style plots.
SIGNIFICANT USER-VISIBLE CHANGES
* `preprocessCoverage()` now uses the `groupInfo` argument.
* `calculatePvalues()` now calculates log2 fold changes (without scaling or adjusting for library size).
* `generateReport()`
SIGNIFICANT USER-VISIBLE CHANGES
* `getSegmentsRle()` was greatly simplified.
BUG FIXES
* `analyzeChr()`, completed `mergeResults()`
NEW FEATURES
* Added `analyzeChr()` and `mergeResults()`.
SIGNIFICANT USER-VISIBLE CHANGES
* `makeModels()` now uses `testvars` instead of `group` and has new arguments `groupInfo`, `center` and `testIntercept`.
* `calculatePvalues()` now uses area of regions instead of mean to calculate the p-values.
* `preprocessCoverage()` now calculates the mean coverage at each base.
SIGNIFICANT USER-VISIBLE CHANGES
* `plotOverview()` and `plotRegion()`
SIGNIFICANT USER-VISIBLE CHANGES
* `calculatePvalues()` now uses `qvalue::qvalue()` instead of `p.adjust()`.
SIGNIFICANT USER-VISIBLE CHANGES
* `plotRegion()`
BUG FIXES
* `makeModels()` can now handle a vector for the `adjustvars` argument.
SIGNIFICANT USER-VISIBLE CHANGES
* `calculatePvalues()` will now adjust the `p.values` using `p.adjust()`.
* `makeModels()` can now handle a matrix for the `group` argument.
BUG FIXES
* `getSegmentsRle()` will now work properly in the case that no segments are found.
SIGNIFICANT USER-VISIBLE CHANGES
* `calculateStats()` and `calculatePvalues()`
NEW FEATURES
* `preprocessCoverage()` can now automatically select the `chunksize`.
SIGNIFICANT USER-VISIBLE CHANGES
* Removed `fstats()`, which is now part of `fstats.apply()`.
NEW FEATURES
* Added the `method` argument for `getSegmentsRle()`.
BUG FIXES
* `calculatePvalues()`
NEW FEATURES
* Added `makeBamList()`, `makeModels()`, and `preprocessCoverage()`.
* `?genomeData` and `?genomeInfo`
NEW FEATURES
* Added `calculateStats()`, `filterData()`, `fstats()`, and `fstats.apply()`.
SIGNIFICANT USER-VISIBLE CHANGES
* Renamed `makeCoverage()` to `loadCoverage()`.
* `NAMESPACE`
NEW FEATURES
* (`derfinder2`) from `derfinder` (https://github.com/alyssafrazee/derfinder) version 1.0.2. This version is available at https://github.com/alyssafrazee/derfinder/tree/d49f7b28c26f075da36a50ab67c9d192ab2fd63d