Tips and Tricks
Here are some helpful tips and tricks to get the most out of blockify.
Transposition data can be sparse, particularly if the transposase is restricted to specific motifs (e.g. piggyBac). Sparse data lead to broader peaks, which can be harder to interpret. Here are some strategies to increase resolution:
Omit the -d/--distance flag, or decrease its value, to minimize merging of significant blocks.
Use -t/--tight, which will pull in peak boundaries so they overlap qBED entries.
For the sharpest intervals, use the -s/--summit flag to return each peak’s maximum.
Finally, increasing the value of --p0 (default: 0.05) can lead to more peaks being called, at the risk of returning more false positives.
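To make the effect of tightening concrete, here is a minimal Python sketch of the idea: a peak's edges are pulled inward until they coincide with the outermost qBED entries it overlaps. This is an illustration only and not blockify's own implementation; the function name `tighten` and all coordinates are hypothetical, and intervals are assumed to be BED-style half-open.

```python
from typing import List, Tuple

# (chrom, start, end), assumed to be BED-style half-open coordinates
Interval = Tuple[str, int, int]


def tighten(peak: Interval, entries: List[Interval]) -> Interval:
    """Shrink `peak` so its edges coincide with the outermost overlapping entries.

    Hypothetical helper written for illustration; it is not part of blockify.
    """
    chrom, start, end = peak
    # Keep only entries on the same chromosome that overlap the peak.
    inside = [e for e in entries if e[0] == chrom and e[1] < end and e[2] > start]
    if not inside:
        return peak  # nothing to tighten against
    # Never extend beyond the original peak boundaries.
    new_start = max(start, min(e[1] for e in inside))
    new_end = min(end, max(e[2] for e in inside))
    return (chrom, new_start, new_end)


# Hypothetical broad peak and insertion sites.
peak = ("chr1", 1000, 5000)
entries = [
    ("chr1", 1800, 1804),
    ("chr1", 2500, 2504),
    ("chr1", 4100, 4104),
    ("chr2", 1200, 1204),  # different chromosome, ignored
]

tight_peak = tighten(peak, entries)
print(tight_peak)  # ('chr1', 1800, 4104)
```

In this toy case the 4,000 bp peak shrinks to the 2,304 bp span that actually contains insertions, which is the kind of sharpening the tightening option is described as providing.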
Although the Tutorial demonstrated first generating a list of blocks for input into blockify call, this step is not strictly necessary. If a regions file is not supplied, blockify will generate one behind the scenes using the default settings in blockify segment. However, this can result in considerable memory usage. Pre-computing the blocks file is one way to minimize memory consumption and improve performance.
Similarly, the regions over which to run blockify call need not be Bayesian blocks. The program can operate on any set of intervals provided in BED format. This flexibility can be useful if there is a set of features that are biologically meaningful to your analysis. For example, this could be a file of promoter regions or accessible loci where a TF might be bound.
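As one way to prepare such a regions file, the sketch below builds promoter windows around transcription start sites and formats them as three-column BED text. The TSS coordinates, the window sizes, and the helper names `promoter_windows` and `to_bed` are all hypothetical; positions are treated as 0-based, and real annotations may need an off-by-one adjustment.

```python
from typing import Iterable, List, Tuple


def promoter_windows(
    tss_sites: Iterable[Tuple[str, int, str]],
    upstream: int = 1000,
    downstream: int = 200,
) -> List[Tuple[str, int, int]]:
    """Return BED-style (chrom, start, end) windows around each TSS.

    Window sizes are illustrative defaults, not a recommendation.
    """
    windows = []
    for chrom, pos, strand in tss_sites:
        if strand == "+":
            start, end = pos - upstream, pos + downstream
        else:
            # On the minus strand, "upstream" lies at higher coordinates.
            start, end = pos - downstream, pos + upstream
        windows.append((chrom, max(0, start), end))  # clamp at chromosome start
    # Sort by chromosome name (lexicographic), then by start position.
    return sorted(windows)


def to_bed(windows: Iterable[Tuple[str, int, int]]) -> str:
    """Format windows as tab-separated BED3 lines."""
    return "".join(f"{chrom}\t{start}\t{end}\n" for chrom, start, end in windows)


# Hypothetical TSS positions: (chrom, position, strand).
tss_sites = [("chr1", 29570, "-"), ("chr1", 11869, "+")]

bed_text = to_bed(promoter_windows(tss_sites))
print(bed_text, end="")
```

Writing `bed_text` to a file gives a set of intervals that could then be supplied to -r/--regions in place of a blocks file.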
Peaks are output in BED6 format with a generic annotation, like peak_1743. The program does not re-calculate post hoc p-values on peaks. If you want to further calculate the significance or normalized density of these peaks, simply re-run blockify call with the --intermediate flag set and supply the peaks file to -r/--regions. Then inspect the intermediate file for these details. Picking up with the BRD4 example from the Tutorial:
> blockify call -i HCT-116_PBase.ccf -r HCT-116_PBase_peaks.bed -bg hg38_TTAA.bed -c 0 -p 1e-30 -d 12500 --intermediate HCT-116_PBase_peaks_annotated.csv > /dev/null
> head -n 2 HCT-116_PBase_peaks_annotated.csv
,chrom,start,end,name,score,strand,Input,Background,Normed_bg,Net_density,pValue,negLog10pValue,rejected
0,chr1,7298597,7304456,peak_0,1,.,130.0,32,2.533130940448789,0.021755738020063357,3.74243334103237e-169,168.42684592642368,True
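Because the intermediate file is a plain CSV, it can be inspected with a few lines of Python. The sketch below uses only the standard library and the column names from the example output; the helper name `load_peaks` is hypothetical, and reading the `rejected` column as a boolean is an assumption based on the `True` value in the sample row. The two example lines are embedded so the snippet runs on its own.

```python
import csv
import io

# Header and first record from the example intermediate file, embedded for a
# self-contained demonstration.
sample = (
    ",chrom,start,end,name,score,strand,Input,Background,Normed_bg,"
    "Net_density,pValue,negLog10pValue,rejected\n"
    "0,chr1,7298597,7304456,peak_0,1,.,130.0,32,2.533130940448789,"
    "0.021755738020063357,3.74243334103237e-169,168.42684592642368,True\n"
)


def load_peaks(handle):
    """Parse an intermediate CSV into a list of dicts with typed fields."""
    peaks = []
    for row in csv.DictReader(handle):
        peaks.append(
            {
                "name": row["name"],
                "chrom": row["chrom"],
                "start": int(row["start"]),
                "end": int(row["end"]),
                "net_density": float(row["Net_density"]),
                "neg_log10_p": float(row["negLog10pValue"]),
                # Assumption: the column holds the strings "True"/"False".
                "rejected": row["rejected"] == "True",
            }
        )
    return peaks


peaks = load_peaks(io.StringIO(sample))

# Rank peaks from most to least significant.
ranked = sorted(peaks, key=lambda p: p["neg_log10_p"], reverse=True)
for p in ranked:
    print(p["name"], p["end"] - p["start"], round(p["neg_log10_p"], 1))
# peak_0 5859 168.4
```

To run this on the real file, replace the `io.StringIO(sample)` argument with a handle from `open("HCT-116_PBase_peaks_annotated.csv", newline="")`.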