Using bioconda, you can install everything you need for this example with:
conda install --channel bioconda pybedtools bedtools htslib matplotlib
Run get-data.sh to download data from ENCODE:
bash get-data.sh
Input files have the following format (UCSC broadPeak and narrowPeak formats, which are variants of BED format):
chr1 569797 570055 . 1000 . 38.118451 16.0 -1
chr1 724125 2647713 . 258 . 1.259053 11.2 -1
chr1 752542 752779 . 658 . 10.178273 1.9 -1
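To make the column layout concrete, here is a minimal sketch of parsing one such line in Python. It assumes the nine leading columns follow the UCSC broadPeak order (chrom, start, end, name, score, strand, signalValue, pValue, qValue); narrowPeak adds a tenth column, which this sketch ignores. The `Peak` type and `parse_peak_line` function are hypothetical names for illustration and are not part of this example's scripts.

```python
from typing import NamedTuple


class Peak(NamedTuple):
    """The nine columns shared by broadPeak and narrowPeak (assumed order)."""
    chrom: str
    start: int
    end: int
    name: str
    score: int
    strand: str
    signal_value: float
    p_value: float
    q_value: float


def parse_peak_line(line: str) -> Peak:
    """Parse one whitespace-delimited peak line; extra columns are ignored."""
    fields = line.split()
    if len(fields) < 9:
        raise ValueError(f"expected at least 9 columns, got {len(fields)}")
    return Peak(
        chrom=fields[0],
        start=int(fields[1]),
        end=int(fields[2]),
        name=fields[3],
        score=int(fields[4]),
        strand=fields[5],
        signal_value=float(fields[6]),
        p_value=float(fields[7]),
        q_value=float(fields[8]),
    )


# First line of the sample shown above.
peak = parse_peak_line("chr1 569797 570055 . 1000 . 38.118451 16.0 -1")
print(peak.chrom, peak.end - peak.start, peak.signal_value)
```

Only the first three columns are needed to decide which intervals overlap; the remaining fields carry per-peak statistics.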
Then run binary_heatmaps.py to generate the plot, a summary file, and a directory containing one interval file per class:
python binary_heatmaps.py
Rows are genomic intervals (as output by bedtools multiinter); columns are
input BED files; black indicates that the factor was found in that genomic
interval.
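As a rough illustration of how such a binary matrix could be derived, the sketch below turns multiinter-style lines into rows of 0/1 flags. It assumes the usual bedtools multiinter layout (chrom, start, end, number of files, list of files, then one 0/1 column per input file). The sample lines are made up, and `binary_rows` is a hypothetical helper, not necessarily how binary_heatmaps.py does it.

```python
def binary_rows(multiinter_lines, n_files):
    """Return [((chrom, start, end), [0/1 per input file]), ...].

    Assumes columns 1-5 are chrom, start, end, count, and file list, followed
    by one presence flag per input file.
    """
    rows = []
    for line in multiinter_lines:
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        flags = [int(x) for x in fields[5:5 + n_files]]
        if len(flags) != n_files:
            raise ValueError(f"expected {n_files} flag columns: {line!r}")
        interval = (fields[0], int(fields[1]), int(fields[2]))
        rows.append((interval, flags))
    return rows


# Hypothetical input: three factors, three made-up intervals.
names = ["GATA1", "LSD1", "TAL1"]
sample = [
    "chr1\t100\t200\t1\tLSD1\t0\t1\t0",
    "chr1\t200\t300\t2\tLSD1,TAL1\t0\t1\t1",
    "chr1\t500\t600\t3\tGATA1,LSD1,TAL1\t1\t1\t1",
]
rows = binary_rows(sample, len(names))
for interval, flags in rows:
    print(interval, flags)
```

The list of flag rows can then be drawn as an image; with matplotlib, for instance, a greyscale colormap such as "Greys" renders 1 as black and 0 as white.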
Summary of the number of genomic intervals in each combinatorial class:
LSD1: 16181
LSD1,TAL1: 15120
TAL1: 7989
GATA1,LSD1,TAL1: 3009
GATA1,LSD1: 654
GATA1: 231
GATA1,TAL1: 214
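A summary like this amounts to counting how often each combination of factors occurs. The sketch below shows one way to do that with `collections.Counter`, assuming each interval is represented by a list of 0/1 flags in the same order as the factor names. The flag rows are made up, and `class_label` and `summarize` are hypothetical helpers.

```python
from collections import Counter


def class_label(flags, names):
    """Join the names of the factors present, e.g. [0, 1, 1] -> 'LSD1,TAL1'."""
    return ",".join(sorted(name for name, flag in zip(names, flags) if flag))


def summarize(flag_rows, names):
    """Count intervals per combinatorial class."""
    return Counter(class_label(flags, names) for flags in flag_rows)


# Hypothetical flags for four intervals (columns: GATA1, LSD1, TAL1).
names = ["GATA1", "LSD1", "TAL1"]
flag_rows = [[0, 1, 0], [0, 1, 1], [0, 1, 1], [1, 1, 1]]
counts = summarize(flag_rows, names)

# Print classes from most to least frequent, in the style of the summary.
for label, n in counts.most_common():
    print(f"{label}: {n}")
```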
For each of the above classes, a BED file of the corresponding intervals is written. For example,
track name="LSD1_and_TAL1"
chr1 778211 778487
chr1 854053 854329
chr1 948500 948776
...
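A file of this shape can be produced with a few lines of Python. The sketch below is an assumption about how the track name relates to the class label (commas replaced by "_and_", matching the example above). `track_name` and `write_class_bed` are hypothetical names, and the output goes to an in-memory buffer rather than to a real file.

```python
import io


def track_name(label):
    """Turn a class label such as 'LSD1,TAL1' into 'LSD1_and_TAL1'."""
    return label.replace(",", "_and_")


def write_class_bed(label, intervals, handle):
    """Write a track line followed by one tab-separated BED3 line per interval."""
    handle.write(f'track name="{track_name(label)}"\n')
    # sorted() orders by chromosome name (as a string), then start, then end.
    for chrom, start, end in sorted(intervals):
        handle.write(f"{chrom}\t{start}\t{end}\n")


buf = io.StringIO()
write_class_bed(
    "LSD1,TAL1",
    [("chr1", 854053, 854329), ("chr1", 778211, 778487)],
    buf,
)
bed_text = buf.getvalue()
print(bed_text, end="")
```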