Ilya Flyamer Phlya

  • FMI
  • Basel, Switzerland
  • X @phlya
@Phlya
Phlya / MCM_distiller_project.yml
Last active February 21, 2022 15:03
distiller-nf project file used to analyse the bulk Hi-C and micro-C data for the MCM-AID HCT116 cell line
# Fastqs can be provided as:
# -- a pair of relative/absolute paths
# -- sra:<SRA_NUMBER>, optionally followed by the indices of the first and
# the last entry in the SRA in the form of "?start=<first>&end=<last>"
# [to implement] -- as a path to a folder with fastqs '<base_folder>', with the structure
# <base_folder>/<library_name>/<run_name>/, with each folder containing only
# two fastq.gz files
input:
    raw_reads_paths:
        # Hi-C
# Fastqs can be provided as:
# -- a pair of relative/absolute paths
# -- sra:<SRA_NUMBER>, optionally followed by the indices of the first and
# the last entry in the SRA in the form of "?start=<first>&end=<last>"
# [to implement] -- as a path to a folder with fastqs '<base_folder>', with the structure
# <base_folder>/<library_name>/<run_name>/, with each folder containing only
# two fastq.gz files
input:
    raw_reads_paths:
        Schmitt2016_STL002_Adrenal-1:
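The header comments above describe two supported input forms: a pair of paths, or an `sra:` reference with an optional `?start=<first>&end=<last>` range. As a rough illustration of how such entries could be told apart, here is a minimal sketch. The helper `parse_fastq_source` and the accession `SRR0000000` are hypothetical and are not part of distiller-nf, which does its own parsing.

```python
from urllib.parse import parse_qs


def parse_fastq_source(spec):
    """Classify one input entry as an SRA reference or a pair of FASTQ paths.

    Hypothetical helper for illustration only; not distiller-nf's parser.
    """
    if isinstance(spec, str) and spec.startswith("sra:"):
        # Split "sra:<SRA_NUMBER>?start=..&end=.." into accession and query
        accession, _, query = spec[len("sra:"):].partition("?")
        params = parse_qs(query)
        # start/end are optional; None means no range was given
        start = int(params["start"][0]) if "start" in params else None
        end = int(params["end"][0]) if "end" in params else None
        return {"kind": "sra", "accession": accession, "start": start, "end": end}
    # Otherwise expect exactly two paths, one per read side
    side1, side2 = spec
    return {"kind": "paths", "side1": side1, "side2": side2}


# Placeholder accession and file names, not taken from the project file
print(parse_fastq_source("sra:SRR0000000?start=0&end=1000000"))
print(parse_fastq_source(["lib_1.fastq.gz", "lib_2.fastq.gz"]))
```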
@Phlya
Phlya / scaling_from_expected.py
Last active July 18, 2019 09:38
Get scaling plot and data from expected files
#!/usr/bin/env python3
# -*- coding: utf-8 -*-
"""
Created on Thu Jul 18 10:07:49 2019
@author: s1529682
"""
from cooltools.lib import numutils
import numpy as np
@Phlya
Phlya / Hi-C_z-scores.ipynb
Last active July 12, 2019 11:07
Hi-C z-scores
@Phlya
Phlya / compare_hic_region.py
Last active June 26, 2019 16:30
Compare regions in two Hi-C maps
def fetch_random_means(clr1, exp1, clr2, exp2, region_left, region_right=None, n=1000):
    assert clr1.binsize==clr2.binsize
    if region_right is None:
        region_right = region_left
    chrom_left, start_left, end_left = cooler.util.parse_region_string(region_left)
    chrom_right, start_right, end_right = cooler.util.parse_region_string(region_right)
    assert chrom_left == chrom_right
    bin1_left = start_left//clr1.binsize
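The preview above parses region strings and converts genomic coordinates to bin offsets by integer division with the bin size. A standalone sketch of that conversion follows. It uses a simplified stand-in parser rather than `cooler.util.parse_region_string`, and the names `parse_region` and `region_to_bins` are hypothetical. The returned indices are offsets within the chromosome, not genome-wide bin IDs.

```python
def parse_region(region):
    """Parse 'chrom:start-end' (commas allowed) into (chrom, start, end).

    Simplified stand-in for illustration; it does not handle every
    format that a full region parser accepts.
    """
    chrom, _, span = region.partition(":")
    start_str, _, end_str = span.replace(",", "").partition("-")
    return chrom, int(start_str), int(end_str)


def region_to_bins(region, binsize):
    """Return (chrom, first_bin, last_bin_exclusive) within the chromosome."""
    chrom, start, end = parse_region(region)
    first_bin = start // binsize
    # Ceiling division so a partially covered last bin is included
    last_bin = -(-end // binsize)
    return chrom, first_bin, last_bin


print(region_to_bins("chr6:80,200,000-84,200,000", 10_000))
```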
def pcolormesh_45deg(matrix_c, ax, start=0, resolution=1, *args, **kwargs):
    start_pos_vector = [start+resolution*i for i in range(len(matrix_c)+1)]
    import itertools
    n = matrix_c.shape[0]
    t = np.array([[1, 0.5], [-1, 0.5]])
    matrix_a = np.dot(np.array([(i[1], i[0])
                                for i in itertools.product(start_pos_vector[::-1],
                                                           start_pos_vector)]), t)
    x = matrix_a[:, 1].reshape(n + 1, n + 1)
    y = matrix_a[:, 0].reshape(n + 1, n + 1)
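The matrix `t` in the preview above maps each pair of bin edges `(b, a)` to `y = b - a` and `x = (a + b) / 2`, which places the main diagonal of the heatmap on the horizontal axis. The same coordinate grids can be written out without NumPy to make the geometry explicit. This is a minimal sketch; `rotated_grid` is a hypothetical name, and it only covers the coordinate step shown in the preview, not the plotting call.

```python
def rotated_grid(n, start=0, resolution=1):
    """Build x and y corner grids for an n x n matrix rotated by 45 degrees.

    Rows iterate over the bin edges in reverse order (a), columns over the
    edges in forward order (b), mirroring the itertools.product call in
    the preview.
    """
    edges = [start + resolution * i for i in range(n + 1)]
    # x is the midpoint of the two edges, y is their separation
    x = [[0.5 * (b + a) for b in edges] for a in reversed(edges)]
    y = [[b - a for b in edges] for a in reversed(edges)]
    return x, y


x, y = rotated_grid(2)
for row_x, row_y in zip(x, y):
    print(row_x, row_y)
```

Corners where both edges coincide have `y == 0`, so they lie along the genomic axis, while the corner pairing the first and last edge has the largest separation and forms the apex of the triangle.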
### do required imports
### make a dict with coolers, like this {'name':cooler.Cooler('path'), ...}
### make a dict with coordinates, something like this: {'300kb_1': ('chr6', 80200000, 84200000), ...}
# Here is the key function to plot heatmaps like triangles
def pcolormesh_45deg(matrix_c, ax, start=0, resolution=1, *args, **kwargs):
    start_pos_vector = [start+resolution*i for i in range(len(matrix_c)+1)]
    import itertools
    n = matrix_c.shape[0]
@Phlya
Phlya / dedup_dots
Last active November 11, 2018 17:16
def dedup(dots, hiccups_filter=True):
    newdots = []
    ress = list(sorted(set(dots['res'])))
    for chrom in sorted(set(dots['chrom1'])):
        chromdots = dots[dots['chrom1']==chrom].sort_values(['start1', 'start2']).reset_index(drop=True)
        for res in ress:
            chromdots['Supported_%s' % res] = (chromdots['res']==res)
        tree = spatial.cKDTree(chromdots[['start1', 'start2']]) #Not sure what's the best coordinate to use, of the strongest pixel, centroid, or the center of the dot?
        drop = []
        for i, j in tree.query_pairs(r=50000):
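The preview above uses `cKDTree.query_pairs(r=50000)` to find pairs of dots whose `(start1, start2)` coordinates lie within 50 kb of each other, which are the candidates for deduplication. To show what that query returns, here is a brute-force stand-in written with the standard library only. It is an O(n²) substitute for the k-d tree, not the method used in the gist, and `close_pairs` and the example coordinates are hypothetical.

```python
from itertools import combinations
from math import hypot


def close_pairs(points, r):
    """Return the set of index pairs (i, j), i < j, at Euclidean distance <= r.

    Brute-force stand-in for a k-d tree pair query; fine for small inputs.
    """
    return {
        (i, j)
        for (i, p), (j, q) in combinations(enumerate(points), 2)
        if hypot(p[0] - q[0], p[1] - q[1]) <= r
    }


# Made-up (start1, start2) coordinates for three dots
dots = [(1_000_000, 1_500_000), (1_030_000, 1_530_000), (5_000_000, 6_000_000)]
print(close_pairs(dots, r=50_000))
```

What happens to each pair after it is found (which dot is kept, how resolutions are merged) is not visible in the truncated preview and is not reproduced here.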
@Phlya
Phlya / plot_distiller_stats.py
Last active January 4, 2022 16:30
Saves a few useful plots to visually check quality of Hi-C libraries based on *.stats files from distiller (pairtools stats output)
#!/usr/bin/env python3
# -*- coding: utf-8 -*-
"""
Created on Tue Sep 4 11:54:20 2018
@author: Ilya Flyamer
"""
import pandas as pd
import numpy as np
@Phlya
Phlya / cluster.config
Last active May 8, 2020 10:51
SGE cluster config for distiller
process{
    // SGE node config
    executor = 'sge'
    penv = 'sharedmem'
    clusterOptions = '-l h_vmem=4G -V'
    time='12h'
    cpus = 1
    maxRetries = 2
    distillerTmpDir='/exports/eddie/scratch/ifliamer'