Felipe Albrecht felipealbrecht

library(DeepBlueR)
library(gplots)
library(RColorBrewer)
library(matrixStats)
library(stringr)
#crest_data <- deepblue_list_experiments(project = "CREST", genome="grch38", epigenetic_mark = "DNA Methylation")
#crest_data <- crest_data[grep("_call", deepblue_extract_names(crest_data)),]
felipealbrecht / extract_annotations.py
Created April 27, 2017 12:38
Extracting the DeepBlue annotations - useful for backup
import os
import xmlrpclib
import time
import StringIO
import gzip
url = "http://deepblue.mpi-inf.mpg.de/xmlrpc"
genome = "grch38"
user_key = "anonymous_key"
felipealbrecht / 01-14-count-motif.py
Last active June 17, 2018 09:02
Count how many times a motif appears in the region sequence
#Our objective is to extract the average DNA methylation (beta value) in all regulatory elements defined in the BLUEPRINT regulatory build (modified from~\cite{Zerbino2015}) across all BLUEPRINT samples. After downloading these data from DeepBlue we use the package gplots to create a heatmap showing the most variable regions (rows) across samples (columns). Moreover, we cluster samples by their pairwise Spearman correlation coefficients (Figure~\ref{Fig:1}).
#In Listing 1 we demonstrate how DeepBlueR can be used to quickly and efficiently aggregate a large data set on the DeepBlue server in preparation for downstream analysis in R. We omit the R code for generating the heatmap here. It can be found in the supplemental material. To retrieve the required data, we first select all bisulfite sequencing experiments from the BLUEPRINT project (lines $5$-$9$). Our selection comprises $206$ data files and $47$ distinct cell types after filtering for the correct file type (lines $9$-$11$). For each file, we select th
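The gist description above (counting how often a motif appears in a region sequence) can be illustrated locally with a minimal sketch. This is plain string matching on a sequence that has already been downloaded, not the DeepBlue server-side method; the function name `count_motif_occurrences` and the choice to count overlapping matches case-insensitively are assumptions made for illustration.

```python
def count_motif_occurrences(sequence, motif):
    """Count how many times motif occurs in sequence, including overlaps.

    Hypothetical helper: matching is case-insensitive and purely textual.
    """
    if not motif:
        raise ValueError("motif must not be empty")
    sequence = sequence.upper()
    motif = motif.upper()
    count = 0
    start = sequence.find(motif)
    while start != -1:
        count += 1
        # Advance by one position so overlapping matches are also counted.
        start = sequence.find(motif, start + 1)
    return count
```

Counting overlapping matches is a design choice; advancing by `len(motif)` instead of 1 would count only non-overlapping occurrences.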
felipealbrecht / blueprint_gene_exprs_fpkm.r
Last active October 29, 2016 06:52 — forked from mlist/blueprint_gene_exprs_fpkm.r
Download all BLUEPRINT gene expression data and format as numeric matrix
# Load dependencies
# install DeepBlueR from bioconductor
# http://bioconductor.org/packages/release/bioc/html/DeepBlueR.html
library(DeepBlueR)
library(dplyr)
library(tidyr)
# List all BLUEPRINT samples
blueprint_samples <- deepblue_list_samples(
    extra_metadata = list("source" = "BLUEPRINT Epigenome"))
felipealbrecht / example_signal_regions.py
Created July 22, 2016 14:32
Aggregate gene regions
import xmlrpclib
import time
url = "http://deepblue.mpi-inf.mpg.de/xmlrpc"
user_key = "anonymous_key"
server = xmlrpclib.Server(url, allow_none=True)
GENOME = "hg19"
felipealbrecht / deepblue-changelog.md
Last active May 4, 2016 22:06
DeepBlue Change Log

DeepBlue Epigenomic Data Server change log

Version 1.7.0 - (04.04.2016)

  • Stable and published versions.

Version 1.7.3 - (19.04.2016)

  • Command select_genes supports filtering genes by chromosome location. This version breaks old code that uses select_genes. To fix it, include Null, Null, and Null after the gene_set parameter. Examples and code were updated.
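The migration described for version 1.7.3 can be sketched in Python, where `None` is sent as Null when the XML-RPC client is created with `allow_none=True`. The helper name `add_location_placeholders`, the split of the old arguments into two lists, and the placeholder values in the usage comment are all hypothetical; the changelog does not give the full select_genes signature, so the position of gene_set is left to the caller.

```python
def add_location_placeholders(args_through_gene_set, remaining_args):
    """Return a new argument list with three None values after gene_set.

    Hypothetical helper:
    args_through_gene_set -- the old positional arguments up to and
                             including the gene_set parameter
    remaining_args        -- the old positional arguments that followed it
    """
    # The three None values stand for the new location filters, which the
    # changelog says should be passed as Null to keep the old behaviour.
    return list(args_through_gene_set) + [None, None, None] + list(remaining_args)


# Usage sketch (placeholder values, not real gene or gene set names):
# new_args = add_location_placeholders([["GENE_A"], "some_gene_set"],
#                                      ["anonymous_key"])
# status, query_id = server.select_genes(*new_args)
```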

Version 1.7.5 - (27.04.2016)

import random
def mp(Matrix):
    for m in Matrix:
        print m
def fill(Matrix, pos, start_x, end_x, start_y, end_y):
    if (start_x == end_x) or (start_y == end_y):
        return
felipealbrecht / deepblue-listing-experiments.py
Last active June 17, 2018 09:41
List all peak experiments from the biosources "inflammatory macrophage" and "macrophage" from the Blueprint project
import xmlrpclib
url = "http://deepblue.mpi-inf.mpg.de/xmlrpc"
user_key = "anonymous_key"
server = xmlrpclib.Server(url, allow_none=True)
# List all peak experiments from the biosources "inflammatory macrophage" and "macrophage" from the Blueprint project
(status, experiments) = server.list_experiments(None, "peaks", "H3K4me3", ["inflammatory macrophage", "macrophage"], None, None, "BLUEPRINT Epigenome", user_key)
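The call above returns a `(status, experiments)` pair, so a caller will usually want to check the status before using the result. The following is a minimal Python 3 sketch of that pattern: `xmlrpc.client` is the Python 3 name of the `xmlrpclib` module used in the gist, the helper name `list_macrophage_peaks` is hypothetical, and treating the string "okay" as the success status is an assumption about the server's response format.

```python
import xmlrpc.client  # Python 3 equivalent of the gist's xmlrpclib

URL = "http://deepblue.mpi-inf.mpg.de/xmlrpc"
USER_KEY = "anonymous_key"


def list_macrophage_peaks(server, user_key=USER_KEY):
    """Hypothetical helper: run the gist's query and validate the status."""
    status, experiments = server.list_experiments(
        None, "peaks", "H3K4me3",
        ["inflammatory macrophage", "macrophage"],
        None, None, "BLUEPRINT Epigenome", user_key)
    # Assumption: the server reports success with the status "okay";
    # on failure the second element carries the error message.
    if status != "okay":
        raise RuntimeError("list_experiments failed: %s" % (experiments,))
    return experiments


# Usage sketch (performs a network request, so it is left commented out):
# server = xmlrpc.client.ServerProxy(URL, allow_none=True)
# for experiment in list_macrophage_peaks(server):
#     print(experiment)
```

Passing the server object in as a parameter keeps the helper testable without network access.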
felipealbrecht / find_first_ocorrece_where_value_is_higher.cpp
Last active December 12, 2015 18:55
Binary search for the first occurrence where the value is higher (find_first_ocorrece_where_value_is_higher)
#include <iostream>
#include <vector>
typedef std::vector<int> Array;
int find_first_ocorrece_where_value_is_higher(Array& a, int value);
int main() {
Array v = {0, 10, 20, 30, 40, 50};
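The preview stops before the search function itself, so here is a minimal Python sketch of the idea named in the description: a binary search for the first position in a sorted list whose element is higher than a given value. It is not the author's C++ implementation; the name `first_index_where_higher`, the reading of "higher" as strictly greater, and returning -1 when no such element exists are assumptions.

```python
def first_index_where_higher(values, value):
    """Return the index of the first element > value in sorted values, or -1."""
    low, high = 0, len(values)
    # Invariant: every element before low is <= value,
    # and every element at or after high is > value.
    while low < high:
        mid = (low + high) // 2
        if values[mid] > value:
            high = mid       # mid is a candidate; keep searching to the left
        else:
            low = mid + 1    # mid is too small; the answer is to the right
    return low if low < len(values) else -1
```

For the array from the gist, `first_index_where_higher([0, 10, 20, 30, 40, 50], 25)` returns 3, the index of 30. The standard library function `bisect.bisect_right` computes the same insertion point.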
felipealbrecht / frog.cpp
Created December 12, 2015 17:43
Jumping frog
// The frog starts at position -1, the river starts at position 0, and the frog must reach the position equal to the river size (path) or beyond
// Compile with c++11, e.g. in clang++: clang++ frog.cpp -std=c++11
#include <vector>
#include <iostream>
#include <tuple>
typedef std::vector<int> Path;
typedef std::vector<std::tuple<int, int> > Visited;