In R:
png("heatmap_before_sorting.png")
image(distance_matrix)
dev.off()
library("gclus")
ordered <- order.single(distance_matrix, clusters=NULL)
sorted_matrix <- distance_matrix[ordered, ordered]
In Python:
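The Python half of this snippet is cut off in this listing. As a minimal sketch of only the final step (applying one ordering to both rows and columns, as `distance_matrix[ordered, ordered]` does in R), and not the author's code: the function name `reorder_square`, the sample matrix, and the ordering are made up for illustration, and computing the ordering itself (the job of `order.single`) is not attempted here.

```python
def reorder_square(matrix, order):
    """Apply the same permutation to the rows and columns of a square matrix."""
    return [[matrix[i][j] for j in order] for i in order]


# Hypothetical 3x3 symmetric distance matrix (illustration only).
distance_matrix = [
    [0.0, 5.0, 1.0],
    [5.0, 0.0, 4.0],
    [1.0, 4.0, 0.0],
]

# Hypothetical ordering; note that Python indices are 0-based, unlike R.
ordered = [0, 2, 1]

sorted_matrix = reorder_square(distance_matrix, ordered)
```

A symmetric input stays symmetric after this step, because rows and columns are permuted identically.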
urls={"1":"ftp://ftp-trace.ncbi.nih.gov/1000genomes/ftp/release/20130502/ALL.chr1.phase3_shapeit2_mvncall_integrated_v5a.20130502.genotypes.vcf.gz",
      "10":"ftp://ftp-trace.ncbi.nih.gov/1000genomes/ftp/release/20130502/ALL.chr10.phase3_shapeit2_mvncall_integrated_v5a.20130502.genotypes.vcf.gz",
      "11":"ftp://ftp-trace.ncbi.nih.gov/1000genomes/ftp/release/20130502/ALL.chr11.phase3_shapeit2_mvncall_integrated_v5a.20130502.genotypes.vcf.gz",
      "12":"ftp://ftp-trace.ncbi.nih.gov/1000genomes/ftp/release/20130502/ALL.chr12.phase3_shapeit2_mvncall_integrated_v5a.20130502.genotypes.vcf.gz",
      "13":"ftp://ftp-trace.ncbi.nih.gov/1000genomes/ftp/release/20130502/ALL.chr13.phase3_shapeit2_mvncall_integrated_v5a.20130502.genotypes.vcf.gz",
      "14":"ftp://ftp-trace.ncbi.nih.gov/1000genomes/ftp/release/20130502/ALL.chr14.phase3_shapeit2_mvncall_integrated_v5a.20130502.genotypes.vcf.gz",
      "15":"ftp://ftp-trace.ncbi.nih.gov/1000genomes/ftp/release/20130502/ALL.chr15.phase3_shapeit2_mvncall_integrated_v5a.20130502.genotypes.vcf.gz",
      "16":
$ python2.7 parse.py
0.797344923019
$ julia parse.jl
16789.66212272644
o.O
Tested against a 91 MB tab-delimited input file.
With an 800 MB file:
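The scripts behind these timings (`parse.py`, `parse.jl`) are not shown in this listing. As a rough sketch of what such a measurement might look like, assuming the task is simply "split every line on tabs" and the best of several runs is reported: the function name `best_split_time` and the throwaway demo file are hypothetical, and `time.perf_counter` requires Python 3, whereas the run above used Python 2.7.

```python
import os
import tempfile
import time


def best_split_time(path, repeats=5):
    """Return the fastest wall-clock time (in seconds) to split every line on tabs."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        with open(path) as handle:
            for line in handle:
                line.split("\t")
        best = min(best, time.perf_counter() - start)
    return best


# Demo on a tiny throwaway file; the real input files are not reproduced here.
with tempfile.NamedTemporaryFile("w", suffix=".tsv", delete=False) as tmp:
    tmp.write("a\tb\tc\n" * 1000)

elapsed = best_split_time(tmp.name)
os.remove(tmp.name)
print(elapsed)
```

Taking the minimum over several runs reduces noise from disk caching and other processes; when comparing across languages, it is worth checking that both scripts report the same unit.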
min = 1000000000
5.times do
  t = Time.now
  File.open("./backParsedTapidorContigs.csv").each do |r|
    r.split("\t")
  end
  finish = (Time.now - t)
  if finish < min
    min = finish
  end
-------------------------------------------------------
C++:
#include <iostream>
#include <string>
#include <fstream>
#include <boost/tokenizer.hpp>
#include <vector>
#include <iterator>
#include <sys/time.h>
return_dict = {}
for line in file_handle:
    line_list = line.split()
    if line_list[0] in return_dict.keys():
        return_dict[line_list[0]].append(line_list)
    else:
        return_dict[line_list[0]] = [line_list]
from collections import defaultdict
d = defaultdict(list)
for line in file_handle:
    line_list = line.split()  # call split(); without the parentheses this is the bound method
    d[line_list[0]].append(line_list)
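The plain-dict loop and the `defaultdict` loop are meant to build the same mapping from the first field of each line to the list of split lines. A small self-contained check of that, using an in-memory file: the function names `group_plain` and `group_defaultdict` and the sample rows are made up for illustration.

```python
import io
from collections import defaultdict


def group_plain(file_handle):
    """Group split lines by their first field using an ordinary dict."""
    groups = {}
    for line in file_handle:
        fields = line.split()
        if fields[0] in groups:
            groups[fields[0]].append(fields)
        else:
            groups[fields[0]] = [fields]
    return groups


def group_defaultdict(file_handle):
    """Same grouping; defaultdict creates the empty list on first access."""
    groups = defaultdict(list)
    for line in file_handle:
        fields = line.split()
        groups[fields[0]].append(fields)
    return groups


# Hypothetical sample input (whitespace-separated, no blank lines).
sample = "gene_A 1 x\ngene_B 2 y\ngene_A 3 z\n"
plain = group_plain(io.StringIO(sample))
fast = group_defaultdict(io.StringIO(sample))
```

Both sketches assume every line has at least one field; a blank line would raise an `IndexError` on `fields[0]`.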
# File format is
# Gene    Chr  Pos  Cov
# gene_A  A01  40   0
# gene_A  A01  41   2
# ...
# gene_D  A01  508  41
# gene_D  A01  509  42
# ...
geneCov <- read.table("./test_genes.txt", sep="\t", header=F)
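The rest of the R script is cut off in this listing. For readers working in Python, a minimal sketch of reading the same four-column, tab-delimited, header-less format: the function name `coverage_by_gene` and the choice to parse `Pos` and `Cov` as integers are assumptions, and the sample rows are copied from the format comment rather than from the real file.

```python
import csv
import io
from collections import defaultdict


def coverage_by_gene(handle):
    """Map each gene to a list of (chromosome, position, coverage) tuples."""
    per_gene = defaultdict(list)
    for gene, chrom, pos, cov in csv.reader(handle, delimiter="\t"):
        # Assumption: positions and coverage values are whole numbers.
        per_gene[gene].append((chrom, int(pos), int(cov)))
    return per_gene


# Rows taken from the format comment; the real input file is not reproduced here.
sample = (
    "gene_A\tA01\t40\t0\n"
    "gene_A\tA01\t41\t2\n"
    "gene_D\tA01\t508\t41\n"
    "gene_D\tA01\t509\t42\n"
)
cov = coverage_by_gene(io.StringIO(sample))
```

With a real file, `open("./test_genes.txt", newline="")` could be passed in place of the `io.StringIO` object.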
<!DOCTYPE html>
<meta charset="utf-8">
<style>
.node {
  stroke: #fff;
  stroke-width: 1.5px;
}
.link {
<!DOCTYPE html>
<meta charset="utf-8">
<style>
.node {
  stroke: #fff;
  stroke-width: 1.5px;
}
.link {