ML Engineer @ Epidemic Sound

Daniel Klevebring dakl

  • Epidemic Sound
  • Stockholm
@dakl
dakl / tensorboard_logging.py
Created December 20, 2017 14:16 — forked from gyglim/tensorboard_logging.py
Logging to TensorBoard with manually generated summaries (not relying on summary ops)
"""Simple example of how to log scalars and images to TensorBoard without tensor ops.
License: Copyleft
"""
__author__ = "Michael Gygli"
import tensorflow as tf
from StringIO import StringIO
import matplotlib.pyplot as plt
import numpy as np
#!/usr/bin/env bash
sudo apt install -y bzip2 git htop
######################################################
# FISH SHELL
sudo su -
echo 'deb http://download.opensuse.org/repositories/shells:/fish:/release:/2/Debian_9.0/ /' > /etc/apt/sources.list.d/fish.list
apt-get update -y
11540659 + 0 in total (QC-passed reads + QC-failed reads)
6365 + 0 secondary
0 + 0 supplementary
0 + 0 duplicates
11540643 + 0 mapped (100.00%:-nan%)
11534294 + 0 paired in sequencing
5767147 + 0 read1
5767147 + 0 read2
11518384 + 0 properly paired (99.86%:-nan%)
11534266 + 0 with itself and mate mapped
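The flagstat-style summary above follows a regular "passed + failed description" layout, so it can be read programmatically. The following is a minimal sketch, assuming that layout holds for every line; the function name `parse_flagstat` and the regular expression are illustrative and not part of the original gist.

```python
import re

# Each line looks like "<QC-passed> + <QC-failed> <description> (optional extras)".
LINE_RE = re.compile(r"^(\d+) \+ (\d+) (.+?)(?: \(.*\))?$")


def parse_flagstat(text):
    """Return {description: (qc_passed, qc_failed)} for flagstat-style lines."""
    counts = {}
    for line in text.splitlines():
        match = LINE_RE.match(line.strip())
        if match:  # silently skip anything that does not fit the layout
            passed, failed, description = match.groups()
            counts[description] = (int(passed), int(failed))
    return counts


# Three lines copied from the summary above.
example = """\
11540659 + 0 in total (QC-passed reads + QC-failed reads)
11534294 + 0 paired in sequencing
11518384 + 0 properly paired (99.86%:-nan%)
"""

stats = parse_flagstat(example)
properly_paired_pct = round(
    100 * stats["properly paired"][0] / stats["paired in sequencing"][0], 2
)
```

Recomputing the percentage from the raw counts is a quick sanity check against the value printed in the summary.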
Sample HS_LIBRARY_SIZE HS_PENALTY_50X PCT_TARGET_BASES_10X OFF_BAIT_BASES ON_BAIT_BASES PCT_OFF_BAIT ON_BAIT_VS_SELECTED HS_PENALTY_30X PCT_TARGET_BASES_30X READ_GROUP
HS_PENALTY_40X TOTAL_READS ON_TARGET_BASES MEAN_TARGET_COVERAGE ZERO_CVG_TARGETS_PCT PCT_PF_UQ_READS BAIT_TERRITORY PCT_USABLE_BASES_ON_TARGET PCT_PF_READS LIBRARY PF_READS PCT_TARGET_BASES_2X GC_DROPOUT AT_DROPOUT PCT_TARGET_BASES_50X GENOME_SIZE PCT_PF_UQ_READS_ALIGNED FOLD_80_BASE_PENALTY HS_PENALTY_10X PCT_USABLE_BASES_ON_BAIT HS_PENALTY_20X PF_UNIQUE_READS PCT_TARGET_BASES_20X SAMPLE NEAR_BAIT_BASES TARGET_TERRITORY FOLD_ENRICHMENT PCT_TARGET_BASES_40X PCT_TARGET_BASES_100X BAIT_DESIGN_EFFICIENCY BAIT_SET HS_PENALTY_100X PCT_SELECTED_BASES MEAN_BAIT_COVERAGE PF_UQ_BASES_ALIGNED PF_UQ_READS_ALIGNED
P-00017852-N-03021179-TD2-CB1 0.0 0.0 899777371.0 524812.0 0.997626 0.245107 0.0 0.0
0.0 9609180.0 524812.0 0.899072 0.836774 1.0 1689097.0 0.000543 1.0 9609180.0 0.045124 2.779323 2.431879 0.0 3137454505.0 0.938492 ? 0.0 0.000543 0.0 960918
COMMAND LINE: skewer -z -t 2 --quiet -o trimmed test_1.fastq.gz test_2.fastq.gz
Input file: test_1.fastq.gz
Paired file: test_2.fastq.gz
trimmed: trimmed-pair1.fastq.gz, trimmed-pair2.fastq.gz
Parameters used:
-- 3' end adapter sequence (-x): AGATCGGAAGAGCACACGTCTGAACTCCAGTCAC
-- paired 3' end adapter sequence (-y): AGATCGGAAGAGCGTCGTGTAGGGAAAGAGTGTA
-- maximum error ratio allowed (-r): 0.100
-- maximum indel error ratio allowed (-d): 0.030
@dakl
dakl / oxog.txt
Last active February 10, 2016 13:26
## htsjdk.samtools.metrics.StringHeader
# picard.analysis.CollectOxoGMetrics INPUT=virtual-tumor.bam OUTPUT=oxog.txt REFERENCE_SEQUENCE=/proj/b2010040/private/nobackup/autoseqer-genome/genome/human_g1k_v37_decoy.fasta MINIMUM_QUALITY_SCORE=20 MINIMUM_MAPPING_QUALITY=30 MINIMUM_INSERT_SIZE=60 MAXIMUM_INSERT_SIZE=600 USE_OQ=true CONTEXT_SIZE=1 STOP_AFTER=2147483647 VERBOSITY=INFO QUIET=false VALIDATION_STRINGENCY=STRICT COMPRESSION_LEVEL=5 MAX_RECORDS_IN_RAM=500000 CREATE_INDEX=false CREATE_MD5_FILE=false
## htsjdk.samtools.metrics.StringHeader
# Started on: Wed Feb 10 11:23:36 UTC 2016
## METRICS CLASS picard.analysis.CollectOxoGMetrics$CpcgMetrics
SAMPLE_ALIAS LIBRARY CONTEXT TOTAL_SITES TOTAL_BASES REF_NONOXO_BASES REF_OXO_BASES REF_TOTAL_BASES ALT_NONOXO_BASES ALT_OXO_BASES OXIDATION_ERROR_RATE OXIDATION_Q C_REF_REF_BASES G_REF_REF_BASES C_REF_ALT_BASES G_REF_ALT_BASES C_REF_OXO_ERROR_RATE C_REF_OXO_Q G_REF_OXO_ERROR_RATE G_REF_OXO_Q
virtual-tumor virtual-tumor ACT 27 388 238 150 388 0 0 0.002577 25.888
$ vagrant up
Bringing machine 'stackstorm-master' up with 'virtualbox' provider...
Bringing machine 'testarteria1' up with 'virtualbox' provider...
Bringing machine 'testuppmax' up with 'virtualbox' provider...
==> stackstorm-master: Importing base box 'ubuntu/trusty64'...
==> stackstorm-master: Matching MAC address for NAT networking...
==> stackstorm-master: Checking if box 'ubuntu/trusty64' is up to date...
==> stackstorm-master: Setting the name of the VM: arteria-provisioning_stackstorm-master_1445365324370_61846
==> stackstorm-master: Clearing any previously set forwarded ports...
==> stackstorm-master: Fixed port collision for 22 => 2222. Now on port 2201.
#Define the list of machines
machines = {
:reportcreatormahcine => {
:hostname => "reportcreatormahcine",
:ipaddress => "10.10.10.99" #gretzky
},
}
#--------------------------------------
# General provisioning inline script
@dakl
dakl / tcga-vcfs-to-pancan19.sh
Last active November 23, 2015 14:30
Merge VCFs from 19 TCGA projects to a pancan19 VCF
#!/bin/bash
REF=/proj/b2010040/private/nobackup/autoseqer-genome/genome/human_g1k_v37_decoy.fasta
DICT=/proj/b2010040/private/nobackup/autoseqer-genome/genome/human_g1k_v37_decoy.dict
GATKJAR=/home/daniel.klevebring/projects/tcga-maf-to-vcf/GenomeAnalysisTK.jar
WORKDIR=$HOME/Crisp/dakl/tcga-maf-to-vcf
mkdir -p "$WORKDIR"
cd "$WORKDIR"
@dakl
dakl / tcga-maf-to-vcf.sh
Last active February 12, 2021 14:06
Convert TCGA MAFs to VCFs to use to annotate other VCF files with variant frequency in the various TCGA projects (including pancan12).
REF=/proj/b2010040/private/nobackup/autoseqer-genome/genome/human_g1k_v37_decoy.fasta
DICT=/proj/b2010040/private/nobackup/autoseqer-genome/genome/human_g1k_v37_decoy.dict
GATKJAR=/home/daniel.klevebring/projects/tcga-maf-to-vcf/GenomeAnalysisTK.jar
WORKDIR=$HOME/Crisp/dakl/tcga-maf-to-vcf
mkdir -p "$WORKDIR"
cd "$WORKDIR"
# Unpack the MAFs into the work directory
tar xvfz ~/projects/tcga-maf-to-vcf/pancan_cleaned_mafs.tar.gz
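
The conversion step itself is not visible in this preview. As an illustration only, here is a minimal sketch of how single-nucleotide rows of a tab-separated MAF could be mapped to VCF data lines. It assumes the standard MAF column names `Chromosome`, `Start_Position`, `Variant_Type`, `Reference_Allele` and `Tumor_Seq_Allele2` (older TCGA files may spell some of these differently), skips insertions and deletions because they need a reference anchor base, and uses a hypothetical function name and made-up example rows. It is not the gist author's method.

```python
import csv
import io


def maf_snvs_to_vcf_lines(maf_text):
    """Yield header-less VCF data lines for the SNP rows of a MAF.

    Assumes standard MAF column names; indels are skipped.
    """
    # MAF files may start with comment lines such as "#version 2.4".
    rows = (line for line in io.StringIO(maf_text) if not line.startswith("#"))
    for row in csv.DictReader(rows, delimiter="\t"):
        if row["Variant_Type"] != "SNP":
            continue  # indels would need the preceding reference base
        yield "\t".join([
            row["Chromosome"],         # CHROM
            row["Start_Position"],     # POS (MAF and VCF are both 1-based)
            ".",                       # ID
            row["Reference_Allele"],   # REF
            row["Tumor_Seq_Allele2"],  # ALT
            ".", ".", ".",             # QUAL, FILTER, INFO left empty
        ])


# Made-up rows for illustration; not taken from any TCGA file.
EXAMPLE_MAF = (
    "#version 2.4\n"
    "Hugo_Symbol\tChromosome\tStart_Position\tVariant_Type\t"
    "Reference_Allele\tTumor_Seq_Allele2\n"
    "GENE1\t1\t12345\tSNP\tC\tT\n"
    "GENE2\t2\t67890\tDEL\tAG\t-\n"
)

vcf_lines = list(maf_snvs_to_vcf_lines(EXAMPLE_MAF))
```

A complete VCF would also need a header (`##fileformat`, contig lines and the `#CHROM` row) and sorting against the reference dictionary, which this sketch leaves out.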