@inodb
inodb / rst2code
Last active August 29, 2015 14:10
Bash function to extract the code blocks from an rst file (uses awk). Use either ``rst2code filename`` or ``cat *.rst | rst2code -``
#!/bin/sh
# Get only code blocks from rst file, use like:
# rst2code index.rst
# or
# cat index.rst | rst2code -
# pipe to bash if it is bash code e.g.
# rst2code index.rst | bash -x
rst2code() {
awk 'BEGIN {code=0}
{
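The listing shows only the start of the awk program. The following is a minimal sketch of how such a function could pull out code blocks, not the gist's actual logic. It assumes a block is opened by a `.. code-block::`, `.. sourcecode::` or `.. code::` directive, or by a line ending in `::`, and that the block ends at the first unindented, non-blank line. The name `rst2code_sketch` is hypothetical, and directive options such as `:linenos:` are not filtered out.

```shell
# Hypothetical sketch; the real rst2code may work differently.
rst2code_sketch() {
    awk '
    # Blank lines inside a block are deferred until more code follows.
    code && /^[ \t]*$/ { if (started) pending++; next }
    # Indented lines inside a block are code; strip the indentation of the first line.
    code && /^[ \t]+/ {
        match($0, /^[ \t]+/)
        if (!started) { indent = RLENGTH; started = 1 }
        n = (RLENGTH < indent) ? RLENGTH : indent
        for (; pending > 0; pending--) print ""
        print substr($0, n + 1)
        next
    }
    # A code directive or a line ending in "::" opens a block.
    /^\.\. (code-block|sourcecode|code)::/ || /::[ \t]*$/ {
        code = 1; started = 0; pending = 0; next
    }
    # Any other line closes the block.
    { code = 0; started = 0; pending = 0 }
    ' "${1:--}"
}
```

Used like the original, for example `rst2code_sketch index.rst` or `cat index.rst | rst2code_sketch -`.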
@inodb
inodb / deinterleave_fastq_merge
Last active August 29, 2015 14:02
If you have a single fastq file that has single reads in between pairs, this script can help you separate the pairs into r1, r2 and single reads.
cat ../Batch1.fastq | \
awk -v se=se.fastq -v r1=r1.fastq -v r2=r2.fastq '
BEGIN {l5=0}
{
if (l5 == 0) {
l1=$0; getline
}
else {
l1=l5
}
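The listing stops partway through the awk program. As a rough sketch of the same idea, the function below buffers one four-line record at a time and decides whether it belongs to a pair. It is not the gist's own logic: it assumes read names end in `/1` and `/2`, that mates are adjacent in the file, and that every record has exactly four lines. The name `deinterleave_sketch` is hypothetical; the output file names are taken from the original command.

```shell
# Hypothetical sketch; assumes /1 and /2 name suffixes and adjacent mates.
deinterleave_sketch() {
    awk -v se=se.fastq -v r1=r1.fastq -v r2=r2.fastq '
    NR % 4 == 1 { name = $1; rec = $0; next }   # header line starts a record
    { rec = rec "\n" $0 }                        # sequence, "+" and quality lines
    NR % 4 == 0 {
        if (held != "" && name == held_base "/2") {
            # The held /1 read and this /2 read form a pair.
            print held > r1; print rec > r2; held = ""
        } else {
            # The held read had no mate, so it is a single read.
            if (held != "") print held > se
            if (name ~ /\/1$/) {
                held = rec; held_base = substr(name, 1, length(name) - 2)
            } else {
                print rec > se; held = ""
            }
        }
    }
    END { if (held != "") print held > se }
    ' "$1"
}
```

Called as `deinterleave_sketch ../Batch1.fastq`, it writes `r1.fastq`, `r2.fastq` and `se.fastq` to the current directory.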
@inodb
inodb / mummer_duplicate_contigs
Created May 16, 2014 15:14
Map contigs against themselves with MUMmer, then find duplicate contigs.
for row in $(cat nucmer.coords | awk -v OFS=, '{if ($10 == 100.00 && $11 == 100.00 && $7 == 100.00 && $12 != $13) {print $12,$13}}')
do
echo $row | tr ',' '\n' | sort | tr '\n' ',' | sed 's/,$//'
echo
done | cut -d, -f2 | sort -u > duplicatecontigs.txt
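The loop above sorts the two names of each 100% identical pair and reports the second one as the duplicate. The same filter can be sketched in a single awk call, assuming the same column layout as above (identity in column 7, coverages in columns 10 and 11, contig names in columns 12 and 13). The function name `duplicate_contigs` is hypothetical, and awk's string ordering may differ from `sort` in some locales.

```shell
# Hypothetical helper: print one duplicate contig name per line.
duplicate_contigs() {
    awk '$7 == 100 && $10 == 100 && $11 == 100 && $12 != $13 {
        # Keep the name that sorts first and report the other one.
        # Appending "" forces a string comparison even for numeric names.
        print ((($12 "") < ($13 "")) ? $13 : $12)
    }' "$1" | sort -u
}
```

For example, `duplicate_contigs nucmer.coords > duplicatecontigs.txt`.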
@inodb
inodb / bam_length_readcount
Created March 13, 2014 16:49
Count the number of reads mapping to contigs of at least a given length (here 1000 bases)
#!/bin/sh
samtools idxstats map.bam | awk '{split ($0, a, "\t")} {if (a[2]>999) SUM+=a[3]} END {print SUM}'
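The one-liner hard-codes the length cutoff. A small sketch with the cutoff as a parameter is shown below. It relies on the same columns as above (contig length in column 2, mapped reads in column 3) and reads the `samtools idxstats` output from standard input. The function name `count_mapped_reads` and the default of 1000 are assumptions for illustration.

```shell
# Hypothetical helper: sum mapped reads over contigs of at least $1 bases.
count_mapped_reads() {
    # "sum + 0" prints 0 instead of an empty line when nothing matches.
    awk -F '\t' -v min="${1:-1000}" '$2 >= min { sum += $3 } END { print sum + 0 }'
}
```

For example, `samtools idxstats map.bam | count_mapped_reads 1000` gives the same count as the original command.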
@inodb
inodb / rightpath
Last active January 3, 2016 13:29
Show the access rights of every component along the path to a file or directory. Use like rightpath /path/to/file.txt, or rightpath /path/to/directory
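#!/bin/sh
# Print permissions, owner and group for each component of the resolved path.
# Note: readlink -f and stat -c are GNU coreutils options.
rightpath() {
path=""
# Split on "/" only, so directory names containing spaces stay intact.
readlink -f "$1" | tr / '\n' | while IFS= read -r d
do
[ -n "$d" ] || continue
path="$path/$d"
stat -c "%A %a %U %G %n" "$path"
done | column -t
}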
@inodb
inodb / parallel_sickle
Created January 16, 2014 14:10
Run sickle on paired-end fastq files in parallel. The paired fastq files should end in _1.fastq.gz and _2.fastq.gz. The resulting files will be in the current directory, named $base.sickle.{pe1,pe2,se}.fastq. The filenames/directory structure is based on reads delivered by the sequencing facility at SciLifeLab, Sweden to the UPPMAX clus…
#!/bin/sh
parallel lib1={}';' \
lib2='$('echo {} '|' sed s/_1.fastq.gz/_2.fastq.gz/');' \
base='$('basename {} _1.fastq.gz');' \
sickle pe -f '$lib1' -r '$lib2' -t sanger \
-o '$base'.sickle.pe1.fastq \
-p '$base'.sickle.pe2.fastq \
-s '$base'.sickle.se.fastq \
::: /proj/b2010008/INBOX/A.Andersson_13_06/*/*/*_1.fastq.gz
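Because the quoting in the parallel call is dense, it can help to preview the derived file names before starting a long run. The sketch below prints the sickle command for each forward-read file without running anything. It uses only the sickle options that appear above; the function name `sickle_commands` is hypothetical, and it uses shell suffix removal in place of the sed substitution.

```shell
# Hypothetical dry-run helper: print one sickle command per *_1.fastq.gz file.
sickle_commands() {
    for lib1 in "$@"; do
        # Derive the reverse-read file and the output prefix from the forward-read name.
        lib2="${lib1%_1.fastq.gz}_2.fastq.gz"
        base=$(basename "$lib1" _1.fastq.gz)
        echo sickle pe -f "$lib1" -r "$lib2" -t sanger \
            -o "$base.sickle.pe1.fastq" \
            -p "$base.sickle.pe2.fastq" \
            -s "$base.sickle.se.fastq"
    done
}
```

Calling it with the same glob as above lists the commands for inspection. The printed lines are not quoted, so paths containing spaces would need extra care before being run.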