Tabea Kischka Tabea-K

  • Münster, Germany
@Tabea-K
Tabea-K / sort_blastn_by_bitscore.sh
Last active June 1, 2016 12:39
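Keeps the best hit per query from a BLAST output file in format X (option -X), ranking hits by highest bitscore, then lowest e-value, then highest percent identity. The sort keys assume the 12-column tabular layout (query ID in column 1, percent identity in column 3, e-value in column 11, bitscore in column 12). From http://seqanswers.com/forums/showthread.php?t=23166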
sort -k1,1 -k12,12gr -k11,11g -k3,3gr blastout.txt | sort -u -k1,1 --merge > bestHits
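For readers who prefer to do the same selection in Python, the following is a minimal sketch rather than the author's method. It assumes tab-separated input with the 12-column layout described above; the function name `best_hits` and the example rows are hypothetical.

```python
import csv


def best_hits(lines):
    """Return {query_id: row}, keeping one top-ranked hit per query.

    Ranking mirrors the sort keys of the shell one-liner: highest
    bitscore (column 12), then lowest e-value (column 11), then
    highest percent identity (column 3).
    """
    best = {}
    for row in csv.reader(lines, delimiter="\t"):
        if not row or row[0].startswith("#"):
            continue  # skip blank lines and comment lines
        # Smaller tuple means better hit, so negate "higher is better" fields.
        rank = (-float(row[11]), float(row[10]), -float(row[2]))
        query = row[0]
        if query not in best or rank < best[query][0]:
            best[query] = (rank, row)
    return {query: row for query, (rank, row) in best.items()}


# Made-up example rows (assumption: 12 tab-separated columns).
example = [
    "q1\ts1\t98.0\t100\t2\t0\t1\t100\t1\t100\t1e-50\t180",
    "q1\ts2\t99.0\t100\t1\t0\t1\t100\t1\t100\t1e-52\t185",
    "q2\ts3\t90.0\t50\t5\t0\t1\t50\t1\t50\t1e-10\t60",
]
hits = best_hits(example)
```

Unlike the two-pass sort, this keeps only one row per query in memory, which may matter for very large result files.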
Tabea-K / add_numbers_to_fastq_headers.sh
Created August 25, 2015 08:15
Adds incrementing numbers to the headers in a fastq file.
awk 'BEGIN{s=0}{if(NR % 4==1){print $0"_"s; s=s+1}else{print $0}}'
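The same renumbering can be sketched in Python. This is an illustrative equivalent, assuming strict four-line FASTQ records with no wrapped sequence lines; the function name `number_fastq_headers` is hypothetical.

```python
def number_fastq_headers(lines):
    """Yield FASTQ lines with "_<n>" appended to each header line.

    The counter starts at 0, as in the awk one-liner. Every fourth
    line, starting with the first, is treated as a header.
    """
    counter = 0
    for index, line in enumerate(lines):
        line = line.rstrip("\n")
        if index % 4 == 0:  # header line of a four-line record
            yield f"{line}_{counter}"
            counter += 1
        else:
            yield line


records = ["@a\n", "ACGT\n", "+\n", "IIII\n", "@b\n", "AC\n", "+\n", "II\n"]
numbered = list(number_fastq_headers(records))
```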
Tabea-K / get_fastq_seq_lengths.sh
Created August 25, 2015 08:17
Prints the header line and the sequence length of each record in a fastq file, tab-separated, one record per line. Example output: "@seq1	2324" on the first line, "@seq2	1365" on the second.
awk '{if(NR % 4==1){printf "%s\t", $0}else if(NR % 4==2){print length($0)}}'
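As a rough Python counterpart, the sketch below pairs each header with the length of its sequence line. It assumes strict four-line FASTQ records; the function name `fastq_lengths` is hypothetical.

```python
def fastq_lengths(lines):
    """Yield (header, sequence_length) for each four-line FASTQ record."""
    header = None
    for index, line in enumerate(lines):
        line = line.rstrip("\n")
        if index % 4 == 0:  # header line
            header = line
        elif index % 4 == 1:  # sequence line
            yield header, len(line)


records = ["@seq1\n", "ACGT\n", "+\n", "IIII\n", "@seq2\n", "AC\n", "+\n", "II\n"]
lengths = list(fastq_lengths(records))
# Print in the same tab-separated layout as the awk version.
for header, length in lengths:
    print(f"{header}\t{length}")
```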
Tabea-K / pdf2png.sh
Created August 26, 2015 06:45
Uses ImageMagick to convert a PDF file into a PNG without much quality loss. The first argument is the input PDF, the second the output PNG.
#!/usr/bin/env bash
# Quote the arguments so that file names containing spaces work.
convert -density 300 -resize 25% "$1" "$2"
Tabea-K / fastq_check_lengths.py
Created August 26, 2015 06:57
Prints the lengths of the quality and the sequence lines of a fastq file, and prints a boolean indicating whether the quality line and the sequence line have the same length.
#!/usr/bin/env python
import sys
import os
import argparse
q_ascii = "!#$%&'()*+,-./0123456789:;<=>?@ABCDEFGHIJKLMNOPQRSTUVWXYZ[\\]^_`abcdefghijklmnopqrstuvwxyz{|}~"
q_ascii += '"'
def id_error(line_no, name, line):
Tabea-K / fastqgrep.py
Last active June 1, 2016 12:37
Script to search for specific strings in the annotation lines of fastq files. Similar to grep, hence the name!
#!/usr/bin/env python
'''
This script extracts specific fastq sequences
from a multi-fastq sequence file. The IDs of the
sequences to be extracted are given in a txt file.
'''
import sys
import argparse
import os
Tabea-K / draw_aln.py
Created August 27, 2015 04:24
Uses TeX to create a pdf file with the visualization of an alignment.
#!/usr/bin/env python
"""
Creates a pdf alignment
"""
import sys
import os
import tempfile
input_filename = sys.argv[1]
Tabea-K / maf2fasta.py
Created August 27, 2015 04:25
Converts a MAF alignment file into a FASTA file.
#!/usr/bin/env python
"""
Created by Tabea Kischka at 2015-01-27 10:34:05
Converts a LAST MAF alignment into a FASTA alignment.
"""
import sys
path_to_alignio = '/home/tabeah/Scripts/alignio-maf'
Tabea-K / nr_of_common_lines_per_column.sh
Last active July 9, 2020 08:17
Prints the number of values shared by the same column of two csv files. The first argument is the column number to compare; the second and third arguments are the two files. For example, you can compare the IDs given in two csv files. Mainly a wrapper around the comm command.
#!/usr/bin/env bash
# Prints the number of identical rows between different columns for two
# csv files. The first argument is the column number which should be used.
# For example, you can compare the IDs given in a csv file.
# Note: cut -f splits on tabs by default; comma-separated input needs -d ','.
cut -f "$1" "$2" | sort > .file1
cut -f "$1" "$3" | sort > .file2
# With no options, comm produces three-column output.
Tabea-K / fasta2fastq.sh
Last active June 1, 2016 12:34
Converts FASTA to FASTQ, using a single character ("I") for every quality value. This is useful for creating dummy files for debugging. Reads from standard input; gensub requires GNU awk (gawk).
awk '/^>/ {printf("\n%s\n",$0);next; } { printf("%s",$0);} END {printf("\n");}' | awk 'BEGIN {RS = ">" ; FS = "\n"} NR > 1 {print "@"$1"\n"$2"\n+"$1"\n"gensub(/./, "I", "g", $2)}'
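Where gawk is not available, the conversion can be sketched in Python instead. This is an illustrative equivalent, not the author's code: it joins wrapped sequence lines, repeats the header on the "+" line as the awk version does, and fills the quality line with one character. The names `fasta_to_fastq` and `make_record` are hypothetical.

```python
def make_record(header, sequence, quality_char):
    """Build the four lines of one FASTQ record with dummy qualities."""
    return [
        f"@{header}",
        sequence,
        f"+{header}",
        quality_char * len(sequence),
    ]


def fasta_to_fastq(lines, quality_char="I"):
    """Yield FASTQ lines for FASTA input, one dummy quality per base."""
    header, chunks = None, []
    for line in lines:
        line = line.strip()
        if line.startswith(">"):
            if header is not None:
                # Emit the record that just ended.
                yield from make_record(header, "".join(chunks), quality_char)
            header, chunks = line[1:], []
        elif line and header is not None:
            chunks.append(line)  # wrapped sequence lines are joined
    if header is not None:
        yield from make_record(header, "".join(chunks), quality_char)


fasta = [">s1 desc\n", "ACG\n", "T\n", ">s2\n", "GG\n"]
fastq = list(fasta_to_fastq(fasta))
```

To use it as a filter in a pipeline, iterate over `sys.stdin` and print each yielded line.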