Last active
June 15, 2017 17:16
Create a fake gene call from nucleotides in the range between two pre-existing gene calls
# Short bash script that creates an artificial gene call for a sequence that spans a collection of genes.
# The artificial gene call starts at the first nucleotide of the first gene (denoted n_i) and ends at the
# last nucleotide of the second gene (denoted n_f). The assumption is that n_i < n_f.
# Another assumption: both genes are on the same contig.
# The output is a file in the external gene calls format of anvi'o.
# Example:
# $ bash create_my_gene_calls_file.shx MY_GENE_CALL_FILE.txt CONTIGS.db 45 105
# This gives a gene call starting at the first nucleotide of gene 45 and ending at the last nucleotide of gene 105.
# INPUTS:
output=$1       # the name of the output file
db=$2           # path to contigs database
start_gene=$3   # gene caller id of the first gene
end_gene=$4     # gene caller id of the last gene
# sqlite3 separates columns with "|" by default; the cut calls below take
# field 2 (contig), field 3 (start) and field 4 (stop) of genes_in_contigs.
n_i=$(sqlite3 "$db" "select * from genes_in_contigs where gene_callers_id==$start_gene" | cut -f 3 -d \|)
n_f=$(sqlite3 "$db" "select * from genes_in_contigs where gene_callers_id==$end_gene" | cut -f 4 -d \|)
contig_id=$(sqlite3 "$db" "select * from genes_in_contigs where gene_callers_id==$start_gene" | cut -f 2 -d \|)
# Write the header line, then a single forward-strand, non-partial gene call with id 0.
echo -e "gene_callers_id\tcontig\tstart\tstop\tdirection\tpartial\tsource\tversion" > "$output"
echo -e "0\t$contig_id\t$n_i\t$n_f\tf\t0\tTHE_BEST_GENE_CALL_SOFTWARE\tv_inf" >> "$output"
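
The script relies on two stated assumptions (n_i < n_f, and both genes on the same contig) but does not check them. A minimal sketch of such a check follows. The helper name `check_gene_range` and the variable `end_contig_id` are hypothetical and not part of the original script. The check would need to run before the output file is written.

```shell
# Hypothetical helper (not in the original script): returns 0 only if both
# genes are on the same contig and the start coordinate is below the stop.
check_gene_range() {
    local start_contig="$1"
    local end_contig="$2"
    local n_i="$3"
    local n_f="$4"
    if [ "$start_contig" != "$end_contig" ]; then
        echo "error: genes are on different contigs ($start_contig vs $end_contig)" >&2
        return 1
    fi
    if [ "$n_i" -ge "$n_f" ]; then
        echo "error: n_i ($n_i) is not smaller than n_f ($n_f)" >&2
        return 1
    fi
    return 0
}

# Possible use inside the script, following the same query pattern as above
# (end_contig_id is an assumed extra variable):
# end_contig_id=$(sqlite3 "$db" "select * from genes_in_contigs where gene_callers_id==$end_gene" | cut -f 2 -d \|)
# check_gene_range "$contig_id" "$end_contig_id" "$n_i" "$n_f" || exit 1
```

Failing early this way avoids writing a gene call with a negative length or one that mixes coordinates from two contigs.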