Avril Coghlan avrilcoghlan

avrilcoghlan / find_bam_insertsize.pl
Last active October 12, 2015 15:07
Perl script to estimate the mean, median, standard deviation and IQR of insert sizes in a BAM file
#!/usr/bin/env perl
=head1 NAME
find_bam_insertsize.pl
=head1 SYNOPSIS
find_bam_insertsize.pl bam outputdir rpath
where bam is the input bam file,
avrilcoghlan / get_treefam_alns.pl
Last active October 12, 2015 15:07
Perl script to retrieve all multiple alignments for animal gene families from the TreeFam database
#!/usr/local/bin/perl
=head1 NAME
get_treefam_alns.pl
=head1 SYNOPSIS
get_treefam_alns.pl treefam_version output outputdir
avrilcoghlan / translate_treefam_cigars_to_alns.pl
Last active October 12, 2015 15:08
Perl script to translate TreeFam cigar-format alignments to fasta-format alignments
#!/usr/local/bin/perl
=head1 NAME
translate_treefam_cigars_to_alns.pl
=head1 SYNOPSIS
translate_treefam_cigars_to_alns.pl treefam_version output outputdir cigars alntype start_with
avrilcoghlan / get_treefam_family_seqs.pl
Last active October 12, 2015 20:08
Perl script to get all the protein sequences in a family in a particular version of the TreeFam database
#!/usr/local/bin/perl
=head1 NAME
get_treefam_family_seqs.pl
=head1 SYNOPSIS
get_treefam_family_seqs.pl treefam_version output outputdir core_only
where treefam_version is the version of TreeFam to use,
avrilcoghlan / overlap_alignment.pl
Created November 20, 2012 21:52
Perl script to compute an overlap alignment of two sequences
#! /usr/bin/perl -w
# Avril Coghlan
# 20-Nov-2012
# Algorithm on page 204 of 'Computational Genome Analysis' by Deonier et al
# define the two sequences
$seqA = "AGGCTAAA";
$seqB = "CAAACGTCT";
print "Sequences:\n";
print "A: $seqA\n";
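
The preview above is cut off before the alignment itself. As a rough illustration of what an overlap alignment computes, here is a minimal sketch in Perl. It is not the author's code and not necessarily the exact formulation in Deonier et al.: the scoring values (`$MATCH`, `$MISMATCH`, `$GAP`) and the function name `overlap_score` are assumptions chosen for the example. The idea shown is the usual one for overlap (end-gap-free) alignment: the first row and column of the dynamic-programming matrix are zero, and the best score is taken over the last row and last column.

```perl
#!/usr/bin/env perl
use strict;
use warnings;
use List::Util qw(max);

# Hypothetical scoring scheme (an assumption, not taken from the gist).
my ($MATCH, $MISMATCH, $GAP) = (1, -1, -2);

# Return the best overlap-alignment score for two sequences.
sub overlap_score {
    my ($seq_x, $seq_y) = @_;
    my @x = split //, $seq_x;
    my @y = split //, $seq_y;
    my ($n, $m) = (scalar @x, scalar @y);

    # Row 0 and column 0 stay at zero: leading end gaps cost nothing.
    my @f = map { [ (0) x ($m + 1) ] } 0 .. $n;

    for my $i (1 .. $n) {
        for my $j (1 .. $m) {
            my $sub = $x[$i - 1] eq $y[$j - 1] ? $MATCH : $MISMATCH;
            $f[$i][$j] = max(
                $f[$i - 1][$j - 1] + $sub,   # align the two residues
                $f[$i - 1][$j] + $GAP,       # gap in the second sequence
                $f[$i][$j - 1] + $GAP,       # gap in the first sequence
            );
        }
    }

    # Trailing end gaps are free too: take the best cell in the
    # last row or the last column.
    return max( @{ $f[$n] }, map { $_->[$m] } @f );
}

print "Overlap score: ", overlap_score("AGGCTAAA", "CAAACGTCT"), "\n";
```

The sketch reports only the score; recovering the aligned strings would need a traceback from the best cell back to the first row or column.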
avrilcoghlan / embl_to_fasta.pl
Created January 7, 2013 11:56
Perl script to convert an embl format file to a fasta format file
#!/usr/local/bin/perl
=head1 NAME
embl_to_fasta.pl
=head1 SYNOPSIS
embl_to_fasta.pl input_embl output_fasta outputdir
where input_embl is the input embl file,
avrilcoghlan / convert_chado_gff_to_gtf.pl
Last active December 10, 2015 19:39
Perl script to convert a gff file from the Chado database to a gtf format file for the RNA-SeQC software
#!/usr/local/bin/perl
=head1 NAME
convert_chado_gff_to_gtf.pl
=head1 SYNOPSIS
convert_chado_gff_to_gtf.pl input_gff output_gtf outputdir
where input_gff is the input gff file,
avrilcoghlan / find_best_genewise_genes.pl
Last active December 11, 2015 10:28
Perl script to take a gff file of gene predictions from GeneWise (or from some other program), and to make an output file with just the set of highest-scoring non-overlapping gene predictions.
#!/usr/local/bin/perl
=head1 NAME
find_best_genewise_genes.pl
=head1 SYNOPSIS
find_best_genewise_genes.pl input_gff path_to_bedtools output_gff outputdir
where input_gff is the input GeneWise gff file,
avrilcoghlan / get_spliced_transcripts_from_gff.pl
Last active December 11, 2015 13:08
Perl script to infer the DNA sequences of transcripts for genes, given an input gff file of gene predictions
#!/usr/local/bin/perl
=head1 NAME
get_spliced_transcripts_from_gff.pl
=head1 SYNOPSIS
get_spliced_transcripts_from_gff.pl input_gff input_fasta output_fasta outputdir ignore_phase ignore_semicolons from_augustus
where input_gff is the input gff file,
avrilcoghlan / run_cnvnator_on_assembly.pl
Last active December 12, 2015 07:39
Perl script to run CNVnator on a genome assembly
#!/usr/local/bin/perl
=head1 NAME
run_cnvnator_on_assembly.pl
=head1 SYNOPSIS
run_cnvnator_on_assembly.pl input_fasta input_bam output outputdir path_to_cnvnator windowsize
where input_fasta is the input fasta file,