Avril Coghlan avrilcoghlan

avrilcoghlan / make_genewise_paramfile.pl
Last active October 13, 2017 15:54
Perl script to make an intron parameter file for GeneWise
#!/usr/local/bin/perl
=head1 NAME
make_genewise_paramfile.pl
=head1 SYNOPSIS
make_genewise_paramfile.pl training_exon_gff fasta outputdir exons_names_sameas_genes output
where training_exon_gff is the input gff file of exons in training set genes,
avrilcoghlan / run_genewisedb.pl
Last active November 22, 2017 02:11
Perl script to run GeneWise, comparing a file of multiple HMMs to a fasta file of multiple sequences
#!/usr/local/bin/perl
=head1 NAME
run_genewisedb.pl
=head1 SYNOPSIS
run_genewisedb.pl input_fasta input_hmms output outputdir spliceflat parameterfile
where input_fasta is the input fasta file of scaffolds,
avrilcoghlan / find_bam_insertsize.pl
Last active October 12, 2015 15:07
Perl script to estimate the mean, median, standard deviation and IQR of insert sizes in a BAM file
#!/usr/bin/env perl
=head1 NAME
find_bam_insertsize.pl
=head1 SYNOPSIS
find_bam_insertsize.pl bam outputdir rpath
where bam is the input bam file,
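
The four statistics named in the description can be sketched in a few lines of Perl. This is a minimal illustration, not the gist's own code (its `rpath` argument suggests it may hand the calculation to R). The subroutine names `quantile` and `insert_size_stats` are hypothetical, the input is assumed to be a plain list of insert sizes (for example the TLEN values in column 9 of the BAM records), and the quartiles use linear interpolation between order statistics, which is one of several common conventions.

```perl
#!/usr/bin/env perl
use strict;
use warnings;
use List::Util qw(sum);

# Quantile of an already-sorted array reference, using linear
# interpolation between neighbouring values (an assumed convention).
sub quantile {
    my ($sorted, $p) = @_;
    my $h  = $#$sorted * $p;                      # fractional index (n-1)*p
    my $lo = int($h);
    my $hi = $lo < $#$sorted ? $lo + 1 : $lo;
    return $sorted->[$lo] + ($h - $lo) * ($sorted->[$hi] - $sorted->[$lo]);
}

# Returns a hash with the mean, median, sample standard deviation and IQR.
sub insert_size_stats {
    my @sizes = sort { $a <=> $b } @_;
    my $n     = scalar @sizes;
    die "no insert sizes given\n" if $n == 0;
    my $mean  = sum(@sizes) / $n;
    # Sample variance (divides by n-1); defined as 0 for a single value.
    my $var   = $n > 1 ? sum(map { ($_ - $mean) ** 2 } @sizes) / ($n - 1) : 0;
    return (
        mean   => $mean,
        median => quantile(\@sizes, 0.5),
        sd     => sqrt($var),
        iqr    => quantile(\@sizes, 0.75) - quantile(\@sizes, 0.25),
    );
}

# Example with made-up insert sizes.
my %stats = insert_size_stats(180, 200, 210, 190, 220);
printf "%s\t%.2f\n", $_, $stats{$_} for sort keys %stats;
```

For real data, the median and IQR are usually the more robust summary, because a few chimeric or mis-mapped read pairs can inflate the mean and standard deviation.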
avrilcoghlan / get_treefam_alns.pl
Last active October 12, 2015 15:07
Perl script to retrieve all multiple alignments for animal gene families from the TreeFam database
#!/usr/local/bin/perl
=head1 NAME
get_treefam_alns.pl
=head1 SYNOPSIS
get_treefam_alns.pl treefam_version output outputdir
avrilcoghlan / translate_treefam_cigars_to_alns.pl
Last active October 12, 2015 15:08
Perl script to translate TreeFam cigar-format alignments to fasta-format alignments
#!/usr/local/bin/perl
=head1 NAME
translate_treefam_cigars_to_alns.pl
=head1 SYNOPSIS
translate_treefam_cigars_to_alns.pl treefam_version output outputdir cigars alntype start_with
avrilcoghlan / run_genewisedb_afterblast.pl
Last active June 19, 2022 12:20
Perl script to run GeneWise, comparing a file of multiple HMMs to a fasta file of multiple sequences; GeneWise is run only on the regions of the DNA sequences where the proteins used to make each HMM have tblastn matches
#!/usr/local/bin/perl
=head1 NAME
run_genewisedb_afterblast.pl
=head1 SYNOPSIS
run_genewisedb_afterblast.pl input_fasta input_hmms output outputdir spliceflat parameterfile treefam_seqs eval_cutoff flank_length blast_path
where input_fasta is the input fasta file of scaffolds,
avrilcoghlan / get_treefam_family_seqs.pl
Last active October 12, 2015 20:08
Perl script to get all the protein sequences in a family in a particular version of the TreeFam database
#!/usr/local/bin/perl
=head1 NAME
get_treefam_family_seqs.pl
=head1 SYNOPSIS
get_treefam_family_seqs.pl treefam_version output outputdir core_only
where treefam_version is the version of TreeFam to use,
avrilcoghlan / overlap_alignment.pl
Created November 20, 2012 21:52
Perl script to compute an overlap alignment of two sequences
#! /usr/bin/perl -w
# Avril Coghlan
# 20-Nov-2012
# Algorithm on page 204 of 'Computational Genome Analysis' by Deonier et al
# define the two sequences
$seqA = "AGGCTAAA";
$seqB = "CAAACGTCT";
print "Sequences:\n";
print "A: $seqA\n";
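
The preview above stops before the alignment itself. As a rough sketch of the general overlap-alignment idea (leading and trailing end gaps are free, so a suffix of one sequence can align with a prefix of the other), the score can be computed by dynamic programming. This is not the gist's code: the subroutine name `overlap_score` is hypothetical, and the scoring values (match +1, mismatch -1, gap -2) are assumptions rather than the scheme used in Deonier et al.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Assumed scoring scheme (hypothetical; not taken from the gist or the book).
my ($MATCH, $MISMATCH, $GAP) = (1, -1, -2);

# Best overlap-alignment score of two sequences.
sub overlap_score {
    my ($seq_a, $seq_b) = @_;
    my @a = split //, $seq_a;
    my @b = split //, $seq_b;
    my ($n, $m) = (scalar @a, scalar @b);

    my @score;
    # First row and column are zero: leading end gaps cost nothing.
    $score[$_][0] = 0 for 0 .. $n;
    $score[0][$_] = 0 for 0 .. $m;

    for my $i (1 .. $n) {
        for my $j (1 .. $m) {
            my $diag = $score[$i - 1][$j - 1]
                     + ($a[$i - 1] eq $b[$j - 1] ? $MATCH : $MISMATCH);
            my $up   = $score[$i - 1][$j] + $GAP;
            my $left = $score[$i][$j - 1] + $GAP;
            my $best = $diag;
            $best = $up   if $up   > $best;
            $best = $left if $left > $best;
            $score[$i][$j] = $best;
        }
    }

    # Trailing end gaps are free too, so take the best cell in the
    # last row or the last column rather than the bottom-right corner.
    my $best = $score[$n][$m];
    for my $j (0 .. $m) { $best = $score[$n][$j] if $score[$n][$j] > $best }
    for my $i (0 .. $n) { $best = $score[$i][$m] if $score[$i][$m] > $best }
    return $best;
}

# The two sequences defined in the gist preview.
my $seqA = "AGGCTAAA";
my $seqB = "CAAACGTCT";
print "Overlap score: ", overlap_score($seqA, $seqB), "\n";
```

Recovering the aligned strings would additionally need a traceback from the best-scoring cell, which is omitted here to keep the sketch short.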
avrilcoghlan / embl_to_fasta.pl
Created January 7, 2013 11:56
Perl script to convert an embl format file to a fasta format file
#!/usr/local/bin/perl
=head1 NAME
embl_to_fasta.pl
=head1 SYNOPSIS
embl_to_fasta.pl input_embl output_fasta outputdir
where input_embl is the input embl file,
avrilcoghlan / convert_chado_gff_to_gtf.pl
Last active December 10, 2015 19:39
Perl script to convert a gff file from the Chado database to a gtf format file for the RNA-SeQC software
#!/usr/local/bin/perl
=head1 NAME
convert_chado_gff_to_gtf.pl
=head1 SYNOPSIS
convert_chado_gff_to_gtf.pl input_gff output_gtf outputdir
where input_gff is the input gff file,