Avril Coghlan avrilcoghlan
@avrilcoghlan
avrilcoghlan / find_best_genewise_genes.pl
Last active December 11, 2015 10:28
Perl script to take a gff file of gene predictions from GeneWise (or from some other program), and to make an output file with just the set of highest-scoring non-overlapping gene predictions.
#!/usr/local/bin/perl
=head1 NAME
find_best_genewise_genes.pl
=head1 SYNOPSIS
find_best_genewise_genes.pl input_gff path_to_bedtools output_gff outputdir
where input_gff is the input GeneWise gff file,
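
The preview above is cut off before the selection logic, so the following is only a minimal sketch of one way to pick a highest-scoring non-overlapping set: a greedy pass over predictions sorted by score. It is an assumption, not the script's actual method (which, judging by its `path_to_bedtools` argument, relies on bedtools). The hash keys `name`, `seqid`, `start`, `end` and `score`, and both function names, are hypothetical, and strand is ignored.

```perl
use strict;
use warnings;

# Hypothetical helper: true if two predictions lie on the same sequence and
# their 1-based inclusive coordinate ranges share at least one base.
sub overlaps {
    my ($x, $y) = @_;
    return $x->{seqid} eq $y->{seqid}
        && $x->{start} <= $y->{end}
        && $y->{start} <= $x->{end};
}

# Greedy selection (an assumed approach): visit predictions from highest to
# lowest score and keep each one that overlaps nothing already kept.
sub best_non_overlapping {
    my @genes = @_;
    my @kept;
    for my $gene (sort { $b->{score} <=> $a->{score} } @genes) {
        next if grep { overlaps($gene, $_) } @kept;
        push @kept, $gene;
    }
    return @kept;
}

# Made-up example data, for illustration only.
my @predictions = (
    { name => 'g1', seqid => 'chr1', start => 100,  end => 500,  score => 50 },
    { name => 'g2', seqid => 'chr1', start => 400,  end => 900,  score => 80 },
    { name => 'g3', seqid => 'chr1', start => 1000, end => 1200, score => 10 },
);
print join(',', map { $_->{name} } best_non_overlapping(@predictions)), "\n";
```

A greedy pass like this is simple but does not guarantee the set with the highest total score; it only ensures that no kept prediction overlaps a higher-scoring kept one.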
avrilcoghlan / get_spliced_transcripts_from_gff.pl
Last active December 11, 2015 13:08
Perl script to infer the DNA sequences of transcripts for genes, given an input gff file of gene predictions
#!/usr/local/bin/perl
=head1 NAME
get_spliced_transcripts_from_gff.pl
=head1 SYNOPSIS
get_spliced_transcripts_from_gff.pl input_gff input_fasta output_fasta outputdir ignore_phase ignore_semicolons from_augustus
where input_gff is the input gff file,
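
As a rough illustration of what inferring a spliced transcript involves, the sketch below joins exon sequences in coordinate order and reverse-complements the result for minus-strand genes. It assumes GFF-style 1-based inclusive coordinates; the function names and argument layout are hypothetical, and it does not reproduce the script's handling of `ignore_phase`, `ignore_semicolons` or `from_augustus`, which the truncated preview does not show.

```perl
use strict;
use warnings;

# Hypothetical helper: reverse-complement a DNA string (A, C, G, T only).
sub revcomp {
    my ($seq) = @_;
    my $rc = reverse $seq;            # scalar context reverses the string
    $rc =~ tr/ACGTacgt/TGCAtgca/;
    return $rc;
}

# Hypothetical helper: build a transcript from a sequence and its exons.
# Each exon is an array ref [start, end] with 1-based inclusive coordinates.
sub spliced_transcript {
    my ($chrom_seq, $strand, @exons) = @_;
    my $transcript = '';
    for my $exon (sort { $a->[0] <=> $b->[0] } @exons) {
        my ($start, $end) = @$exon;
        $transcript .= substr($chrom_seq, $start - 1, $end - $start + 1);
    }
    return $strand eq '-' ? revcomp($transcript) : $transcript;
}

# Made-up example sequence and exons, for illustration only.
print spliced_transcript('AAACCCGGGTTT', '+', [1, 3], [7, 9]), "\n";
```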
avrilcoghlan / reformat_fasta.pl
Created February 8, 2013 11:09
Perl script to reformat a fasta file so that each sequence line has at most 60 characters (amino acids or bases).
#!/usr/local/bin/perl
=head1 NAME
reformat_fasta.pl
=head1 SYNOPSIS
reformat_fasta.pl input_fasta output_fasta outputdir
where input_fasta is the input fasta file,
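
The wrapping step described above can be sketched as follows. This is a minimal illustration, not the code of reformat_fasta.pl: the function names `wrap_sequence` and `reformat_fasta_text` are hypothetical, and the sketch works on a string held in memory rather than on the script's `input_fasta`, `output_fasta` and `outputdir` arguments.

```perl
use strict;
use warnings;

# Hypothetical helper: split one sequence into chunks of at most $width
# characters (default 60), after removing any existing whitespace.
sub wrap_sequence {
    my ($seq, $width) = @_;
    $width ||= 60;
    $seq =~ s/\s+//g;
    my @lines;
    for (my $i = 0; $i < length($seq); $i += $width) {
        push @lines, substr($seq, $i, $width);
    }
    return @lines;
}

# Hypothetical helper: re-wrap every record in a FASTA-formatted string.
# Header lines (starting with '>') are passed through unchanged.
sub reformat_fasta_text {
    my ($text, $width) = @_;
    my @out;
    my $seq = '';
    for my $line (split /\n/, $text) {
        if ($line =~ /^>/) {
            push @out, wrap_sequence($seq, $width) if length $seq;
            $seq = '';
            push @out, $line;
        } else {
            $seq .= $line;
        }
    }
    push @out, wrap_sequence($seq, $width) if length $seq;
    return join("\n", @out) . "\n";
}

# Made-up example record, for illustration only.
print reformat_fasta_text(">seq1\n" . ('ACGT' x 40) . "\n");
```

Only the last line of each record may be shorter than the chosen width.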
avrilcoghlan / run_cnvnator_on_assembly.pl
Last active December 12, 2015 07:39
Perl script to run CNVnator on a genome assembly
#!/usr/local/bin/perl
=head1 NAME
run_cnvnator_on_assembly.pl
=head1 SYNOPSIS
run_cnvnator_on_assembly.pl input_fasta input_bam output outputdir path_to_cnvnator windowsize
where input_fasta is the input fasta file,
avrilcoghlan / run_cnvnator_on_farm_wrapper.pl
Last active December 12, 2015 07:39
Perl script to run CNVnator on the Sanger compute farm
#!/usr/local/bin/perl
=head1 NAME
run_cnvnator_on_farm_wrapper.pl
=head1 SYNOPSIS
run_cnvnator_on_farm_wrapper.pl input_fasta input_bam cnvnator window_size output_gff outputdir
where input_fasta is the input fasta file for the assembly,
avrilcoghlan / gff_to_embl.pl
Last active October 22, 2017 20:12
Perl script to convert a gff file to embl format
#!/usr/local/bin/perl
=head1 NAME
gff_to_embl.pl
=head1 SYNOPSIS
gff_to_embl.pl gff fasta outputdir exonerate cegma ratt augustus
where gff is the gff file of gene predictions,
avrilcoghlan / embl_to_gff.pl
Last active December 14, 2015 07:19
Perl script to convert an embl file to gff format
#!/usr/local/bin/perl
=head1 NAME
embl_to_gff.pl
=head1 SYNOPSIS
embl_to_gff.pl input_embl output_gff outputdir ratt
where input_embl is the input embl file,
avrilcoghlan / make_aln_and_hmm_for_treefam_family.pl
Created February 28, 2013 13:44
Perl script that, for a TreeFam family, gets sequences from the database, and builds an alignment & HMM
#!/usr/local/bin/perl
=head1 NAME
make_aln_and_hmm_for_treefam_family.pl
=head1 SYNOPSIS
make_aln_and_hmm_for_treefam_family.pl treefam_version output outputdir family hmmer_bin aln_output map_output
avrilcoghlan / translate_treefam_cigars_to_hmms.pl
Created February 28, 2013 13:46
Perl script that reads in a cigar-format alignment for a TreeFam family, and makes an HMM for the family
#!/usr/local/bin/perl
=head1 NAME
translate_treefam_cigars_to_hmms.pl
=head1 SYNOPSIS
translate_treefam_cigars_to_hmms.pl treefam_version hmms_output outputdir cigars alntype family hmmer_bin alns_output map_output
avrilcoghlan / treefam_gene_losses.pl
Created March 1, 2013 13:22
Perl script to identify gene losses in human since divergence from chimp, based on TreeFam trees
#!/usr/local/bin/perl
#
# Perl script treefam_gene_losses.pl
# Written by Avril Coghlan (alc@sanger.ac.uk).
# 28-Aug-06.
#
# For the TreeFam project.
#
# This perl script connects to the MYSQL database of