I'm having an issue running GeneMark within BRAKER.
Versions
perl /share/braker.pl --version
braker.pl version 1.6
perl /share/gm_et_linux_64/gmes_petap/gmes_petap.pl
# -------------------
Usage: /share/gm_et_linux_64/gmes_petap/gmes_petap.pl [options] --sequence [filename]
GeneMark-ES Suite version 4.30
includes transcript (GeneMark-ET) and protein (GeneMark-EP) based training and prediction
Running BRAKER:
perl /share/braker.pl \
--genome ../genome/Mya.genome.v1.1.1.fasta \
--bam ../genome/transcriptome.clam.v.1.11.bam \
--BAMTOOLS_PATH=/share/bamtools/bin/ \
--UTR on --cores 10 --species=Mya_a
perl /share/braker.pl --genome ../genome/Mya.genome.v1.1.1.fasta --bam ../genome/transcriptome.clam.v.1.11.bam --BAMTOOLS_PATH=/share/bamtools/bin/ --UTR on --cores 10 --species=Mya_a
NEXT STEP: check files and settings
NEXT STEP: check options
... options check complete.
NEXT STEP: check fasta headers
fasta headers check complete.
NEXT STEP: create SAM header file /mouse/Mya/maker/braker/Mya_a/transcriptome_header.sam.
SAM file /mouse/Mya/maker/braker/Mya_a/transcriptome_header.sam complete.
NEXT STEP: check BAM headers
headers check for BAM file /mouse/Mya/maker/../genome/transcriptome.clam.v.1.11.bam complete.
NEXT STEP: make hints from BAM file /mouse/Mya/maker/../genome/transcriptome.clam.v.1.11.bam
Wait a moment, calculating maximum block size that needs to be allocated... .. done
hints from BAM file /mouse/Mya/maker/../genome/transcriptome.clam.v.1.11.bam added.
NEXT STEP: sort hints
hints sorted.
NEXT STEP: summarize multiple identical hints to one
hints joined.
NEXT STEP: filter introns, find strand and change score to 'mult' entry
strands found and score changed.
hints file complete.
NEXT STEP: execute GeneMark-ET
failed to execute: perl /share/gm_et_linux_64/gmes_petap//gmes_petap.pl --sequence=/mouse/Mya/maker/braker/Mya_a/genome.fa --ET=/mouse/Mya/maker/braker/Mya_a/hintsfile.gff --cores=10 1>/mouse/Mya/maker/braker/Mya_a/GeneMark-ET.stdout 2>/mouse/Mya/maker/braker/Mya_a/errors/GeneMark-ET.stderr
The stderr and stdout files are empty.
Here is gmes.log:
more gmes.log
/share/gm_et_linux_64/gmes_petap//gmes_petap.pl : [Mon Oct 5 08:52:40 2015] /share/gm_et_linux_64/gmes_petap/probuild --reformat_fasta --uppercase --allow_x --letters_per_line 60 --out data/dna.fna --label _dna --trace info/dna.trace --in /mouse/Mya/maker/braker/Mya/genome.fa
/share/gm_et_linux_64/gmes_petap//gmes_petap.pl : [Mon Oct 5 08:52:41 2015] /share/gm_et_linux_64/gmes_petap/reformat_gff.pl --out data/et.gff --trace info/dna.trace --in /mouse/Mya/maker/braker/Mya/hintsfile.gff --quiet
/share/gm_et_linux_64/gmes_petap//gmes_petap.pl : [Mon Oct 5 08:53:54 2015] /share/gm_et_linux_64/gmes_petap/probuild --seq data/dna.fna --allow_x --stat info/dna.general
/share/gm_et_linux_64/gmes_petap//gmes_petap.pl : [Mon Oct 5 08:55:11 2015] /share/gm_et_linux_64/gmes_petap/probuild --seq data/dna.fna --allow_x --stat_fasta info/dna.multi_fasta
/share/gm_et_linux_64/gmes_petap//gmes_petap.pl : [Mon Oct 5 08:55:46 2015] /share/gm_et_linux_64/gmes_petap/probuild --seq data/dna.fna --allow_x --substring_n_distr info/dna.gap_distr
/share/gm_et_linux_64/gmes_petap//gmes_petap.pl : [Mon Oct 5 08:57:23 2015] /share/gm_et_linux_64/gmes_petap/gc_distr.pl --in data/dna.fna --out info/dna.gc.csv --w 1000,8000
/share/gm_et_linux_64/gmes_petap//gmes_petap.pl : [Mon Oct 5 08:58:38 2015] /share/gm_et_linux_64/gmes_petap/probuild --seq /mouse/Mya/maker/data/dna.fna --split dna.fa --max_contig 5000000 --min_contig 50000 --letters_per_line 100 --split_at_n 5000 --split_at_x 5000 --allow_x --x_to_n --trace ../../info/training.trace
/share/gm_et_linux_64/gmes_petap//gmes_petap.pl : [Mon Oct 5 08:59:05 2015] /share/gm_et_linux_64/gmes_petap/rescale_gff.pl --in data/et.gff --trace info/training.trace --out data/et_training.gff
/share/gm_et_linux_64/gmes_petap//gmes_petap.pl : [Mon Oct 5 08:59:22 2015] /share/gm_et_linux_64/gmes_petap/probuild --seq data/training.fna --stat info/training.general --allow_x --GC_PRECISION 0
/share/gm_et_linux_64/gmes_petap//gmes_petap.pl : [Mon Oct 5 08:59:26 2015] /share/gm_et_linux_64/gmes_petap/parse_by_introns.pl --section ET_ini --cfg /mouse/Mya/maker/run.cfg --parse_dir /mouse/Mya/maker/run/ET_ini
/share/gm_et_linux_64/gmes_petap//gmes_petap.pl : [Mon Oct 5 08:59:27 2015] /share/gm_et_linux_64/gmes_petap/make_nt_freq_mat.pl --cfg /mouse/Mya/maker/run.cfg --section donor_GT --format DONOR
/share/gm_et_linux_64/gmes_petap//gmes_petap.pl : [Mon Oct 5 08:59:27 2015] error
/share/gm_et_linux_64/gmes_petap//gmes_petap.pl : [Mon Oct 5 09:06:36 2015] /share/gm_et_linux_64/gmes_petap/probuild --reformat_fasta --uppercase --allow_x --letters_per_line 60 --out data/dna.fna --label _dna --trace info/dna.trace --in /mouse/Mya/maker/braker/Mya/genome.fa
/share/gm_et_linux_64/gmes_petap//gmes_petap.pl : [Mon Oct 5 09:06:37 2015] /share/gm_et_linux_64/gmes_petap/reformat_gff.pl --out data/et.gff --trace info/dna.trace --in /mouse/Mya/maker/braker/Mya/hintsfile.gff --quiet
/share/gm_et_linux_64/gmes_petap//gmes_petap.pl : [Mon Oct 5 09:07:45 2015] /share/gm_et_linux_64/gmes_petap/probuild --seq data/dna.fna --allow_x --stat info/dna.general
/share/gm_et_linux_64/gmes_petap//gmes_petap.pl : [Mon Oct 5 09:09:06 2015] /share/gm_et_linux_64/gmes_petap/probuild --seq data/dna.fna --allow_x --stat_fasta info/dna.multi_fasta
/share/gm_et_linux_64/gmes_petap//gmes_petap.pl : [Mon Oct 5 09:09:37 2015] /share/gm_et_linux_64/gmes_petap/probuild --seq data/dna.fna --allow_x --substring_n_distr info/dna.gap_distr
/share/gm_et_linux_64/gmes_petap//gmes_petap.pl : [Mon Oct 5 09:11:08 2015] /share/gm_et_linux_64/gmes_petap/gc_distr.pl --in data/dna.fna --out info/dna.gc.csv --w 1000,8000
/share/gm_et_linux_64/gmes_petap//gmes_petap.pl : [Mon Oct 5 09:12:23 2015] /share/gm_et_linux_64/gmes_petap/probuild --seq /mouse/Mya/maker/data/dna.fna --split dna.fa --max_contig 5000000 --min_contig 50000 --letters_per_line 100 --split_at_n 5000 --split_at_x 5000 --allow_x --x_to_n --trace ../../info/training.trace
/share/gm_et_linux_64/gmes_petap//gmes_petap.pl : [Mon Oct 5 09:12:51 2015] /share/gm_et_linux_64/gmes_petap/rescale_gff.pl --in data/et.gff --trace info/training.trace --out data/et_training.gff
/share/gm_et_linux_64/gmes_petap//gmes_petap.pl : [Mon Oct 5 09:13:07 2015] /share/gm_et_linux_64/gmes_petap/probuild --seq data/training.fna --stat info/training.general --allow_x --GC_PRECISION 0
/share/gm_et_linux_64/gmes_petap//gmes_petap.pl : [Mon Oct 5 09:13:12 2015] /share/gm_et_linux_64/gmes_petap/parse_by_introns.pl --section ET_ini --cfg /mouse/Mya/maker/run.cfg --parse_dir /mouse/Mya/maker/run/ET_ini
/share/gm_et_linux_64/gmes_petap//gmes_petap.pl : [Mon Oct 5 09:13:12 2015] /share/gm_et_linux_64/gmes_petap/make_nt_freq_mat.pl --cfg /mouse/Mya/maker/run.cfg --section donor_GT --format DONOR
/share/gm_et_linux_64/gmes_petap//gmes_petap.pl : [Mon Oct 5 09:13:12 2015] error
It looks like the last command failed: /share/gm_et_linux_64/gmes_petap/make_nt_freq_mat.pl --cfg /mouse/Mya/maker/run.cfg --section donor_GT --format DONOR
/share/gm_et_linux_64/gmes_petap/make_nt_freq_mat.pl --cfg /mouse/Mya/maker/run.cfg --section donor_GT --format DONOR
error, no valid sequences were found
From the cfg file:
donor_GT:
auto_order: 1
format_out: ''
gc_high: -1
gc_low: -1
infile: /mouse/Mya/maker/braker/Mya_a/GeneMark-ET/run/ET_ini/don.seq
margin: 3
order: 0
outfile: GT.mat
phase: ''
pseudocounts: 10
quite: 0
site_size: 2
threshold_zero: 2000
type: GT
width: 9
Of note, the cfg file originally listed don.seq with no path, and I was getting "file not found" errors related to that file. I saw that don.seq is located elsewhere, so I changed the config file accordingly. However, /mouse/Mya/maker/braker/Mya_a/GeneMark-ET/run/ET_ini/don.seq is empty, so the error must be further upstream.
Any help greatly appreciated.
I know the question is old, but I'm answering for other people with the same problem.
I had a similar problem: GeneMark wouldn't run properly. I figured out that I had two problems.
The first one was simple. I had set the path to GeneMark as:
GENEMARK_PATH=/home/user/software/gm_et_linux_64/gmes_petap/
This causes BRAKER to call GeneMark as /home/user/software/gm_et_linux_64/gmes_petap//gmes_petap.pl, which has one forward slash too many. Dropping the trailing slash from GENEMARK_PATH removes it. I see the same doubled slash in your command.
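A minimal sketch of how the doubled slash can arise and how to avoid it. It assumes BRAKER simply joins the configured directory, a slash, and the script name; that is inferred from the failing command, not checked against the BRAKER source. The helper name `genemark_script` is made up for illustration. On Linux a doubled slash normally resolves to the same file, so it may turn out to be harmless, but normalizing the path rules it out as a cause.

```python
import os.path

def genemark_script(genemark_path, script="gmes_petap.pl"):
    """Build the script path without a doubled slash (hypothetical helper)."""
    # normpath collapses repeated separators and drops a trailing slash.
    return os.path.join(os.path.normpath(genemark_path), script)

configured = "/home/user/software/gm_et_linux_64/gmes_petap/"

# Naive concatenation (assumed behaviour) keeps the trailing slash and adds one.
naive = configured + "/" + "gmes_petap.pl"
print(naive)

# Normalized version has a single slash before the script name.
print(genemark_script(configured))
```

The same effect is achieved by hand if GENEMARK_PATH is set without the trailing slash.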
The second problem was with the FASTA headers. My headers had extra text after the sequence name, separated by a | sign (the name itself was >Scaffold1), and that gave an error. GeneMark removes the second part, but something goes wrong when it tries to link the two together again.
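To see whether a genome file has headers like this before starting a long run, something along these lines could be used. It is only a sketch: the set of "safe" characters is my assumption, not a rule documented by GeneMark or BRAKER, and the header suffixes in the demo are invented.

```python
import re

# Assumption: a "simple" header is ">" followed only by letters, digits,
# underscore, dot or hyphen. Anything else (|, spaces, ...) gets flagged.
SIMPLE_HEADER = re.compile(r">[A-Za-z0-9_.-]+")

def suspicious_headers(lines):
    """Return the FASTA header lines that are not a plain sequence name."""
    flagged = []
    for line in lines:
        header = line.rstrip("\r\n")
        if header.startswith(">") and not SIMPLE_HEADER.fullmatch(header):
            flagged.append(header)
    return flagged

# Invented example records, for illustration only.
demo = [">Scaffold1|example_suffix\n", "ACGT\n", ">Scaffold2\n", "TTGA\n"]
print(suspicious_headers(demo))
```

For a real file, pass an open file handle instead of the demo list, for example `suspicious_headers(open("genome.fa"))`, where the file name is a placeholder.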
Using the gedit text editor in Ubuntu, you can easily remove everything after >Scaffold1.
Open search and replace in gedit, tick "Match as regular expression", search for \|.* and leave the replace field empty. With the resulting file as input, BRAKER had no problems and finished successfully.
The search and replace can be adapted as needed. The .* is the wildcard for everything after the | sign, and the backslash is there because a bare | means "or" in a regular expression. So if you swap the | sign for another character, the search will remove that character and everything after it on the line.
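For genomes too large to open comfortably in a text editor, the same clean-up can be scripted. This is a sketch under the assumption that everything from the first delimiter onward in a header can be discarded and that the remaining names are still unique. The function names and the demo records are invented for illustration.

```python
import io

def strip_header(line, delimiter="|"):
    """Cut a FASTA header at the first delimiter; leave sequence lines alone."""
    if not line.startswith(">"):
        return line
    return line.rstrip("\r\n").split(delimiter, 1)[0] + "\n"

def clean_fasta(in_handle, out_handle, delimiter="|"):
    """Copy a FASTA stream line by line, shortening each header on the way."""
    for line in in_handle:
        out_handle.write(strip_header(line, delimiter))

# Demo on invented records held in memory.
demo_in = io.StringIO(">Scaffold1|example_suffix\nACGT\n>Scaffold2|other\nTTGA\n")
demo_out = io.StringIO()
clean_fasta(demo_in, demo_out)
print(demo_out.getvalue(), end="")
```

With real files, open the input for reading and a new output file for writing and pass both handles to `clean_fasta`. Passing a different `delimiter`, such as a space, mirrors changing the | sign in the gedit search.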
Hope this will be helpful for someone.