purge_haplotigs test log on HPC
[aorth@hpc: ~]$ module load purge_haplotigs/1.1.1
[aorth@hpc: ~]$ purge_haplotigs test
purge_haplotigs readhist -b aligned.bam -g contigs.fa
[21-06-2021 12:25:50] bedtools OK!
[21-06-2021 12:25:50] Rscript OK!
[21-06-2021 12:25:50] samtools OK!
[21-06-2021 12:25:50] ALL DEPENDENCIES OK
[21-06-2021 12:25:51] Beginning read-depth histogram generation
[21-06-2021 12:25:51] running genome coverage analysis on aligned.bam
[21-06-2021 12:25:55] Building genome histogram
[21-06-2021 12:25:55] generating histogram
[21-06-2021 12:25:55] generating png image
[21-06-2021 12:26:00]
Pipeline finished! Your histogram is saved to: aligned.bam.histogram.png
[21-06-2021 12:26:00]
Check your histogram to observe where your haploid and diploid peaks are
and choose your low, midpoint, and high cutoffs (check the example histogram png
in the readme). You will need 'aligned.bam.gencov' and the cutoffs for the next
step: 'purge_haplotigs cov'
purge_haplotigs contigcov -i aligned.bam.gencov -l 3 -m 30 -h 70
Analysis finished successfully! Contig coverage stats saved to 'coverage_stats.csv'.
purge_haplotigs purge -g contigs.fa -c coverage_stats.csv -t 4 -r repeats.bed -dotplots -b aligned.bam
[21-06-2021 12:26:00] bedtools OK!
[21-06-2021 12:26:00] minimap2 OK!
[21-06-2021 12:26:01] samtools OK!
[21-06-2021 12:26:01] Rscript OK!
[21-06-2021 12:26:02]
Beginning Pipeline
PARAMETERS:
Genome fasta: contigs.fa
Coverage csv: coverage_stats.csv
Produce dotplots: TRUE
Bam file: aligned.bam
Min cov window len: 5000 bp
max ctg cov windows: 200
Falcon-style naming: FALSE
Repeat annotations: repeats.bed
Threads: 4
I/O intense jobs: 4
Cutoff, alignment: 70 %
Cutoff, repeat: 250 %
Cutoff, suspect: 5 %
Out prefix: curated
minimap2 parameters: '-p 1e-5 -f 0.001 -N 100000'
Running using command:
purge_haplotigs purge -g contigs.fa -c coverage_stats.csv -t 4 -r repeats.bed -dotplots -b aligned.bam
[21-06-2021 12:26:02]
PREPARATION
[21-06-2021 12:26:02] Reading contigs.fa.fai
[21-06-2021 12:26:02] Reading coverage_stats.csv
[21-06-2021 12:26:02] Scanning contigs.fa
[21-06-2021 12:26:02] Getting windowed read-depth for each contig
[21-06-2021 12:26:07] Generating log2(read-depth / average read-depth) coverages for plotting later
[21-06-2021 12:26:07] Scanning tmp_purge_haplotigs/assembly.logcov.bed
[21-06-2021 12:26:07] Building assembly index for minimap2
[21-06-2021 12:26:07] Performing minimap2 alignments
[21-06-2021 12:26:07] Indexing minimap2 alignments
[21-06-2021 12:26:08] Finished minimap2 alignments
[21-06-2021 12:26:08] Reading index of minimap2 alignments
[21-06-2021 12:26:08] Preparing contig hit summary
[21-06-2021 12:26:08] Parsing repeat annotations from repeats.bed
[21-06-2021 12:26:08] Reading contig hits from hit summary
[21-06-2021 12:26:08] Performing pairwise comparisons on contig hits
[21-06-2021 12:26:09] Checking contig assignments for conflicts
[21-06-2021 12:26:09] CONFLICT: 000001F and it's match 000002F both flagged for reassignment
[21-06-2021 12:26:09] Keeping longer contig 000001F
[21-06-2021 12:26:09] Logging reassignments and checking for convergence
[21-06-2021 12:26:09] Convergence not reached, more passes needed
[21-06-2021 12:26:09] Reading contig hits from hit summary
[21-06-2021 12:26:09] Performing pairwise comparisons on contig hits
[21-06-2021 12:26:09] Checking contig assignments for conflicts
[21-06-2021 12:26:09] Logging reassignments and checking for convergence
[21-06-2021 12:26:09] Convergence reached!
[21-06-2021 12:26:09] Checking for over-purging
[21-06-2021 12:26:23] Fixing over-purged contigs
[21-06-2021 12:26:23]
GENERATING OUTPUT
[21-06-2021 12:26:23] Writing contig associations
[21-06-2021 12:26:23] Writing the reassignment table and new assembly files
[21-06-2021 12:26:23]
PURGE HAPLOTIGS HAS COMPLETED SUCCESSFULLY!
samtools faidx curated.fasta
samtools faidx curated.haplotigs.fasta
purge_haplotigs ncbiplace -p curated.fasta -h curated.haplotigs.fasta -t 4 -f
[21-06-2021 12:26:24] minimap2 OK!
[21-06-2021 12:26:24] samtools OK!
[21-06-2021 12:26:24]
Beginning Pipeline
PARAMETERS:
Primary contigs FASTA curated.fasta
Haplotigs FASTA curated.haplotigs.fasta
Out FASTA ncbi_placements.tsv
Threads 4
Coverage align cutoff 50 %
RUNNING USING COMMAND:
purge_haplotigs place -p curated.fasta -h curated.haplotigs.fasta -t 4 -f
[21-06-2021 12:26:24] Reading contig lengths
[21-06-2021 12:26:24] Running minimap2 hit search
[21-06-2021 12:26:24] Minimap2 hit search done
[21-06-2021 12:26:24] Reading minimap2 alignments
[21-06-2021 12:26:24] Getting best hits
[21-06-2021 12:26:24] Getting alignments coords for best hits
[21-06-2021 12:26:24] Renaming contigs and haplotigs in the FALCON Unzip format
[21-06-2021 12:26:24] Writing placement file
[21-06-2021 12:26:24] Done!
md5sum -c validate.md5
curated.artefacts.fasta: OK
curated.contig_associations.log: OK
curated.FALC.fasta: OK
curated.fasta: OK
curated.haplotigs.FALC.fasta: OK
curated.haplotigs.fasta: OK
curated.reassignments.tsv: OK
ncbi_placements.tsv: OK
\n\nPurge Haplotigs successfully validated with default settings\n\n
ALL TESTS PASSED