$ ls exampleFiles
.gitkeep MW539688.1.fasta NC_016067.1.fasta test.fa
ilDeiPorc1.reads.fa MW539688.1.gb NC_016067.1.gb
$ time docker run --rm -w /data/ -v /home/jaruga/tmp/mitohifi/exampleFiles/:/data/ -t docker.io/biocontainers/mitohifi:2.2_cv1 mitohifi.py -c /data/test.fa -f /data/NC_016067.1.fasta -g /data/NC_016067.1.gb -t 4 -o 2
2022-10-25 20:00:44 [INFO] Welcome to MitoHifi v2. Starting pipeline...
2022-10-25 20:00:44 [INFO] Length of related mitogenome is: 15659 bp
2022-10-25 20:00:44 [INFO] Number of genes on related mitogenome: 37
2022-10-25 20:00:44 [INFO] Running MitoHifi pipeline in contigs mode...
2022-10-25 20:00:44 [INFO] 1. Fixing potentially conflicting FASTA headers
2022-10-25 20:00:44 [INFO] 2. Let's run the blast of the contigs versus the close-related mitogenome
2022-10-25 20:00:44 [INFO] 2.1. Creating BLAST database:
2022-10-25 20:00:44 [INFO] makeblastdb -in /data/NC_016067.1.fasta -dbtype nucl
2022-10-25 20:00:44 [INFO] Makeblastdb done.
2022-10-25 20:00:44 [INFO] 2.2. Running blast of contigs against close-related mitogenome:
2022-10-25 20:00:44 [INFO] blastn -query /data/test.fa -db /data/NC_016067.1.fasta -num_threads 4 -out contigs.blastn -outfmt 6 std qlen slen
2022-10-25 20:00:50 [INFO] Blast done.
2022-10-25 20:00:50 [INFO] 3. Filtering BLAST output to select target sequences
2022-10-25 20:00:50 [INFO] Filtering thresholds applied:
2022-10-25 20:00:50 [INFO] Minimum query percentage = 50
2022-10-25 20:00:50 [INFO] Minimum query length = 80% subject length
2022-10-25 20:00:50 [INFO] Maximum query length = 5 times subject length
2022-10-25 20:00:51 [INFO] Filtering BLAST finished. A list of the filtered contigs was saved on ./contigs_filtering/contigs_ids.txt file
2022-10-25 20:00:51 [INFO] 4. Now we are going to circularize, annotate and rotate each filtered contig. Those are potential mitogenome(s).
2022-10-25 20:00:51 [INFO] Working with contig tig00007572_1
2022-10-25 20:00:51 [INFO] Working with contig tig00007550_1
2022-10-25 20:00:51 [INFO] Started tig00007550_1 circularization
2022-10-25 20:00:51 [INFO] Started tig00007572_1 circularization
2022-10-25 20:00:51 [INFO] tig00007572_1 circularization done. Circularization info saved on ./potential_contigs/tig00007572_1/tig00007572_1.circularisationCheck.txt
2022-10-25 20:00:51 [INFO] Started tig00007572_1 (MitoFinder) annotation
2022-10-25 20:00:51 [INFO] tig00007550_1 circularization done. Circularization info saved on ./potential_contigs/tig00007550_1/tig00007550_1.circularisationCheck.txt
2022-10-25 20:00:51 [INFO] Started tig00007550_1 (MitoFinder) annotation
2022-10-25 20:03:51 [INFO] tig00007550_1 annotation done. Annotation log saved on ./potential_contigs/tig00007550_1/tig00007550_1.annotation_MitoFinder.log
2022-10-25 20:03:52 [INFO] tig00007572_1 annotation done. Annotation log saved on ./potential_contigs/tig00007572_1/tig00007572_1.annotation_MitoFinder.log
2022-10-25 20:03:52 [INFO] Started tig00007572_1 rotation.
2022-10-25 20:03:52 [INFO] Started tig00007550_1 rotation.
2022-10-25 20:03:52 [INFO] Rotation of tig00007572_1 done. Rotated is at tig00007572_1.mitogenome.rotated.fa
2022-10-25 20:03:52 [INFO] Rotation of tig00007550_1 done. Rotated is at tig00007550_1.mitogenome.rotated.fa
Gene ND3 contains frameshift
Gene COX3 contains frameshift
Gene ATP6 contains frameshift
Gene COX2 contains frameshift
Gene COX1 contains frameshift
Gene ND2 contains frameshift
Gene ND1 contains frameshift
Gene CYTB contains frameshift
Gene ND6 contains frameshift
Gene ND4L contains frameshift
Gene ND4 contains frameshift
Gene ND5 contains frameshift
Gene ND1 contains frameshift
Gene CYTB contains frameshift
Gene ND6 contains frameshift
Gene ND4L contains frameshift
Gene ND4 contains frameshift
Gene ND5 contains frameshift
Gene ND3 contains frameshift
Gene COX3 contains frameshift
Gene ATP6 contains frameshift
Gene COX2 contains frameshift
Gene COX1 contains frameshift
Gene ND2 contains frameshift
2022-10-25 20:03:52 [INFO] 5. Now the rotated contigs will be aligned
2022-10-25 20:03:52 [INFO] List of contigs that will be aligned: ['tig00007572_1.mitogenome.rotated.fa', 'tig00007550_1.mitogenome.rotated.fa']
2022-10-25 20:03:52 [INFO] MAFFT alignment will be called with:
mafft --quiet --clustalout --thread 4 all_mitogenomes.rotated.fa > all_mitogenomes.rotated.aligned.fa
2022-10-25 20:03:52 [INFO] Alignment done and saved at ./final_mitogenome_choice/all_mitogenomes.rotated.aligned.fa
2022-10-25 20:03:52 [INFO] 6. Now we will choose the most representative contig
/bin/MitoHiFi/getReprContig.py:96: UserWarning: Warning: representative contig contains frameshifts
warnings.warn("Warning: representative contig contains frameshifts")
2022-10-25 20:03:52 [INFO] Representative contig is tig00007550_1 that belongs to Cluster 0. This contig will be our final mitogenome. See all contigs and clusters in cdhit.out.clstr
2022-10-25 20:06:23 [INFO] 7. Calculating final stats for final mitogenome and other potential contigs.
Stats will be saved on contigs_stats.tsv file.
Gene ND3 contains frameshift
Gene COX3 contains frameshift
Gene ATP6 contains frameshift
Gene COX2 contains frameshift
Gene COX1 contains frameshift
Gene ND2 contains frameshift
Gene ND1 contains frameshift
Gene CYTB contains frameshift
Gene ND6 contains frameshift
Gene ND4L contains frameshift
Gene ND4 contains frameshift
Gene ND5 contains frameshift
2022-10-25 20:06:23 [INFO] Pipeline finished!
2022-10-25 20:06:23 [INFO] Run time: 338.59 seconds
real 5m41.835s
user 0m0.044s
sys 0m0.032s
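
The "Gene ... contains frameshift" lines above repeat in three blocks and are easy to misread by eye. Below is a minimal sketch, not part of MitoHiFi, that tallies them per gene from a saved copy of this output. The function name `count_frameshift_warnings` and the idea of piping the log through standard input are assumptions made for illustration; the only thing taken from the log is the literal warning text.

```python
import re
import sys
from collections import Counter

# Matches the warning lines exactly as they appear in the log above,
# e.g. "Gene ND3 contains frameshift".
FRAMESHIFT_RE = re.compile(r"^Gene (\S+) contains frameshift$")


def count_frameshift_warnings(log_text):
    """Return a Counter mapping gene name -> number of warning lines seen."""
    counts = Counter()
    for line in log_text.splitlines():
        match = FRAMESHIFT_RE.match(line.strip())
        if match:
            counts[match.group(1)] += 1
    return counts


if __name__ == "__main__":
    # Hypothetical usage: python count_frameshifts.py < saved_run.log
    counts = count_frameshift_warnings(sys.stdin.read())
    for gene, n in sorted(counts.items()):
        print(f"{gene}\t{n}")
    print(f"distinct genes flagged: {len(counts)}")
```

Applied to the output above, this should report the same 12 genes in each of the three blocks, which suggests the warnings come from repeated checks of the two candidate contigs and the final mitogenome rather than from 36 separate problems.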