Created
October 25, 2022 16:41
-
-
Save junaruga/5daad4a92305c896bc86cd7f4e74ab6d to your computer and use it in GitHub Desktop.
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
$ wc -l ilDeiPorc1.reads.* | |
244682 ilDeiPorc1.reads.fa | |
20045 ilDeiPorc1.reads.small.fa | |
264727 total | |
$ time docker run --rm -w /data/ -v /home/jaruga/tmp/mitohifi/exampleFiles/:/data/ -t docker.io/biocontainers/mitohifi:2.2_cv1 mitohifi.py -r /data/ilDeiPorc1.reads.small.fa -f /data/MW539688.1.fasta -g /data/MW539688.1.gb -t 4 -o 2 /data/ -v /home/jaruga/tmp/mitohifi/exampleFiles/:/data/ -t docker.io/biocontainers/mitohifi:2.2_cv1 mitohifi.py -r /data/ilDeiPorc1.reads.small.fa -f /data/MW539688.1.fasta -g /dat2022-10-25 16:33:56 [INFO] Welcome to MitoHifi v2. Starting pipeline... | |
2022-10-25 16:33:56 [INFO] Length of related mitogenome is: 15354 bp | |
2022-10-25 16:33:56 [INFO] Number of genes on related mitogenome: 37 | |
2022-10-25 16:33:56 [INFO] Running MitoHifi pipeline in reads mode... | |
2022-10-25 16:33:56 [INFO] 1. First we map your Pacbio HiFi reads to the close-related mitogenome | |
2022-10-25 16:33:56 [INFO] minimap2 -t 4 --secondary=no -ax map-pb /data/MW539688.1.fasta /data/ilDeiPorc1.reads.small.fa | samtools view -@ 4 -S -b -F4 -F 0x800 > reads.HiFiMapped.bam | |
2022-10-25 16:33:56 [INFO] 2. Now we filter out any mapped reads that are larger than the reference mitogenome to avoid NUMTS | |
2022-10-25 16:33:56 [INFO] 2.1 First we convert the mapped reads from BAM to FASTA format: | |
2022-10-25 16:33:56 [INFO] samtools fasta reads.HiFiMapped.bam > gbk.HiFiMapped.bam.fasta | |
2022-10-25 16:33:56 [INFO] Total number of mapped reads: 137 | |
2022-10-25 16:33:56 [INFO] 2.2 Then we filter reads that are larger than 15354 bp | |
2022-10-25 16:33:56 [INFO] Number of filtered reads: 137 | |
2022-10-25 16:33:56 [INFO] 3. Now let's run hifiasm to assemble the mapped and filtered reads! | |
2022-10-25 16:33:56 [INFO] hifiasm --primary -t 4 -f 0 -o gbk.HiFiMapped.bam.filtered.assembled gbk.HiFiMapped.bam.filtered.fasta | |
2022-10-25 16:34:06 [INFO] 4. Let's run the blast of the contigs versus the close-related mitogenome | |
2022-10-25 16:34:06 [INFO] 4.1. Creating BLAST database: | |
2022-10-25 16:34:06 [INFO] makeblastdb -in /data/MW539688.1.fasta -dbtype nucl | |
2022-10-25 16:34:06 [INFO] Makeblastdb done. | |
2022-10-25 16:34:06 [INFO] 4.2. Running blast of contigs against close-related mitogenome: | |
2022-10-25 16:34:06 [INFO] blastn -query hifiasm.contigs.fasta -db /data/MW539688.1.fasta -num_threads 4 -out contigs.blastn -outfmt 6 std qlen slen | |
2022-10-25 16:34:06 [INFO] Blast done. | |
2022-10-25 16:34:06 [INFO] 5. Filtering BLAST output to select target sequences | |
2022-10-25 16:34:06 [INFO] Filtering thresholds applied: | |
2022-10-25 16:34:06 [INFO] Minimum query percentage = 50 | |
2022-10-25 16:34:06 [INFO] Minimum query length = 80% subject length | |
2022-10-25 16:34:06 [INFO] Maximum query length = 5 times subject length | |
2022-10-25 16:34:06 [INFO] Filtering BLAST finished. A list of the filtered contigs was saved on ./contigs_filtering/contigs_ids.txt file | |
2022-10-25 16:34:06 [INFO] 6. Now we are going to circularize, annotate and rotate each filtered contig. Those are potential mitogenome(s). | |
2022-10-25 16:34:06 [INFO] Working with contig ptg000001l | |
2022-10-25 16:34:06 [INFO] Started ptg000001l circularization | |
2022-10-25 16:34:07 [INFO] ptg000001l circularization done. Circularization info saved on ./potential_contigs/ptg000001l/ptg000001l.circularisationCheck.txt | |
2022-10-25 16:34:07 [INFO] Started ptg000001l (MitoFinder) annotation | |
2022-10-25 16:36:52 [INFO] ptg000001l annotation done. Annotation log saved on ./potential_contigs/ptg000001l/ptg000001l.annotation_MitoFinder.log | |
2022-10-25 16:36:52 [INFO] Started ptg000001l rotation. | |
2022-10-25 16:36:52 [INFO] Rotation of ptg000001l done. Rotated is at ptg000001l.mitogenome.rotated.fa | |
Gene COX3 contains frameshift | |
Gene ATP6 contains frameshift | |
Gene COX2 contains frameshift | |
Gene COX1 contains frameshift | |
Gene ND2 contains frameshift | |
Gene ND1 contains frameshift | |
Gene CYTB contains frameshift | |
Gene ND4L contains frameshift | |
Gene ND4 contains frameshift | |
Gene ND5 contains frameshift | |
Gene ND3 contains frameshift | |
2022-10-25 16:36:52 [INFO] 7. Now the rotated contigs will be aligned | |
2022-10-25 16:36:52 [INFO] List of contigs that will be aligned: ['ptg000001l.mitogenome.rotated.fa'] | |
2022-10-25 16:36:52 [INFO] MAFFT alignment will be called with: | |
mafft --quiet --clustalout --thread 4 all_mitogenomes.rotated.fa > all_mitogenomes.rotated.aligned.fa | |
2022-10-25 16:36:52 [INFO] Alignment done and saved at ./final_mitogenome_choice/all_mitogenomes.rotated.aligned.fa | |
2022-10-25 16:36:52 [INFO] 8. Now we will choose the most representative contig | |
/bin/MitoHiFi/getReprContig.py:96: UserWarning: Warning: representative contig contains frameshifts | |
warnings.warn("Warning: representative contig contains frameshifts") | |
2022-10-25 16:36:52 [INFO] Representative contig is ptg000001l that belongs to Cluster 0. This contig will be our final mitogenome. See all contigs and clusters in cdhit.out.clstr | |
2022-10-25 16:39:24 [INFO] 9. Calculating final stats for final mitogenome and other potential contigs. | |
Stats will be saved on contigs_stats.tsv file. | |
Gene ND3 contains frameshift | |
Gene COX3 contains frameshift | |
Gene ATP6 contains frameshift | |
Gene COX2 contains frameshift | |
Gene COX1 contains frameshift | |
Gene ND2 contains frameshift | |
Gene ND1 contains frameshift | |
Gene CYTB contains frameshift | |
Gene ND4L contains frameshift | |
Gene ND4 contains frameshift | |
Gene ND5 contains frameshift | |
2022-10-25 16:39:24 [INFO] Pipeline finished! | |
2022-10-25 16:39:24 [INFO] Run time: 328.83 seconds | |
real 5m31.986s | |
user 0m0.048s | |
sys 0m0.032s |
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment