@jelber2
Last active October 25, 2023 09:51
Zymo_Mock_HMW_SUP_duplex_reads_with_peregrine_2021
This comparison used the reference genomes from https://zymo-files.s3.amazonaws.com/BioPool/ZymoBIOMICS.STD.refseq.v2.zip, except where noted below.
Bacillus_subtilis

| Metric | RAW_SUP_Duplex | pg_asm_1x_corrected_SUP_duplex | pg_asm_2x_corrected_SUP_duplex | pg_asm_3x_corrected_SUP_duplex |
|---|---|---|---|---|
| target bases | 4041255 | 4041255 | 4041255 | 4041255 |
| target bases overlapping regions | 4041255 (100.00%) | 4041255 (100.00%) | 4041255 (100.00%) | 4041255 (100.00%) |
| reference bases covered by exactly one contig | 1159311 | 3791080 | 3642732 | 3786472 |
| substitutions (ts/tv) | 26 (1.889) | 18 (2.600) | 18 (2.600) | 17 (2.400) |
| 1bp deletions | 20 | 3 | 4 | 4 |
| 1bp insertions | 5 | 1 | 1 | 1 |
| 2bp deletions | 1 | 0 | 0 | 0 |
| 2bp insertions | 0 | 0 | 0 | 0 |
| [3,50) deletions | 0 | 0 | 0 | 0 |
| [3,50) insertions | 0 | 0 | 0 | 0 |
| [50,1000) deletions | 0 | 0 | 0 | 0 |
| [50,1000) insertions | 0 | 1 | 1 | 1 |
| >=1000 deletions | 0 | 0 | 0 | 0 |
| >=1000 insertions | 0 | 2 | 2 | 2 |
| samtools stats read alignment (map Qual=60) error rate | 1.300449e-03 | 7.950136e-04 | 7.869161e-04 | 7.867193e-04 |

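The error rates above can be easier to compare as Phred-scaled values, Q = -10 * log10(error rate). The following is a minimal sketch of that conversion using the Bacillus_subtilis numbers; the `phred` helper and the dictionary layout are illustrative assumptions, not part of the original analysis.

```python
import math

# Error rates copied from the Bacillus_subtilis "error rate" row above.
error_rates = {
    "RAW_SUP_Duplex": 1.300449e-03,
    "pg_asm_1x_corrected_SUP_duplex": 7.950136e-04,
    "pg_asm_2x_corrected_SUP_duplex": 7.869161e-04,
    "pg_asm_3x_corrected_SUP_duplex": 7.867193e-04,
}


def phred(error_rate: float) -> float:
    """Convert a per-base error rate to a Phred-scaled quality (hypothetical helper)."""
    return -10.0 * math.log10(error_rate)


# Round to one decimal place for display.
phred_scores = {name: round(phred(rate), 1) for name, rate in error_rates.items()}

for name, q in phred_scores.items():
    print(f"{name}\tQ{q}")
```

Because the scale is logarithmic, the small differences between the 1x, 2x and 3x correction rounds nearly vanish after conversion, while the gap to the raw reads stays visible.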
### This reference is from ATCC genomes
Enterococcus_faecalis

| Metric | RAW_SUP_Duplex | pg_asm_1x_corrected_SUP_duplex | pg_asm_2x_corrected_SUP_duplex | pg_asm_3x_corrected_SUP_duplex |
|---|---|---|---|---|
| target bases | 2790314 | 2790314 | 2790314 | 2790314 |
| target bases overlapping regions | 2526050 (90.53%) | 2526050 (90.53%) | 2526050 (90.53%) | 2526050 (90.53%) |
| reference bases covered by exactly one contig | 325487 | 1243807 | 1327819 | 1335426 |
| substitutions (ts/tv) | 2519 (3.164) | 9279 (3.509) | 9895 (3.514) | 9960 (3.509) |
| 1bp deletions | 14 | 57 | 58 | 58 |
| 1bp insertions | 12 | 53 | 57 | 57 |
| 2bp deletions | 0 | 4 | 4 | 4 |
| 2bp insertions | 1 | 2 | 2 | 2 |
| [3,50) deletions | 3 | 11 | 12 | 12 |
| [3,50) insertions | 3 | 16 | 16 | 16 |
| [50,1000) deletions | 0 | 4 | 4 | 4 |
| [50,1000) insertions | 0 | 4 | 8 | 8 |
| >=1000 deletions | 4 | 10 | 10 | 10 |
| >=1000 insertions | 1 | 7 | 7 | 7 |
| samtools stats read alignment (map Qual=60) error rate | 5.005881e-02 | 5.060479e-02 | 5.062398e-02 | 5.062993e-02 |

### This reference is from Zymo
Enterococcus_faecalis

| Metric | RAW_SUP_Duplex | pg_asm_1x_corrected_SUP_duplex | pg_asm_2x_corrected_SUP_duplex | pg_asm_3x_corrected_SUP_duplex |
|---|---|---|---|---|
| reference bases covered by exactly one contig | 1116406 | 2591694 | 2684232 | 2682675 |
| substitutions (ts/tv) | 18 (8.000) | 47 (2.615) | 47 (2.615) | 47 (2.615) |
| 1bp deletions | 2 | 4 | 4 | 4 |
| 1bp insertions | 0 | 5 | 4 | 4 |
| 2bp deletions | 0 | 0 | 0 | 0 |
| 2bp insertions | 0 | 1 | 1 | 1 |
| [3,50) deletions | 0 | 0 | 0 | 0 |
| [3,50) insertions | 0 | 2 | 2 | 2 |
| [50,1000) deletions | 0 | 0 | 0 | 0 |
| [50,1000) insertions | 0 | 0 | 0 | 0 |
| >=1000 deletions | 0 | 0 | 0 | 0 |
| >=1000 insertions | 0 | 0 | 0 | 0 |
| samtools stats read alignment (map Qual=60) error rate | 8.843090e-03 | 8.610355e-03 | 8.618782e-03 | 8.617423e-03 |

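One way to compare the two Enterococcus_faecalis references above is to normalize substitution counts by the number of reference bases covered by exactly one contig. The sketch below does this for the pg_asm_3x_corrected_SUP_duplex column. It assumes the substitution counts refer to those singly covered bases, which the gist does not state, and the function and dictionary names are hypothetical.

```python
# Values copied from the pg_asm_3x_corrected_SUP_duplex column of the two
# Enterococcus_faecalis blocks above.
blocks = {
    "ATCC reference": {"substitutions": 9960, "single_contig_bases": 1335426},
    "Zymo reference": {"substitutions": 47, "single_contig_bases": 2682675},
}


def substitutions_per_100kb(substitutions: int, covered_bases: int) -> float:
    """Substitutions per 100 kb of singly covered reference (hypothetical helper)."""
    return substitutions / covered_bases * 100_000


rates = {
    name: substitutions_per_100kb(v["substitutions"], v["single_contig_bases"])
    for name, v in blocks.items()
}

for name, rate in rates.items():
    print(f"{name}: {rate:.2f} substitutions per 100 kb")
```

Under that assumption, the ATCC reference shows a substitution density hundreds of times higher than the Zymo reference. That would be consistent with the ATCC genome differing from the sequenced strain, although this is an interpretation rather than a result reported here.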
Escherichia_coli

| Metric | RAW_SUP_Duplex | pg_asm_1x_corrected_SUP_duplex | pg_asm_2x_corrected_SUP_duplex | pg_asm_3x_corrected_SUP_duplex |
|---|---|---|---|---|
| target bases | 4875441 | 4875441 | 4875441 | 4875441 |
| target bases overlapping regions | 4875158 (99.99%) | 4765089 (97.74%) | 4765089 (97.74%) | 4765089 (97.74%) |
| reference bases covered by exactly one contig | 2583750 | 4670961 | 4670961 | 4670961 |
| substitutions (ts/tv) | 402 (1.892) | 553 (1.957) | 553 (1.957) | 553 (1.957) |
| 1bp deletions | 33 | 10 | 10 | 10 |
| 1bp insertions | 16 | 15 | 15 | 15 |
| 2bp deletions | 4 | 2 | 2 | 2 |
| 2bp insertions | 3 | 1 | 1 | 1 |
| [3,50) deletions | 4 | 1 | 1 | 1 |
| [3,50) insertions | 2 | 4 | 4 | 4 |
| [50,1000) deletions | 0 | 1 | 1 | 1 |
| [50,1000) insertions | 0 | 0 | 0 | 0 |
| >=1000 deletions | 0 | 0 | 0 | 0 |
| >=1000 insertions | 1 | 0 | 1 | 1 |
| samtools stats read alignment (map Qual=60) error rate | 1.070861e-03 | 7.181472e-04 | 7.074496e-04 | 7.071868e-04 |

Listeria_monocytogenes

| Metric | RAW_SUP_Duplex | pg_asm_1x_corrected_SUP_duplex | pg_asm_2x_corrected_SUP_duplex | pg_asm_3x_corrected_SUP_duplex |
|---|---|---|---|---|
| target bases | 2992056 | 2992056 | 2992056 | 2992056 |
| target bases overlapping regions | 2992056 (100.00%) | 2992056 (100.00%) | 2992056 (100.00%) | 2992056 (100.00%) |
| reference bases covered by exactly one contig | 1326181 | 2717713 | 2766521 | 2887788 |
| substitutions (ts/tv) | 2 (1.000) | 3 (2.000) | 2 (1.000) | 3 (2.000) |
| 1bp deletions | 3 | 1 | 0 | 0 |
| 1bp insertions | 1 | 0 | 1 | 1 |
| 2bp deletions | 1 | 0 | 0 | 0 |
| 2bp insertions | 0 | 0 | 0 | 0 |
| [3,50) deletions | 0 | 0 | 0 | 0 |
| [3,50) insertions | 0 | 0 | 0 | 0 |
| [50,1000) deletions | 0 | 0 | 0 | 0 |
| [50,1000) insertions | 0 | 0 | 0 | 0 |
| >=1000 deletions | 0 | 0 | 0 | 0 |
| >=1000 insertions | 0 | 0 | 0 | 0 |
| samtools stats read alignment (map Qual=60) error rate | 4.337566e-04 | 1.859414e-05 | 1.089206e-05 | 1.021797e-05 |

### Something seems strange about this reference, which comes directly from Zymo
Saccharomyces_cerevisiae

| Metric | RAW_SUP_Duplex | pg_asm_1x_corrected_SUP_duplex | pg_asm_2x_corrected_SUP_duplex | pg_asm_3x_corrected_SUP_duplex |
|---|---|---|---|---|
| target bases | 12225336 | 12225336 | 12225336 | 12225336 |
| target bases overlapping regions | 3126602 (25.57%) | 6597540 (53.97%) | 7581784 (62.02%) | 6958316 (56.92%) |
| reference bases covered by exactly one contig | 2317190 | 5541039 | 5998900 | 6117162 |
| substitutions (ts/tv) | 3207 (2.742) | 5539 (2.678) | 5729 (2.767) | 6729 (2.836) |
| 1bp deletions | 299 | 394 | 410 | 450 |
| 1bp insertions | 244 | 355 | 403 | 467 |
| 2bp deletions | 71 | 98 | 94 | 121 |
| 2bp insertions | 55 | 104 | 108 | 114 |
| [3,50) deletions | 75 | 103 | 121 | 114 |
| [3,50) insertions | 54 | 136 | 135 | 150 |
| [50,1000) deletions | 2 | 7 | 8 | 11 |
| [50,1000) insertions | 5 | 12 | 11 | 12 |
| >=1000 deletions | 0 | 1 | 2 | 2 |
| >=1000 insertions | 0 | 2 | 3 | 8 |
| samtools stats read alignment (map Qual=60) error rate | 1.113377e-02 | 1.033118e-02 | 1.019597e-02 | 1.016335e-02 |

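The percentages in the "target bases overlapping regions" row are the overlapping bases divided by the total target bases. As a quick sanity check, the sketch below recomputes them for Saccharomyces_cerevisiae from the counts above; the variable names are illustrative only.

```python
# Counts copied from the Saccharomyces_cerevisiae block above.
target_bases = 12225336
overlapping_bases = {
    "RAW_SUP_Duplex": 3126602,
    "pg_asm_1x_corrected_SUP_duplex": 6597540,
    "pg_asm_2x_corrected_SUP_duplex": 7581784,
    "pg_asm_3x_corrected_SUP_duplex": 6958316,
}

# Percentage of the target covered by aligned regions, per column.
overlap_pct = {
    name: 100.0 * bases / target_bases for name, bases in overlapping_bases.items()
}

for name, pct in overlap_pct.items():
    print(f"{name}\t{pct:.2f}%")
```

Even the best column leaves well over a third of the yeast target without overlapping regions, whereas the bacterial references in this table are covered at 90% or more.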
Salmonella_enterica

| Metric | RAW_SUP_Duplex | pg_asm_1x_corrected_SUP_duplex | pg_asm_2x_corrected_SUP_duplex | pg_asm_3x_corrected_SUP_duplex |
|---|---|---|---|---|
| target bases | 4809318 | 4809318 | 4809318 | 4809318 |
| target bases overlapping regions | 4806012 (99.93%) | 4756440 (98.90%) | 4806012 (99.93%) | 4806012 (99.93%) |
| reference bases covered by exactly one contig | 1186015 | 2021577 | 2855944 | 3301669 |
| substitutions (ts/tv) | 76 (1.714) | 26 (5.500) | 24 (3.000) | 32 (2.556) |
| 1bp deletions | 53 | 3 | 3 | 4 |
| 1bp insertions | 50 | 8 | 12 | 10 |
| 2bp deletions | 6 | 1 | 0 | 0 |
| 2bp insertions | 2 | 0 | 2 | 2 |
| [3,50) deletions | 4 | 0 | 1 | 0 |
| [3,50) insertions | 2 | 0 | 0 | 1 |
| [50,1000) deletions | 0 | 0 | 0 | 0 |
| [50,1000) insertions | 0 | 0 | 0 | 0 |
| >=1000 deletions | 0 | 0 | 0 | 0 |
| >=1000 insertions | 0 | 0 | 0 | 0 |
| samtools stats read alignment (map Qual=60) error rate | 1.193091e-03 | 7.012235e-04 | 6.661453e-04 | 6.602791e-04 |

Staphylococcus_aureus

| Metric | RAW_SUP_Duplex | pg_asm_1x_corrected_SUP_duplex | pg_asm_2x_corrected_SUP_duplex | pg_asm_3x_corrected_SUP_duplex |
|---|---|---|---|---|
| target bases | 2730326 | 2730326 | 2730326 | 2730326 |
| target bases overlapping regions | 2721773 (99.69%) | 2718780 (99.58%) | 2725117 (99.81%) | 2725117 (99.81%) |
| reference bases covered by exactly one contig | 2245965 | 2646579 | 2646408 | 2646412 |
| substitutions (ts/tv) | 11 (1.750) | 12 (2.000) | 12 (2.000) | 12 (2.000) |
| 1bp deletions | 8 | 7 | 7 | 7 |
| 1bp insertions | 1 | 1 | 1 | 1 |
| 2bp deletions | 1 | 1 | 1 | 1 |
| 2bp insertions | 2 | 2 | 2 | 2 |
| [3,50) deletions | 1 | 1 | 1 | 1 |
| [3,50) insertions | 1 | 1 | 1 | 1 |
| [50,1000) deletions | 1 | 1 | 1 | 1 |
| [50,1000) insertions | 0 | 0 | 0 | 0 |
| >=1000 deletions | 0 | 0 | 0 | 0 |
| >=1000 insertions | 0 | 0 | 0 | 0 |
| samtools stats read alignment (map Qual=60) error rate | 4.146390e-04 | 5.414635e-05 | 4.700935e-05 | 4.653782e-05 |
