Supplementary Sequences from Zong, et al., Nat. Biotechnol., 2018, doi:10.1038/nbt.4261
>A3A-PBE
ATGGAGGCCAGCCCGGCTAGCGGCCCAAGGCATCTCATGGACCCGCACATCTTCACCAGCAACTT
CAACAACGGCATCGGCAGGCACAAGACCTACTTGTGCTACGAGGTGGAGAGGCTCGACAACGGAA
CCTCCGTGAAGATGGACCAACACAGGGGGTTCCTCCACAACCAAGCCAAGAACCTCCTCTGCGGC
TTCTACGGCAGGCACGCCGAGTTGAGGTTCCTCGACTTGGTGCCATCCCTCCAACTCGATCCAGC
CCAAATCTACCGCGTGACCTGGTTCATCTCCTGGTCCCCATGCTTCTCCTGGGGTTGCGCCGGCG
AGGTTCGGGCTTTCCTCCAAGAAAACACCCACGTCCGCCTCCGCATTTTCGCCGCCAGGATCTAT
GATTACGACCCTCTCTACAAGGAGGCCCTCCAGATGCTGCGGGACGCCGGTGCTCAGGTGAGTAT
CATGACCTACGACGAGTTCAAGCACTGCTGGGACACCTTCGTTGACCACCAGGGCTGCCCATTCC
AACCATGGGACGGTCTGGATGAACACAGCCAAGCCTTGTCCGGCAGGCTCCGGGCCATCCTCCAA
AACCAGGGGA
ACTCCGGGAGCGAGACGCCAGGCACCTCCGAGTCGGCCACCCCAGAATCT
CTTAA
GGACAAGAAGTACTCGATCGGCCTCGCCATCGGGACGAACTCAGTTGGCTGGGCCGTGATCACCG
ACGAGTACAAGGTGCCCTCTAAGAAGTTCAAGGTCCTGGGGAACACCGACCGCCATTCCATCAAG
AAGAACCTCATCGGCGCTCTCCTGTTCGACAGCGGGGAGACCGCTGAGGCTACGAGGCTCAAGAG
AACCGCTAGGCGCCGGTACACGAGAAGGAAGAACAGGATCTGCTACCTCCAAGAGATTTTCTCCA
ACGAGATGGCCAAGGTTGACGATTCATTCTTCCACCGCCTGGAGGAGTCTTTCCTCGTGGAGGAG
GATAAGAAGCACGAGCGGCATCCCATCTTCGGCAACATCGTGGACGAGGTTGCCTACCACGAGAA
GTACCCTACGATCTACCATCTGCGGAAGAAGCTCGTGGACTCCACCGATAAGGCGGACCTCAGAC
TGATCTACCTCGCTCTGGCCCACATGATCAAGTTCCGCGGCCATTTCCTGATCGAGGGGGATCTC
AACCCAGACAACAGCGATGTTGACAAGCTGTTCATCCAACTCGTGCAGACCTACAACCAACTCTT
CGAGGAGAACCCGATCAACGCCTCTGGCGTGGACGCGAAGGCTATCCTGTCCGCGAGGCTCTCGA
AGTCCAGGAGGCTGGAGAACCTGATCGCTCAGCTCCCAGGCGAGAAGAAGAACGGCCTGTTCGGG
AACCTCATCGCTCTCAGCCTGGGGCTCACCCCGAACTTCAAGTCGAACTTCGATCTCGCTGAGGA
CGCCAAGCTGCAACTCTCCAAGGACACCTACGACGATGACCTCGATAACCTCCTGGCCCAGATCG
GCGATCAATACGCGGACCTGTTCCTCGCTGCCAAGAACCTGTCGGACGCCATCCTCCTGTCAGAT
ATCCTCCGCGTGAACACCGAGATCACGAAGGCTCCACTCTCTGCCTCCATGATCAAGCGCTACGA
CGAGCACCATCAGGATCTGACCCTCCTGAAGGCGCTGGTCCGCCAACAGCTCCCGGAGAAGTACA
AGGAGATTTTCTTCGATCAGTCGAAGAACGGCTACGCTGGGTACATCGACGGCGGGGCCTCACAA
GAGGAGTTCTACAAGTTCATCAAGCCAATCCTGGAGAAGATGGACGGCACGGAGGAGCTCCTGGT
GAAGCTCAACAGGGAGGACCTCCTGCGGAAGCAGAGAACCTTCGATAACGGCAGCATCCCCCACC
AAATCCATCTCGGGGAGCTGCACGCCATCCTGAGAAGGCAAGAGGACTTCTACCCTTTCCTCAAG
GATAACCGGGAGAAGATCGAGAAGATCCTGACCTTCAGAATCCCATACTACGTCGGCCCTCTCGC
GCGGGGGAACTCAAGATTCGCTTGGATGACCCGCAAGTCTGAGGAGACCATCACGCCGTGGAACT
TCGAGGAGGTGGTGGACAAGGGCGCTAGCGCTCAGTCGTTCATCGAGAGGATGACCAACTTCGAC
AAGAACCTGCCCAACGAGAAGGTGCTCCCTAAGCACTCGCTCCTGTACGAGTACTTCACCGTCTA
CAACGAGCTCACGAAGGTGAAGTACGTCACCGAGGGCATGCGCAAGCCAGCGTTCCTGTCCGGGG
AGCAGAAGAAGGCTATCGTGGACCTCCTGTTCAAGACCAACCGGAAGGTCACGGTTAAGCAACTC
AAGGAGGACTACTTCAAGAAGATCGAGTGCTTCGATTCGGTCGAGATCAGCGGCGTTGAGGACCG
CTTCAACGCCAGCCTCGGGACCTACCACGATCTCCTGAAGATCATCAAGGATAAGGACTTCCTGG
ACAACGAGGAGAACGAGGATATCCTGGAGGACATCGTGCTGACCCTCACGCTGTTCGAGGACAGG
GAGATGATCGAGGAGCGCCTGAAGACGTACGCCCATCTCTTCGATGACAAGGTCATGAAGCAACT
CAAGCGCCGGAGATACACCGGCTGGGGGAGGCTGTCCCGCAAGCTCATCAACGGCATCCGGGACA
AGCAGTCCGGGAAGACCATCCTCGACTTCCTCAAGAGCGATGGCTTCGCCAACAGGAACTTCATG
CAACTGATCCACGATGACAGCCTCACCTTCAAGGAGGATATCCAAAAGGCTCAAGTGAGCGGCCA
GGGGGACTCGCTGCACGAGCATATCGCGAACCTCGCTGGCTCCCCCGCGATCAAGAAGGGCATCC
TCCAGACCGTGAAGGTTGTGGACGAGCTCGTGAAGGTCATGGGCCGGCACAAGCCTGAGAACATC
GTCATCGAGATGGCCAGAGAGAACCAAACCACGCAGAAGGGGCAAAAGAACTCTAGGGAGCGCAT
GAAGCGCATCGAGGAGGGCATCAAGGAGCTGGGGTCCCAAATCCTCAAGGAGCACCCAGTGGAGA
ACACCCAACTGCAGAACGAGAAGCTCTACCTGTACTACCTCCAGAACGGCAGGGATATGTACGTG
GACCAAGAGCTGGATATCAACCGCCTCAGCGATTACGACGTCGATCATATCGTTCCCCAGTCTTT
CCTGAAGGATGACTCCATCGACAACAAGGTCCTCACCAGGTCGGACAAGAACCGCGGCAAGTCAG
ATAACGTTCCATCTGAGGAGGTCGTTAAGAAGATGAAGAACTACTGGAGGCAGCTCCTGAACGCC
AAGCTGATCACGCAAAGGAAGTTCGACAACCTCACCAAGGCTGAGAGAGGCGGGCTCTCAGAGCT
GGACAAGGCCGGCTTCATCAAGCGGCAGCTGGTCGAGACCAGACAAATCACGAAGCACGTTGCGC
AAATCCTCGACTCTCGGATGAACACGAAGTACGATGAGAACGACAAGCTGATCAGGGAGGTTAAG
GTGATCACCCTGAAGTCTAAGCTCGTCTCCGACTTCAGGAAGGATTTCCAGTTCTACAAGGTTCG
CGAGATCAACAACTACCACCATGCCCATGACGCTTACCTCAACGCTGTGGTCGGCACCGCTCTGA
TCAAGAAGTACCCAAAGCTGGAGTCCGAGTTCGTGTACGGGGACTACAAGGTTTACGATGTGCGC
AAGATGATCGCCAAGTCGGAGCAAGAGATCGGCAAGGCTACCGCCAAGTACTTCTTCTACTCAAA
CATCATGAACTTCTTCAAGACCGAGATCACGCTGGCCAACGGCGAGATCCGGAAGAGACCGCTCA
TCGAGACCAACGGCGAGACGGGGGAGATCGTGTGGGACAAGGGCAGGGATTTCGCGACCGTCCGC
AAGGTTCTCTCCATGCCCCAGGTGAACATCGTCAAGAAGACCGAGGTCCAAACGGGCGGGTTCTC
AAAGGAGTCTATCCTGCCTAAGCGGAACAGCGACAAGCTCATCGCCAGAAAGAAGGACTGGGACC
CAAAGAAGTACGGCGGGTTCGACAGCCCTACCGTGGCCTACTCGGTCCTGGTTGTGGCGAAGGTT
GAGAAGGGCAAGTCCAAGAAGCTCAAGAGCGTGAAGGAGCTCCTGGGGATCACCATCATGGAGAG
GTCCAGCTTCGAGAAGAACCCAATCGACTTCCTGGAGGCCAAGGGCTACAAGGAGGTGAAGAAGG
ACCTGATCATCAAGCTCCCGAAGTACTCTCTCTTCGAGCTGGAGAACGGCAGGAAGAGAATGCTG
GCTTCCGCTGGCGAGCTCCAGAAGGGGAACGAGCTCGCGCTGCCAAGCAAGTACGTGAACTTCCT
CTACCTGGCTTCCCACTACGAGAAGCTCAAGGGCAGCCCGGAGGACAACGAGCAAAAGCAGCTGT
TCGTCGAGCAGCACAAGCATTACCTCGACGAGATCATCGAGCAAATCTCCGAGTTCAGCAAGCGC
GTGATCCTCGCCGACGCGAACCTGGATAAGGTCCTCTCCGCCTACAACAAGCACCGGGACAAGCC
CATCAGAGAGCAAGCGGAGAACATCATCCATCTCTTCACCCTGACGAACCTCGGCGCTCCTGCTG
CTTTCAAGTACTTCGACACCACGATCGATCGGAAGAGATACACCTCCACGAAGGAGGTCCTGGAC
GCGACCCTCATCCACCAGTCGATCACCGGCCTGTACGAGACGAGGATCGACCTCTCACAACTCGG
CGGGGAT
aagagacccgcagcaaccaagaaggcagggcaagcaaagaagaagaag
ACGCGTGACT
CCGGCGGCAGCACCAACCTGTCCGACATCATCGAGAAGGAGACGGGCAAGCAACTCGTGATCCAG
GAGAGCATCCTCATGCTGCCAGAGGAGGTGGAGGAGGTCATCGGCAACAAGCCAGAGTCCGACAT
CCTGGTGCACACCGCCTACGACGAGTCCACCGACGAGAACGTCATGCTCCTGACCAGCGACGCCC
CAGAGTACAAGCCATGGGCCCTCGTCATCCAGGACAGCAACGGGGAGAACAAGATCAAGATGCTG
Tcgggggggagcccaaagaagaagcggaaggtg
TAG
>A3A-Gam
ATGGCGAAGCCGGCCAAGAGGATCAAATCCGCTGCTGCTGCCTACGTGCCGCAAAAT
AGGGATGCCGTGATCACCGACATCAAGAGGATCGGCGATCTGCAGAGGGAGGCGTCT
CGTCTCGAAACTGAGATGAACGACGCGATCGCGGAGATCACCGAGAAGTTCGCCGCT
CGTATCGCCCCGATCAAGACCGACATCGAAACTCTCTCCAAGGGCGTGCAAGGTTGG
TGCGAGGCCAATAGGGACGAGCTCACCAATGGCGGCAAGGTGAAGACCGCCAACCTC
GTGACCGGCGATGTGTCTTGGAGGGTGAGGCCACCATCCGTGAGCATTCGTGGTATG
GACGCCGTGATGGAAACTCTCGAGCGCCTCGGCCTCCAAAGGTTCATCCGCACCAAG
CAAGAAATCAACAAGGAGGCGATCCTCCTCGAGCCAAAAGCCGTGGCCGGCGTGGCC
GGCATCACAGTCAAGTCCGGCATCGAGGACTTCTCCATCATCCCGTTCGAGCAAGAA
GCCGGCATC
TCCGGCAGCGAGACGCCAGGCACCTCCGAGAGCGCTACGCCTGAATCC
AGGCCT
GAGGCCAGCCCGGCTAGCGGCCCAAGGCATCTCATGGACCCGCACATCTTC
ACCAGCAACTTCAACAACGGCATCGGCAGGCACAAGACCTACTTGTGCTACGAGGTG
GAGAGGCTCGACAACGGAACCTCCGTGAAGATGGACCAACACAGGGGGTTCCTCCAC
AACCAAGCCAAGAACCTCCTCTGCGGCTTCTACGGCAGGCACGCCGAGTTGAGGTTC
CTCGACTTGGTGCCATCCCTCCAACTCGATCCAGCCCAAATCTACCGCGTGACCTGG
TTCATCTCCTGGTCCCCATGCTTCTCCTGGGGTTGCGCCGGCGAGGTTCGGGCTTTC
CTCCAAGAAAACACCCACGTCCGCCTCCGCATTTTCGCCGCCAGGATCTATGATTAC
GACCCTCTCTACAAGGAGGCCCTCCAGATGCTGCGGGACGCCGGTGCTCAGGTGAGT
ATCATGACCTACGACGAGTTCAAGCACTGCTGGGACACCTTCGTTGACCACCAGGGC
TGCCCATTCCAACCATGGGACGGTCTGGATGAACACAGCCAAGCCTTGTCCGGCAGG
CTCCGGGCCATCCTCCAAAACCAGGGGAAC
AGCGGAGGATCTTCCGGAGGATCTAGC
GGCTCCGAGACACCAGGAACATCCGAAAGCGCTACACCAGAATCTAGCGGAGGCTCT
TCCGGAGGATCT
CTTAAGGACAAGAAGTACTCGATCGGCCTCGCCATCGGGACGAAC
TCAGTTGGCTGGGCCGTGATCACCGACGAGTACAAGGTGCCCTCTAAGAAGTTCAAG
GTCCTGGGGAACACCGACCGCCATTCCATCAAGAAGAACCTCATCGGCGCTCTCCTG
TTCGACAGCGGGGAGACCGCTGAGGCTACGAGGCTCAAGAGAACCGCTAGGCGCCGG
TACACGAGAAGGAAGAACAGGATCTGCTACCTCCAAGAGATTTTCTCCAACGAGATG
GCCAAGGTTGACGATTCATTCTTCCACCGCCTGGAGGAGTCTTTCCTCGTGGAGGAG
GATAAGAAGCACGAGCGGCATCCCATCTTCGGCAACATCGTGGACGAGGTTGCCTAC
CACGAGAAGTACCCTACGATCTACCATCTGCGGAAGAAGCTCGTGGACTCCACCGAT
AAGGCGGACCTCAGACTGATCTACCTCGCTCTGGCCCACATGATCAAGTTCCGCGGC
CATTTCCTGATCGAGGGGGATCTCAACCCAGACAACAGCGATGTTGACAAGCTGTTC
ATCCAACTCGTGCAGACCTACAACCAACTCTTCGAGGAGAACCCGATCAACGCCTCT
GGCGTGGACGCGAAGGCTATCCTGTCCGCGAGGCTCTCGAAGTCCAGGAGGCTGGAG
AACCTGATCGCTCAGCTCCCAGGCGAGAAGAAGAACGGCCTGTTCGGGAACCTCATC
GCTCTCAGCCTGGGGCTCACCCCGAACTTCAAGTCGAACTTCGATCTCGCTGAGGAC
GCCAAGCTGCAACTCTCCAAGGACACCTACGACGATGACCTCGATAACCTCCTGGCC
CAGATCGGCGATCAATACGCGGACCTGTTCCTCGCTGCCAAGAACCTGTCGGACGCC
ATCCTCCTGTCAGATATCCTCCGCGTGAACACCGAGATCACGAAGGCTCCACTCTCT
GCCTCCATGATCAAGCGCTACGACGAGCACCATCAGGATCTGACCCTCCTGAAGGCG
CTGGTCCGCCAACAGCTCCCGGAGAAGTACAAGGAGATTTTCTTCGATCAGTCGAAG
AACGGCTACGCTGGGTACATCGACGGCGGGGCCTCACAAGAGGAGTTCTACAAGTTC
ATCAAGCCAATCCTGGAGAAGATGGACGGCACGGAGGAGCTCCTGGTGAAGCTCAAC
AGGGAGGACCTCCTGCGGAAGCAGAGAACCTTCGATAACGGCAGCATCCCCCACCAA
ATCCATCTCGGGGAGCTGCACGCCATCCTGAGAAGGCAAGAGGACTTCTACCCTTTC
CTCAAGGATAACCGGGAGAAGATCGAGAAGATCCTGACCTTCAGAATCCCATACTAC
GTCGGCCCTCTCGCGCGGGGGAACTCAAGATTCGCTTGGATGACCCGCAAGTCTGAG
GAGACCATCACGCCGTGGAACTTCGAGGAGGTGGTGGACAAGGGCGCTAGCGCTCAG
TCGTTCATCGAGAGGATGACCAACTTCGACAAGAACCTGCCCAACGAGAAGGTGCTC
CCTAAGCACTCGCTCCTGTACGAGTACTTCACCGTCTACAACGAGCTCACGAAGGTG
AAGTACGTCACCGAGGGCATGCGCAAGCCAGCGTTCCTGTCCGGGGAGCAGAAGAAG
GCTATCGTGGACCTCCTGTTCAAGACCAACCGGAAGGTCACGGTTAAGCAACTCAAG
GAGGACTACTTCAAGAAGATCGAGTGCTTCGATTCGGTCGAGATCAGCGGCGTTGAG
GACCGCTTCAACGCCAGCCTCGGGACCTACCACGATCTCCTGAAGATCATCAAGGAT
AAGGACTTCCTGGACAACGAGGAGAACGAGGATATCCTGGAGGACATCGTGCTGACC
CTCACGCTGTTCGAGGACAGGGAGATGATCGAGGAGCGCCTGAAGACGTACGCCCAT
CTCTTCGATGACAAGGTCATGAAGCAACTCAAGCGCCGGAGATACACCGGCTGGGGG
AGGCTGTCCCGCAAGCTCATCAACGGCATCCGGGACAAGCAGTCCGGGAAGACCATC
CTCGACTTCCTCAAGAGCGATGGCTTCGCCAACAGGAACTTCATGCAACTGATCCAC
GATGACAGCCTCACCTTCAAGGAGGATATCCAAAAGGCTCAAGTGAGCGGCCAGGGG
GACTCGCTGCACGAGCATATCGCGAACCTCGCTGGCTCCCCCGCGATCAAGAAGGGC
ATCCTCCAGACCGTGAAGGTTGTGGACGAGCTCGTGAAGGTCATGGGCCGGCACAAG
CCTGAGAACATCGTCATCGAGATGGCCAGAGAGAACCAAACCACGCAGAAGGGGCAA
AAGAACTCTAGGGAGCGCATGAAGCGCATCGAGGAGGGCATCAAGGAGCTGGGGTCC
CAAATCCTCAAGGAGCACCCAGTGGAGAACACCCAACTGCAGAACGAGAAGCTCTAC
CTGTACTACCTCCAGAACGGCAGGGATATGTACGTGGACCAAGAGCTGGATATCAAC
CGCCTCAGCGATTACGACGTCGATCATATCGTTCCCCAGTCTTTCCTGAAGGATGAC
TCCATCGACAACAAGGTCCTCACCAGGTCGGACAAGAACCGCGGCAAGTCAGATAAC
GTTCCATCTGAGGAGGTCGTTAAGAAGATGAAGAACTACTGGAGGCAGCTCCTGAAC
GCCAAGCTGATCACGCAAAGGAAGTTCGACAACCTCACCAAGGCTGAGAGAGGCGGG
CTCTCAGAGCTGGACAAGGCCGGCTTCATCAAGCGGCAGCTGGTCGAGACCAGACAA
ATCACGAAGCACGTTGCGCAAATCCTCGACTCTCGGATGAACACGAAGTACGATGAG
AACGACAAGCTGATCAGGGAGGTTAAGGTGATCACCCTGAAGTCTAAGCTCGTCTCC
GACTTCAGGAAGGATTTCCAGTTCTACAAGGTTCGCGAGATCAACAACTACCACCAT
GCCCATGACGCTTACCTCAACGCTGTGGTCGGCACCGCTCTGATCAAGAAGTACCCA
AAGCTGGAGTCCGAGTTCGTGTACGGGGACTACAAGGTTTACGATGTGCGCAAGATG
ATCGCCAAGTCGGAGCAAGAGATCGGCAAGGCTACCGCCAAGTACTTCTTCTACTCA
AACATCATGAACTTCTTCAAGACCGAGATCACGCTGGCCAACGGCGAGATCCGGAAG
AGACCGCTCATCGAGACCAACGGCGAGACGGGGGAGATCGTGTGGGACAAGGGCAGG
GATTTCGCGACCGTCCGCAAGGTTCTCTCCATGCCCCAGGTGAACATCGTCAAGAAG
ACCGAGGTCCAAACGGGCGGGTTCTCAAAGGAGTCTATCCTGCCTAAGCGGAACAGC
GACAAGCTCATCGCCAGAAAGAAGGACTGGGACCCAAAGAAGTACGGCGGGTTCGAC
AGCCCTACCGTGGCCTACTCGGTCCTGGTTGTGGCGAAGGTTGAGAAGGGCAAGTCC
AAGAAGCTCAAGAGCGTGAAGGAGCTCCTGGGGATCACCATCATGGAGAGGTCCAGC
TTCGAGAAGAACCCAATCGACTTCCTGGAGGCCAAGGGCTACAAGGAGGTGAAGAAG
GACCTGATCATCAAGCTCCCGAAGTACTCTCTCTTCGAGCTGGAGAACGGCAGGAAG
AGAATGCTGGCTTCCGCTGGCGAGCTCCAGAAGGGGAACGAGCTCGCGCTGCCAAGC
AAGTACGTGAACTTCCTCTACCTGGCTTCCCACTACGAGAAGCTCAAGGGCAGCCCG
GAGGACAACGAGCAAAAGCAGCTGTTCGTCGAGCAGCACAAGCATTACCTCGACGAG
ATCATCGAGCAAATCTCCGAGTTCAGCAAGCGCGTGATCCTCGCCGACGCGAACCTG
GATAAGGTCCTCTCCGCCTACAACAAGCACCGGGACAAGCCCATCAGAGAGCAAGCG
GAGAACATCATCCATCTCTTCACCCTGACGAACCTCGGCGCTCCTGCTGCTTTCAAG
TACTTCGACACCACGATCGATCGGAAGAGATACACCTCCACGAAGGAGGTCCTGGAC
GCGACCCTCATCCACCAGTCGATCACCGGCCTGTACGAGACGAGGATCGACCTCTCA
CAACTCGGCGGGGAT
aagagacccgcagcaaccaagaaggcagggcaagcaaagaag
aagaag
ACGCGTTCAGGCGGCTCCGGCGGCTCC
ACCAACCTGTCCGACATCATCGAG
AAGGAGACGGGCAAGCAACTCGTGATCCAGGAGAGCATCCTCATGCTGCCAGAGGAG
GTGGAGGAGGTCATCGGCAACAAGCCAGAGTCCGACATCCTGGTGCACACCGCCTAC
GACGAGTCCACCGACGAGAACGTCATGCTCCTGACCAGCGACGCCCCAGAGTACAAG
CCATGGGCCCTCGTCATCCAGGACAGCAACGGGGAGAACAAGATCAAGATGCTG
TCG
GGGACGCGTGACTCCGGCGGCAGC
ACCAACCTGTCCGACATCATCGAGAAGGAGACG
GGCAAGCAACTCGTGATCCAGGAGAGCATCCTCATGCTGCCAGAGGAGGTGGAGGAG
GTCATCGGCAACAAGCCAGAGTCCGACATCCTGGTGCACACCGCCTACGACGAGTCC
ACCGACGAGAACGTCATGCTCCTGACCAGCGACGCCCCAGAGTACAAGCCATGGGCC
CTCGTCATCCAGGACAGCAACGGGGAGAACAAGATCAAGATGCTG
tcgggggggagc
ccaaagaagaagcggaaggtg
TAG
>A3A-PBE-ΔUGI
ATGGAGGCCAGCCCGGCTAGCGGCCCAAGGCATCTCATGGACCCGCACATCTTCACCAGCAACTT
CAACAACGGCATCGGCAGGCACAAGACCTACTTGTGCTACGAGGTGGAGAGGCTCGACAACGGAA
CCTCCGTGAAGATGGACCAACACAGGGGGTTCCTCCACAACCAAGCCAAGAACCTCCTCTGCGGC
TTCTACGGCAGGCACGCCGAGTTGAGGTTCCTCGACTTGGTGCCATCCCTCCAACTCGATCCAGC
CCAAATCTACCGCGTGACCTGGTTCATCTCCTGGTCCCCATGCTTCTCCTGGGGTTGCGCCGGCG
AGGTTCGGGCTTTCCTCCAAGAAAACACCCACGTCCGCCTCCGCATTTTCGCCGCCAGGATCTAT
GATTACGACCCTCTCTACAAGGAGGCCCTCCAGATGCTGCGGGACGCCGGTGCTCAGGTGAGTAT
CATGACCTACGACGAGTTCAAGCACTGCTGGGACACCTTCGTTGACCACCAGGGCTGCCCATTCC
AACCATGGGACGGTCTGGATGAACACAGCCAAGCCTTGTCCGGCAGGCTCCGGGCCATCCTCCAA
AACCAGGGGAAC
TCCGGGAGCGAGACGCCAGGCACCTCCGAGTCGGCCACCCCAGAATCT
CTTAA
GGACAAGAAGTACTCGATCGGCCTCGCCATCGGGACGAACTCAGTTGGCTGGGCCGTGATCACCG
ACGAGTACAAGGTGCCCTCTAAGAAGTTCAAGGTCCTGGGGAACACCGACCGCCATTCCATCAAG
AAGAACCTCATCGGCGCTCTCCTGTTCGACAGCGGGGAGACCGCTGAGGCTACGAGGCTCAAGAG
AACCGCTAGGCGCCGGTACACGAGAAGGAAGAACAGGATCTGCTACCTCCAAGAGATTTTCTCCA
ACGAGATGGCCAAGGTTGACGATTCATTCTTCCACCGCCTGGAGGAGTCTTTCCTCGTGGAGGAG
GATAAGAAGCACGAGCGGCATCCCATCTTCGGCAACATCGTGGACGAGGTTGCCTACCACGAGAA
GTACCCTACGATCTACCATCTGCGGAAGAAGCTCGTGGACTCCACCGATAAGGCGGACCTCAGAC
TGATCTACCTCGCTCTGGCCCACATGATCAAGTTCCGCGGCCATTTCCTGATCGAGGGGGATCTC
AACCCAGACAACAGCGATGTTGACAAGCTGTTCATCCAACTCGTGCAGACCTACAACCAACTCTT
CGAGGAGAACCCGATCAACGCCTCTGGCGTGGACGCGAAGGCTATCCTGTCCGCGAGGCTCTCGA
AGTCCAGGAGGCTGGAGAACCTGATCGCTCAGCTCCCAGGCGAGAAGAAGAACGGCCTGTTCGGG
AACCTCATCGCTCTCAGCCTGGGGCTCACCCCGAACTTCAAGTCGAACTTCGATCTCGCTGAGGA
CGCCAAGCTGCAACTCTCCAAGGACACCTACGACGATGACCTCGATAACCTCCTGGCCCAGATCG
GCGATCAATACGCGGACCTGTTCCTCGCTGCCAAGAACCTGTCGGACGCCATCCTCCTGTCAGAT
ATCCTCCGCGTGAACACCGAGATCACGAAGGCTCCACTCTCTGCCTCCATGATCAAGCGCTACGA
CGAGCACCATCAGGATCTGACCCTCCTGAAGGCGCTGGTCCGCCAACAGCTCCCGGAGAAGTACA
AGGAGATTTTCTTCGATCAGTCGAAGAACGGCTACGCTGGGTACATCGACGGCGGGGCCTCACAA
GAGGAGTTCTACAAGTTCATCAAGCCAATCCTGGAGAAGATGGACGGCACGGAGGAGCTCCTGGT
GAAGCTCAACAGGGAGGACCTCCTGCGGAAGCAGAGAACCTTCGATAACGGCAGCATCCCCCACC
AAATCCATCTCGGGGAGCTGCACGCCATCCTGAGAAGGCAAGAGGACTTCTACCCTTTCCTCAAG
GATAACCGGGAGAAGATCGAGAAGATCCTGACCTTCAGAATCCCATACTACGTCGGCCCTCTCGC
GCGGGGGAACTCAAGATTCGCTTGGATGACCCGCAAGTCTGAGGAGACCATCACGCCGTGGAACT
TCGAGGAGGTGGTGGACAAGGGCGCTAGCGCTCAGTCGTTCATCGAGAGGATGACCAACTTCGAC
AAGAACCTGCCCAACGAGAAGGTGCTCCCTAAGCACTCGCTCCTGTACGAGTACTTCACCGTCTA
CAACGAGCTCACGAAGGTGAAGTACGTCACCGAGGGCATGCGCAAGCCAGCGTTCCTGTCCGGGG
AGCAGAAGAAGGCTATCGTGGACCTCCTGTTCAAGACCAACCGGAAGGTCACGGTTAAGCAACTC
AAGGAGGACTACTTCAAGAAGATCGAGTGCTTCGATTCGGTCGAGATCAGCGGCGTTGAGGACCG
CTTCAACGCCAGCCTCGGGACCTACCACGATCTCCTGAAGATCATCAAGGATAAGGACTTCCTGG
ACAACGAGGAGAACGAGGATATCCTGGAGGACATCGTGCTGACCCTCACGCTGTTCGAGGACAGG
GAGATGATCGAGGAGCGCCTGAAGACGTACGCCCATCTCTTCGATGACAAGGTCATGAAGCAACT
CAAGCGCCGGAGATACACCGGCTGGGGGAGGCTGTCCCGCAAGCTCATCAACGGCATCCGGGACA
AGCAGTCCGGGAAGACCATCCTCGACTTCCTCAAGAGCGATGGCTTCGCCAACAGGAACTTCATG
CAACTGATCCACGATGACAGCCTCACCTTCAAGGAGGATATCCAAAAGGCTCAAGTGAGCGGCCA
GGGGGACTCGCTGCACGAGCATATCGCGAACCTCGCTGGCTCCCCCGCGATCAAGAAGGGCATCC
TCCAGACCGTGAAGGTTGTGGACGAGCTCGTGAAGGTCATGGGCCGGCACAAGCCTGAGAACATC
GTCATCGAGATGGCCAGAGAGAACCAAACCACGCAGAAGGGGCAAAAGAACTCTAGGGAGCGCAT
GAAGCGCATCGAGGAGGGCATCAAGGAGCTGGGGTCCCAAATCCTCAAGGAGCACCCAGTGGAGA
ACACCCAACTGCAGAACGAGAAGCTCTACCTGTACTACCTCCAGAACGGCAGGGATATGTACGTG
GACCAAGAGCTGGATATCAACCGCCTCAGCGATTACGACGTCGATCATATCGTTCCCCAGTCTTT
CCTGAAGGATGACTCCATCGACAACAAGGTCCTCACCAGGTCGGACAAGAACCGCGGCAAGTCAG
ATAACGTTCCATCTGAGGAGGTCGTTAAGAAGATGAAGAACTACTGGAGGCAGCTCCTGAACGCC
AAGCTGATCACGCAAAGGAAGTTCGACAACCTCACCAAGGCTGAGAGAGGCGGGCTCTCAGAGCT
GGACAAGGCCGGCTTCATCAAGCGGCAGCTGGTCGAGACCAGACAAATCACGAAGCACGTTGCGC
AAATCCTCGACTCTCGGATGAACACGAAGTACGATGAGAACGACAAGCTGATCAGGGAGGTTAAG
GTGATCACCCTGAAGTCTAAGCTCGTCTCCGACTTCAGGAAGGATTTCCAGTTCTACAAGGTTCG
CGAGATCAACAACTACCACCATGCCCATGACGCTTACCTCAACGCTGTGGTCGGCACCGCTCTGA
TCAAGAAGTACCCAAAGCTGGAGTCCGAGTTCGTGTACGGGGACTACAAGGTTTACGATGTGCGC
AAGATGATCGCCAAGTCGGAGCAAGAGATCGGCAAGGCTACCGCCAAGTACTTCTTCTACTCAAA
CATCATGAACTTCTTCAAGACCGAGATCACGCTGGCCAACGGCGAGATCCGGAAGAGACCGCTCA
TCGAGACCAACGGCGAGACGGGGGAGATCGTGTGGGACAAGGGCAGGGATTTCGCGACCGTCCGC
AAGGTTCTCTCCATGCCCCAGGTGAACATCGTCAAGAAGACCGAGGTCCAAACGGGCGGGTTCTC
AAAGGAGTCTATCCTGCCTAAGCGGAACAGCGACAAGCTCATCGCCAGAAAGAAGGACTGGGACC
CAAAGAAGTACGGCGGGTTCGACAGCCCTACCGTGGCCTACTCGGTCCTGGTTGTGGCGAAGGTT
GAGAAGGGCAAGTCCAAGAAGCTCAAGAGCGTGAAGGAGCTCCTGGGGATCACCATCATGGAGAG
GTCCAGCTTCGAGAAGAACCCAATCGACTTCCTGGAGGCCAAGGGCTACAAGGAGGTGAAGAAGG
ACCTGATCATCAAGCTCCCGAAGTACTCTCTCTTCGAGCTGGAGAACGGCAGGAAGAGAATGCTG
GCTTCCGCTGGCGAGCTCCAGAAGGGGAACGAGCTCGCGCTGCCAAGCAAGTACGTGAACTTCCT
CTACCTGGCTTCCCACTACGAGAAGCTCAAGGGCAGCCCGGAGGACAACGAGCAAAAGCAGCTGT
TCGTCGAGCAGCACAAGCATTACCTCGACGAGATCATCGAGCAAATCTCCGAGTTCAGCAAGCGC
GTGATCCTCGCCGACGCGAACCTGGATAAGGTCCTCTCCGCCTACAACAAGCACCGGGACAAGCC
CATCAGAGAGCAAGCGGAGAACATCATCCATCTCTTCACCCTGACGAACCTCGGCGCTCCTGCTG
CTTTCAAGTACTTCGACACCACGATCGATCGGAAGAGATACACCTCCACGAAGGAGGTCCTGGAC
GCGACCCTCATCCACCAGTCGATCACCGGCCTGTACGAGACGAGGATCGACCTCTCACAACTCGG
CGGGGAT
aagagacccgcagcaaccaagaaggcagggcaagcaaagaagaagaag
TAG

smsaladi commented Oct 11, 2018

Text from the header of the document:

Supplementary sequences. Complete coding sequences of the A3A-PBE, A3A-Gam,
and A3A-PBE-ΔUGI fusion cistrons optimized in this study. The NLSs are written in
lower cases. The codon-optimized Gam, XTEN linker, APOBEC1/A3A, 32aa linker,
9aa linker and UGI are highlighted in brown, green, blue, purple, gray and red,
respectively. The codon-optimized nCas9 (D10A) is shown in bold. 

Not sure why you'd take screenshots and then paste them into a PDF, but... here are the sequences after using Adobe Acrobat X's OCR functionality. Line breaks are inserted where the color changes in the PDF.
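
Since the header says the NLSs are written in lowercase, they can be pulled out of each record programmatically. The following is a minimal sketch, not something used to prepare this gist: the function names `read_fasta` and `lowercase_runs` are made up for illustration, and the result is only as reliable as the OCR's handling of letter case.

```python
import re


def read_fasta(text):
    """Parse FASTA text into {record_name: sequence}, joining wrapped lines."""
    records = {}
    name = None
    for line in text.splitlines():
        line = line.strip()
        if line.startswith(">"):
            name = line[1:]
            records[name] = []
        elif line and name is not None:
            records[name].append(line)
    return {n: "".join(parts) for n, parts in records.items()}


def lowercase_runs(seq):
    """Return (start_index, run) for every stretch of lowercase letters."""
    return [(m.start(), m.group()) for m in re.finditer(r"[a-z]+", seq)]


# Tiny made-up record; the real sequences are in the gist file above.
example = ">demo\nATGGCC\naagaag\naag\nTAG\n"
seqs = read_fasta(example)
print(lowercase_runs(seqs["demo"]))  # [(6, 'aagaagaag')]
```

Joining the lines first matters because a lowercase stretch can be split across two lines in the file.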

Details:

  1. Downloaded Supplementary Sequences file
  2. Saved each image in the pdf as a jpg
  3. Compiled them together into a new PDF document using Adobe Acrobat XI
  4. Used Acrobat's Text Recognition feature (English (US), ClearScan, 300 dpi)
  5. Copied and pasted the text into Microsoft Word (Ctrl+A, Ctrl+C, Ctrl+V) and set it in a fixed-width font
  6. Spot-checked that lines are a consistent length, corrected glaring errors (i.e. non-ACGT characters), inserted > at headings, and inserted \n where the font color changes
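
The non-ACGT part of the spot check in step 6 could be scripted along these lines. This is a rough sketch under my own assumptions, not what was actually run: the function name `find_suspect_lines` is hypothetical, and the allowed alphabet is taken to be A, C, G and T in either case.

```python
def find_suspect_lines(fasta_text, allowed="ACGTacgt"):
    """Return (line_number, record_name, bad_chars) for each sequence line
    that contains characters outside the allowed alphabet."""
    suspects = []
    record = None
    for lineno, line in enumerate(fasta_text.splitlines(), start=1):
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            record = line[1:]  # remember which record we are in
            continue
        bad = sorted(set(line) - set(allowed))
        if bad:
            suspects.append((lineno, record, "".join(bad)))
    return suspects


# Made-up input with one OCR-style error (an S where a base should be).
example = """>demo
ACGTACGT
ACGSACGT
aagaag
"""
print(find_suspect_lines(example))  # [(3, 'demo', 'S')]
```

A flagged line still needs a manual look against the original image; the script only says where to look.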

caveat emptor

Associated paper:
Efficient C-to-T base editing in plants using a fusion of nCas9 and human APOBEC3A
Yuan Zong, Qianna Song, Chao Li, Shuai Jin, Dingbo Zhang, Yanpeng Wang, Jin-Long Qiu & Caixia Gao
Nat Biotechnol. 2018 Oct 1. doi: 10.1038/nbt.4261
