Last active
December 19, 2015 03:49
iPython Notebook describing basic use of the Python PCR Design Tools
{ | |
"metadata": { | |
"name": "Designing CAPS Markers" | |
}, | |
"nbformat": 2, | |
"worksheets": [ | |
{ | |
"cells": [ | |
{ | |
"cell_type": "markdown", | |
"source": [ | |
"# CAPS Marker Design", | |
"", | |
"Using the Python tools at https://github.com/cfljam/galaxy-pcr-markers", | |
"", | |
"CAPS (cleaved amplified polymorphic sequences) are simple and robust genetic markers. We can screen for SNPs that create or abolish restriction enzyme recognition sites by passing find_CAPS.py a multi-FASTA file of reference sequences and a GFF3 file describing the variants." | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"collapsed": false, | |
"input": [ | |
"!python find_CAPS.py test_data/targets.fasta test_data/targets.gff > CAPS.out" | |
], | |
"language": "python", | |
"outputs": [], | |
"prompt_number": 16 | |
}, | |
{ | |
"cell_type": "markdown", | |
"source": [ | |
"Now that we have some hits, we can inspect them before designing primers." | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"collapsed": false, | |
"input": [ | |
"!head CAPS.out" | |
], | |
"language": "python", | |
"outputs": [ | |
{ | |
"output_type": "stream", | |
"stream": "stdout", | |
"text": [ | |
"JR844712\t686\t688\tJR844712:SAMTOOLS:SNP:686\tHaeIII\treference", | |
"JR844712\t759\t761\tJR844712:SAMTOOLS:SNP:759\tDdeI\treference", | |
"JR845107\t546\t548\tJR845107:SAMTOOLS:SNP:546\tRsaI\treference", | |
"JR845347\t636\t638\tJR845347:SAMTOOLS:SNP:636\tAluI\tvariant", | |
"JR845763\t63\t65\tJR845763:SAMTOOLS:SNP:63\tHinfI\treference", | |
"JR848350\t517\t519\tJR848350:SAMTOOLS:SNP:517\tAluI\tvariant", | |
"JR848350\t517\t519\tJR848350:SAMTOOLS:SNP:517\tDdeI\tvariant", | |
"JR848350\t642\t644\tJR848350:SAMTOOLS:SNP:642\tAluI\tvariant", | |
"JR848350\t642\t644\tJR848350:SAMTOOLS:SNP:642\tPvuII\tvariant", | |
"JR848350\t762\t764\tJR848350:SAMTOOLS:SNP:762\tAluI\tvariant" | |
] | |
} | |
], | |
"prompt_number": 17 | |
}, | |
{ | |
"cell_type": "markdown", | |
"source": [ | |
"In this case we will design assays only for TaqI polymorphisms." | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"collapsed": false, | |
"input": [ | |
"!awk '/TaqI/ {print $4}' CAPS.out > Taq1CAPS.out", | |
"!head Taq1CAPS.out" | |
], | |
"language": "python", | |
"outputs": [ | |
{ | |
"output_type": "stream", | |
"stream": "stdout", | |
"text": [ | |
"k41_520516:SAMTOOLS:SNP:1229", | |
"k49_198840:SAMTOOLS:SNP:223", | |
"k49_198840:SAMTOOLS:SNP:634", | |
"k65_175750:SAMTOOLS:SNP:1216", | |
"k65_175750:SAMTOOLS:SNP:1243", | |
"k65_175750:SAMTOOLS:SNP:1813", | |
"k65_175750:SAMTOOLS:SNP:2257", | |
"k69_176262:SAMTOOLS:SNP:1401", | |
"k69_177402:SAMTOOLS:SNP:139", | |
"k69_244399:SAMTOOLS:SNP:546" | |
] | |
} | |
], | |
"prompt_number": 24 | |
}, | |
{ | |
"cell_type": "markdown", | |
"source": [ | |
"Now we can pass these targets to the design tool, specifying a product size range of 80-150 bp (-p 80 -P 150) and requesting only the best primer set (-n 1)." | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"collapsed": false, | |
"input": [ | |
"%run design_HRM_primers.py -i test_data/targets.fasta -g test_data/targets.gff -T Taq1CAPS.out -p 80 -P 150 -n 1" | |
], | |
"language": "python", | |
"outputs": [ | |
{ | |
"output_type": "stream", | |
"stream": "stdout", | |
"text": [ | |
"SNP_Target_ID Position Ref_base Variant_base PRIMER_LEFT_SEQUENCE PRIMER_RIGHT_SEQUENCE ref_melt_Tm var_melt_Tm Tm_difference", | |
"k41_520516:SAMTOOLS:SNP:1229 149 C T GATTCATCACTCTCCTCGTTG TGCGGATATTGATGTTGATG 0 0 0", | |
"k49_198840:SAMTOOLS:SNP:223" | |
] | |
}, | |
{ | |
"output_type": "stream", | |
"stream": "stdout", | |
"text": [ | |
" 88 A G CAGTAGTGGTAGCCAAGCCC GATGGTTAATGCTTGGGAGG 0 0 0", | |
"k49_198840:SAMTOOLS:SNP:634 128 C G GATGTCGGAGAGGAGAGAGG TGAGAACCCAAACCCTAACC 0 0 0", | |
"k65_175750:SAMTOOLS:SNP:1216 116 T C GGTCCACTTCATTGAAAGGC GCGACAACAAAGAACAGTGG 0 0 0", | |
"k65_175750:SAMTOOLS:SNP:1243" | |
] | |
}, | |
{ | |
"output_type": "stream", | |
"stream": "stdout", | |
"text": [ | |
" 116 C T GGTCCACTTCATTGAAAGGC GCGACAACAAAGAACAGTGG 0 0 0", | |
"k65_175750:SAMTOOLS:SNP:1813 109 C G GGCTCAACTTGGATTGTGTG TGCCTTGTGCAAGTAACTCC 0 0 0", | |
"k65_175750:SAMTOOLS:SNP:2257 79 C T GTCCCTAGACACCTGGAAGC GGGCTTCTTCTTTCAGCTTG 0 0 0", | |
"k69_176262:SAMTOOLS:SNP:1401" | |
] | |
}, | |
{ | |
"output_type": "stream", | |
"stream": "stdout", | |
"text": [ | |
" 86 T C CCCTCACTGCTGAAATTACG GAGCGGGATCGGTTTAATAG 0 0 0", | |
"k69_177402:SAMTOOLS:SNP:139" | |
] | |
}, | |
{ | |
"output_type": "stream", | |
"stream": "stdout", | |
"text": [ | |
" 138 T G GGCATGAAAGTGGCTAGTTG CAGGAAATCTCATGTTTGTCG 0 0 0", | |
"k69_244399:SAMTOOLS:SNP:546 149 T A GTTGGACGAACACAAAGCTG CATTCTTGCATTCTCTGCATC 0 0 0", | |
"k69_323478:SAMTOOLS:SNP:869" | |
] | |
}, | |
{ | |
"output_type": "stream", | |
"stream": "stdout", | |
"text": [ | |
" 136 C A TGCACCAAGAACTTCAATCG TGATGAAGTTGCATTACGGG 0 0 0", | |
"k69_324482:SAMTOOLS:SNP:1194 137 A G AGAAATGGGTGGGTTCATTC ACCATTCATGGATCACTTCG 0 0 0", | |
"k69_324482:SAMTOOLS:SNP:2172" | |
] | |
}, | |
{ | |
"output_type": "stream", | |
"stream": "stdout", | |
"text": [ | |
" 142 C T ACATCAGCAAGGAGAACACG CTATCCATGTGGCGGTGTAG 0 0 0", | |
"k69_93535:SAMTOOLS:SNP:1147 137 C G GGACAGGGAAGCTTCATAGG TGTTTGGTATCGTTTCACCC 0 0 0" | |
] | |
} | |
], | |
"prompt_number": 29 | |
} | |
] | |
} | |
] | |
} |
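
The enzyme filtering step in the notebook uses `awk '/TaqI/ {print $4}'`, which matches the pattern anywhere on the line. A minimal Python sketch of the same step, comparing the enzyme column exactly, might look like the following. It assumes the tab-separated layout shown in the `head CAPS.out` output (SNP ID in column 4, enzyme name in column 5); `select_targets` is a hypothetical helper, not part of galaxy-pcr-markers.

```python
import csv
import io


def select_targets(lines, enzyme):
    """Return SNP IDs (column 4) for rows whose enzyme (column 5) equals `enzyme`.

    `lines` is any iterable of tab-separated text lines, such as an open
    CAPS.out file handle. The column layout is assumed from the notebook output.
    """
    reader = csv.reader(lines, delimiter="\t")
    # Skip short or blank rows rather than raising an IndexError.
    return [row[3] for row in reader if len(row) >= 5 and row[4] == enzyme]


# Three rows copied from the CAPS.out output shown in the notebook.
sample = io.StringIO(
    "JR844712\t686\t688\tJR844712:SAMTOOLS:SNP:686\tHaeIII\treference\n"
    "JR845347\t636\t638\tJR845347:SAMTOOLS:SNP:636\tAluI\tvariant\n"
    "JR848350\t517\t519\tJR848350:SAMTOOLS:SNP:517\tAluI\tvariant\n"
)
alu_targets = select_targets(sample, "AluI")
print(alu_targets)
```

To reproduce the notebook step, pass an open `CAPS.out` handle with `"TaqI"` as the enzyme and write the returned IDs, one per line, to `Taq1CAPS.out`. Matching on the column value avoids accidental hits if an enzyme name ever appears inside a sequence or SNP identifier.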