
@fomightez
Last active December 20, 2015 07:59
compositioncalc2.py from Practical Computing for Biologists by Steven H. D. Haddock and Casey W. Dunn. Posted as a Gist by Wayne Decatur (fomightez) with full credit and reference to the original authors and specifying where they freely share the code online. You can see a static IPython Notebook version at http://nbviewer.ipython.org/6102154
#! /usr/bin/env python
# code by Steven H. D. Haddock and Casey W. Dunn as described in:
# Practical Computing for Biologists
# Steven H. D. Haddock and Casey W. Dunn
# Published in 2011 by Sinauer Associates.
# ISBN 978-0-87893-391-4
# http://www.sinauer.com/practical-computing-for-biologists.html
# see practicalcomputing.org
#
# scripts freely available from the original authors at practicalcomputing.org
# DIRECT LINK: http://practicalcomputing.org/files/pcfb_examples.zip
#
DNASeq = "ATGTCTCATTCAAAGCA"
SeqLength = float(len(DNASeq))
BaseList = list(set(DNASeq))
for Base in BaseList:
    Percent = 100 * DNASeq.count(Base) / SeqLength
    print "%s: %4.1f" % (Base,Percent)
#--------------------------------------------------------#
#****ALL BELOW ADDED BY WAYNE DECATUR TO HELP *********#
#****WITH UNDERSTANDING THIS SCRIPT AND EXPLORING******#
#****FURTHER USING THE INTERACTIVE CONSOLE. ********#
#--------------------------------------------------------#
print "\n\n\n\n---------------------------------------------------------------------------"
print "Following added by Wayne Decatur beyond Haddock and Dunn code, as a starting point"
print "for understanding the script in this gist."
print "---------------------------------------------------------------------------"
print "The compositioncalc2 script calculates the percent of each base or amino acid in a "
print "DNA or protein sequence, handling nonstandard bases or amino acids. (Actually "
print "it will calculate the percent of any character in a string of characters.)"
print "This improvement over the compositioncalc1 script is made possible by "
print "defining 'BaseList = list(set(DNASeq))'"
print "\nHere, the DNA sequence (DNASeq) = %s" % DNASeq
for Base in BaseList:
    Percent = 100 * DNASeq.count(Base) / SeqLength
    print "%s: %4.1f%%" % (Base,Percent)
#-----defining the calculator as a function to make interaction via the interactive gist console easier
def calc(MySeq):
    SeqLength = float(len(MySeq))
    ComponentList = list(set(MySeq))
    for Component in ComponentList:
        Percent = 100 * MySeq.count(Component) / SeqLength
        print "%s: %4.1f%%" % (Component,Percent)
# initialize MySeq with the same sequence so that 'calc(MySeq)' works for testing purposes without necessarily needing to define MySeq first
MySeq = DNASeq
import pprint #this is simply to allow use of the pprint command mentioned below
print "\n\nSee the raw code by clicking the blue numbers after 'Gist ID and link' "
print " in the side panel. Tip: Opening the code in a separate browser window can "
print "make it easier to follow along. See a static IPython Notebook of the input and "
print "output at http://nbviewer.ipython.org/6102154"
print "\n\nTo calculate composition for your own sequence (DNA or protein):"
print " Set the MySeq variable by typing 'MySeq=\"YOUR SEQUENCE HERE\"', where "
print " you type your sequence in place of YOUR SEQUENCE HERE, and hit return. Then type 'calc(MySeq)'"
print " and hit return."
print " You can also skip defining MySeq and type 'calc(\"YOUR SEQUENCE HERE\")' and hit return."
print "\nHelp getting started exploring the code in the IPython interactive shell:"
print " You can get a list of the variables and their values by typing 'pprint.pprint(locals())' at the prompt."
print " Type the name of a variable or object followed by a '?' to get information."
print "In all cases, type the double quotes as shown, but only what is between the single quotes."
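
The script above is written for Python 2 (it uses `print` statements). For readers on Python 3, the following is a minimal sketch of the same calculation and is not part of the Haddock and Dunn code. It assumes `collections.Counter` as a stand-in for the `set` plus `count` approach, and the names `composition` and `print_composition` are hypothetical, chosen only for illustration.

```python
from collections import Counter


def composition(seq):
    """Return {character: percent of seq} for each distinct character in seq.

    Hypothetical helper: not from the original script.
    """
    if not seq:
        # An empty sequence has no characters to report.
        return {}
    counts = Counter(seq)  # tallies every character in a single pass
    total = len(seq)
    # In Python 3, / is true division, so no float() conversion is needed.
    return {char: 100 * n / total for char, n in counts.items()}


def print_composition(seq):
    """Print one line per character, in sorted order, in the script's format."""
    for char, percent in sorted(composition(seq).items()):
        print("%s: %4.1f%%" % (char, percent))


print_composition("ATGTCTCATTCAAAGCA")
```

Sorting the characters before printing gives the same order on every run, whereas iterating over a `set`, as the original does, gives no guaranteed order. Returning a dictionary rather than printing directly also lets the percentages be reused in later calculations.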