@fomightez
Last active December 20, 2015 15:19
dnacalc.py from Practical Computing for Biologists by Steven H. D. Haddock and Casey W. Dunn. Posted as a Gist by Wayne Decatur (fomightez) with full credit and reference to the original authors, and with a note on where they freely share the code online. You can see a static IPython Notebook version at http://nbviewer.ipython.org/6180942
#! /usr/bin/env python
# (the shebang line above only takes effect when it is the first line of the file)
#
# code by Steven H. D. Haddock and Casey W. Dunn as described in:
# Practical Computing for Biologists
# Steven H. D. Haddock and Casey W. Dunn
# Published in 2011 by Sinauer Associates.
# ISBN 978-0-87893-391-4
# http://www.sinauer.com/practical-computing-for-biologists.html
# see practicalcomputing.org
#
# scripts freely available by the original authors at practicalcomputing.org
# DIRECT LINK: http://practicalcomputing.org/files/pcfb_examples.zip
#
# This program takes a DNA sequence (without checking)
# and shows its length and the nucleotide composition
# This program is described in Chapter 8 of PCfB
DNASeq = "ATGTCTCATTCAAAGCA"
# gather user input for sequence
# this overrides the definition of DNASeq above
DNASeq = raw_input("Enter a sequence: ")
DNASeq = DNASeq.upper() # convert to uppercase for .count() function
DNASeq = DNASeq.replace(" ","") # remove spaces
print 'Sequence:', DNASeq
# below are nested functions: first find the length, then make it float
SeqLength = float(len(DNASeq))
print "Sequence Length:", SeqLength
NumberA = DNASeq.count('A')
NumberC = DNASeq.count('C')
NumberG = DNASeq.count('G')
NumberT = DNASeq.count('T')
# Old way to output the Numbers
# print "A:", NumberA/SeqLength
# print "C:", NumberC/SeqLength
# print "G:", NumberG/SeqLength
# print "T:", NumberT/SeqLength
# Calculate percentage and output to 1 decimal
print "A: %.1f" % (100 * NumberA / SeqLength)
print "C: %.1f" % (100 * NumberC / SeqLength)
print "G: %.1f" % (100 * NumberG / SeqLength)
print "T: %.1f" % (100 * NumberT / SeqLength)
#--------------------------------------------------------#
#****ALL BELOW ADDED BY WAYNE DECATUR TO HELP *********#
#****WITH UNDERSTANDING THIS SCRIPT AND EXPLORING******#
#****FURTHER USING THE INTERACTIVE CONSOLE. ********#
#--------------------------------------------------------#
print "\n\n\n\n---------------------------------------------------------------------------"
print "Following added by Wayne Decatur beyond Haddock and Dunn code, as a starting point"
print "for understanding the script in this gist."
print "---------------------------------------------------------------------------"
print "The dnacalc1 script calculates percent of the four bases (A,C,G,&T) in a DNA sequence."
import pprint #this is simply to allow use of the pprint command mentioned below
print "\n\nSee the raw code by clicking the blue numbers after 'Gist ID and link' in the side panel."
print "Tip: Opening the code in a separate browser window can make it easier to follow along."
print "See a static IPython Notebook of the input and output at http://nbviewer.ipython.org/6180942"
print "\nHelp getting started exploring the code in the IPython interactive shell:"
print " You can get a list of the variables and their values by typing 'pprint.pprint(locals())' at the prompt."
print " Type the name of a variable or object followed by a '?' to get information."
print "In all cases you simply type what is between the single quotes, not including the single quotes."
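The script above computes the composition inline and, as its own comment says, does not check its input, so an empty entry would divide by zero. As a starting point for further exploration, here is a minimal sketch of the same calculation wrapped in a function with that one check added. The function name `base_composition` and the choice to raise `ValueError` are assumptions made for illustration; they are not part of the Haddock and Dunn code. The sketch is written to run under both Python 2.7 and Python 3.

```python
from collections import Counter


def base_composition(seq):
    """Return (length, percents) for a DNA string.

    percents maps each of 'A', 'C', 'G', 'T' to its percentage of the
    cleaned sequence. Hypothetical helper, not from the original script.
    """
    # Same clean-up as the script: uppercase, then remove spaces
    cleaned = seq.upper().replace(" ", "")
    length = len(cleaned)
    if length == 0:
        # The original script would fail here with a ZeroDivisionError
        raise ValueError("empty sequence")
    counts = Counter(cleaned)  # missing bases count as 0
    # 100.0 forces float division under Python 2 as well
    percents = {base: 100.0 * counts[base] / length for base in "ACGT"}
    return length, percents


# Usage with the default sequence from the script
length, percents = base_composition("ATGTCTCATTCAAAGCA")
print("Sequence Length: %d" % length)
for base in "ACGT":
    print("%s: %.1f" % (base, percents[base]))
```

Returning the values instead of printing them inside the function makes it easy to inspect the result in the interactive shell, for example with `pprint.pprint(percents)`.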