@gradha
Last active December 17, 2015 05:09
Initial Nimrod implementation of the file reading/scanning problem at http://saml.rilspace.org/calculating-gc-content-in-python-and-d-how-to-get-10x-speedup-in-d
import strutils

proc process(filename: string) =
  var
    input: TFile
    lineBuf = newString(100)
    gcCount = 0
    totalBaseCount = 0

  input = open(filename)
  finally: input.close()
  while input.readLine(lineBuf):
    if lineBuf[0] != '>':
      for letter in lineBuf:
        case letter
        of 'A':
          totalBaseCount += 1
        of 'C':
          gcCount += 1
          totalBaseCount += 1
        of 'G':
          gcCount += 1
          totalBaseCount += 1
        of 'T':
          totalBaseCount += 1
        else:
          nil

  let gcFraction = gcCount / totalBaseCount
  echo formatFloat(gcFraction * 100, ffDecimal, 4)

when isMainModule:
  process("Homo_sapiens.GRCh37.67.dna_rm.chromosome.Y.fa")
gradha commented May 10, 2013

For comparison, the best of three timing measurements on my machine were:

$ time python gc_test.py 
37.6217301394

real    0m11.097s
user    0m9.934s
sys     0m0.333s

$ time ./nimspeed 
37.6217

real    0m0.931s
user    0m0.823s
sys     0m0.052s


$ time ./gc_cpp 
37.6217

real    0m0.697s
user    0m0.586s
sys 0m0.085s
