Skip to content

Instantly share code, notes, and snippets.

@jamessdixon
Created December 19, 2020 21:01
Show Gist options
  • Star 0 You must be signed in to star a gist
  • Fork 0 You must be signed in to fork a gist
  • Save jamessdixon/7b6cb74e7f318af370fb99d46dd8f0d0 to your computer and use it in GitHub Desktop.
Save jamessdixon/7b6cb74e7f318af370fb99d46dd8f0d0 to your computer and use it in GitHub Desktop.
Looking at the COVID Genome Using F#
open System
open System.IO
let path = "/Users/jamesdixon/Downloads/sequence.fasta"
let file = File.ReadAllText(path)
let totalLength = file.Length
let prefixLength = 97
let suffixLength = 2
let stringSequence = file.Substring(prefixLength,totalLength-prefixLength-suffixLength)
type Base = A | C | G | T
let baseFromString (s:String) =
match s.ToLowerInvariant() with
| "a" -> Some A
| "c" -> Some C
| "t" -> Some T
| "g" -> Some G
| _ -> None
let arraySequence = stringSequence.ToCharArray()
let bases =
arraySequence
|> Seq.map(fun c -> c.ToString())
|> Seq.map(fun s -> baseFromString s)
|> Seq.filter(fun b -> b.IsSome)
|> Seq.map(fun b -> b.Value)
let spikeLength = abs(21563-25384)
let spike = bases |> Seq.skip 21561 |> Seq.take (spikeLength)
spike
let orf1abLength = abs(266-21555)
let orf1ab = bases |> Seq.skip 263 |> Seq.take (spikeLength)
orf1ab
@muehlhaus
Copy link

Hi James, very nice post.
I took the liberty to converted your example using BioFSharp. BioFSharp aims to be a user-friendly functional library for bioinformatics written in F#.
I hope that is interesting for you. We are always happy to find other computational biologist using F#. Maybe you want to get in touch?
All the best,
Timo

`
#r "nuget: BioFSharp, Version=2.0.0-beta4"
#r "nuget: BioFSharp.IO, Version=2.0.0-beta4"

open BioFSharp
open BioFSharp.IO

//reads from file to a sequence of FastaItems and takes the first element
let genome =
SOURCE_DIRECTORY + "/sequence.fasta"
|> FastA.fromFile BioArray.ofNucleotideString
|> Seq.head

let spikeLength' = abs(21563-25384)
let spike' =
genome.Sequence
|> Seq.skip 21561
|> Seq.take spikeLength'
spike'

let orf1abLength' = abs(266-21555)
let orf1ab' =
bases
|> Seq.skip 263
|> Seq.take spikeLength'
orf1ab'

// Example translation

spike'
// Convert to mRNA (from coding strand)
|> BioSeq.transcribeCodingStrand
// Here you can vary the offset (to frameshift)
|> BioSeq.translate 1
|> Seq.toArray

`

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment