Skip to content

Instantly share code, notes, and snippets.

@jamessdixon
Created Dec 19, 2020
Embed
What would you like to do?
Looking at the COVID Genome Using F#
open System
open System.IO
let path = "/Users/jamesdixon/Downloads/sequence.fasta"
let file = File.ReadAllText(path)
let totalLength = file.Length
let prefixLength = 97
let suffixLength = 2
let stringSequence = file.Substring(prefixLength,totalLength-prefixLength-suffixLength)
type Base = A | C | G | T
let baseFromString (s:String) =
match s.ToLowerInvariant() with
| "a" -> Some A
| "c" -> Some C
| "t" -> Some T
| "g" -> Some G
| _ -> None
let arraySequence = stringSequence.ToCharArray()
let bases =
arraySequence
|> Seq.map(fun c -> c.ToString())
|> Seq.map(fun s -> baseFromString s)
|> Seq.filter(fun b -> b.IsSome)
|> Seq.map(fun b -> b.Value)
let spikeLength = abs(21563-25384)
let spike = bases |> Seq.skip 21561 |> Seq.take (spikeLength)
spike
let orf1abLength = abs(266-21555)
let orf1ab = bases |> Seq.skip 263 |> Seq.take (spikeLength)
orf1ab
@muehlhaus

This comment has been minimized.

Copy link

@muehlhaus muehlhaus commented Dec 29, 2020

Hi James, very nice post.
I took the liberty to converted your example using BioFSharp. BioFSharp aims to be a user-friendly functional library for bioinformatics written in F#.
I hope that is interesting for you. We are always happy to find other computational biologist using F#. Maybe you want to get in touch?
All the best,
Timo

`
#r "nuget: BioFSharp, Version=2.0.0-beta4"
#r "nuget: BioFSharp.IO, Version=2.0.0-beta4"

open BioFSharp
open BioFSharp.IO

//reads from file to a sequence of FastaItems and takes the first element
let genome =
SOURCE_DIRECTORY + "/sequence.fasta"
|> FastA.fromFile BioArray.ofNucleotideString
|> Seq.head

let spikeLength' = abs(21563-25384)
let spike' =
genome.Sequence
|> Seq.skip 21561
|> Seq.take spikeLength'
spike'

let orf1abLength' = abs(266-21555)
let orf1ab' =
bases
|> Seq.skip 263
|> Seq.take spikeLength'
orf1ab'

// Example translation

spike'
// Convert to mRNA (from coding strand)
|> BioSeq.transcribeCodingStrand
// Here you can vary the offset (to frameshift)
|> BioSeq.translate 1
|> Seq.toArray

`

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment