@trevorsibanda
Last active September 28, 2016 07:59
An encoder (and a separate decoder) that takes a string as input and outputs it encoded in the style of a strand of DNA. Can be improved.
//Simple DNA encoder
//http://codegolf.stackexchange.com/questions/91645/dna-encode-a-string/91850
import java.io.{ByteArrayOutputStream, ByteArrayInputStream}
import java.util.Base64
import java.util.zip.{GZIPOutputStream, GZIPInputStream}
import scala.util.Try
object Nucleotide{
  // Base type for the four DNA nucleotides.
  abstract class Nucleotide(val symbol: String){
    override def toString = symbol.toUpperCase
  }
  // Case classes need an explicit (empty) parameter list.
  case class Adenine() extends Nucleotide("A")
  case class Thymine() extends Nucleotide("T")
  case class Cytosine() extends Nucleotide("C")
  case class Guanine() extends Nucleotide("G")
  def A = new Adenine()
  def T = new Thymine()
  def C = new Cytosine()
  def G = new Guanine()
  // A base pair; prints as its two symbols, e.g. "AT".
  case class Pair( _1: Nucleotide, _2: Nucleotide ){
    override def toString = _1.toString + _2.toString
  }
  object BasePairs{
    def AT = new Pair(A,T)
    def TA = new Pair(T,A)
    def CG = new Pair(C,G)
    def GC = new Pair(G,C)
  }
}
import Nucleotide._
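class DNA( val l: Seq[Nucleotide] ){
  // Candidate encodings for 512..1024. Only 512..639 yield exactly 8 pairs,
  // so the table ends up with 128 entries (indices 0..127, i.e. 7-bit ASCII).
  val mapping = (math.pow(2,9).toInt to math.pow(2,10).toInt).map{
    x => buildEncoding(x)
  }.filter{
    _.length == 8
  }.toSet.take(255).zipWithIndex
  println(mapping) // debug output: dumps the whole table on construction
  // Builds a sequence of base pairs by repeatedly halving x.
  def buildEncoding( x: Int ): Seq[Pair] = {
    import Nucleotide.BasePairs._
    def iter(i: Int, m: Int, s: Seq[Pair]): Seq[Pair] = {
      if( i > 0 && i <= 4 )
        s ++ Seq( Seq(AT,TA,CG,GC)(i-1) )
      else
        iter(i/2, i%4, s ++ Seq( Seq(AT,TA,CG,GC)(m) ) )
    }
    iter(x, x%4, Seq() )
  }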
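  // Encodes each character as the 8-pair sequence whose table index equals its
  // byte value. Characters outside 7-bit ASCII have no entry, so .head throws.
  def encode( s: String ): Seq[Pair] = s.map{
    c => mapping.collect{
      case (a,index) if index == c.toByte => a
    }.head
  }.flatten
  // Decodes 8 pairs at a time. Groups with no table entry are silently dropped.
  def decode( l: Seq[Pair] ): String = l.grouped(8).toList.map{
    l2 => mapping.collect{
      case (a,index) if a == l2 => index.toChar
    }
  }.flatten.mkString
  def decode( s: String ): String = decode( textToPairs(s) )
  // Parses a string of A/T/C/G letters (case-insensitive) into base pairs.
  def textToPairs(s: String): Seq[Pair] = {
    def letterToNucleotide(c: Char) = c match{
      case 'A' => A
      case 'T' => T
      case 'C' => C
      case 'G' => G
      case other => throw new IllegalArgumentException(s"Not a nucleotide: $other")
    }
    // One character is 8 pairs, i.e. 16 letters.
    assert( s.length%16 == 0)
    s.toUpperCase.grouped(2).toList.map{
      str => new Pair(letterToNucleotide( str.charAt(0) ), letterToNucleotide( str.charAt(1) ) )
    }
  }
}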
object DNAEncoderDecoder{
  // Explicit ": Unit =" replaces the deprecated procedure syntax.
  def main(args: Array[String] ): Unit = {
val test = """I have a friend who's an artist and has sometimes taken a view which I don't agree with very well. He'll hold up a flower and say "look how beautiful it is," and I'll agree. Then he says "I as an artist can see how beautiful this is but you as a scientist take this all apart and it becomes a dull thing," and I think that he's kind of nutty. First of all, the beauty that he sees is available to other people and to me too, I believe. Although I may not be quite as refined aesthetically as he is ... I can appreciate the beauty of a flower. At the same time, I see much more about the flower than he sees. I could imagine the cells in there, the complicated actions inside, which also have a beauty. I mean it's not just beauty at this dimension, at one centimeter; there's also beauty at smaller dimensions, the inner structure, also the processes. The fact that the colors in the flower evolved in order to attract insects to pollinate it is interesting; it means that insects can see the color. It adds a question: does this aesthetic sense also exist in the lower forms? Why is it aesthetic? All kinds of interesting questions which the science knowledge only adds to the excitement, the mystery and the awe of a flower. It only adds. I don't understand how it subtracts."""
    val d = new DNA( Seq() )
    val encoded1 = d.encode( test )
    val decoded1 = d.decode( encoded1 )
    println( s"Encoded:\t${encoded1.mkString}\nDecoded:\t${decoded1}" )
    println( s"Encoded:\t${encoded1.mkString.length}\nDecoded:\t${decoded1.length}" )
    println( decoded1 == test )
  }
}
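
The description says the encoder can be improved. One possible direction, sketched here only as an illustration, is a denser scheme that spends 2 bits per nucleotide, so each byte becomes 4 letters instead of the 16 produced above. The `CompactDna` object and its bit-to-letter table (00 to A, 01 to C, 10 to G, 11 to T) are assumptions made for this sketch and are not part of the gist.

```scala
// Hypothetical compact codec: 2 bits per nucleotide, 4 nucleotides per byte.
// The bit-to-letter table is an assumption, not taken from the gist.
object CompactDna {
  // Index 0..3 corresponds to the bit pairs 00, 01, 10, 11.
  private val Alphabet = "ACGT"

  def encode(s: String): String =
    s.getBytes("UTF-8").flatMap { b =>
      val u = b & 0xff // treat the byte as unsigned
      // Emit the four 2-bit groups, most significant first.
      Seq(6, 4, 2, 0).map(shift => Alphabet((u >> shift) & 3))
    }.mkString

  def decode(dna: String): String = {
    require(dna.length % 4 == 0, "length must be a multiple of 4")
    val bytes = dna.toUpperCase.grouped(4).map { group =>
      // Rebuild one byte from four letters, 2 bits at a time.
      val value = group.foldLeft(0) { (acc, c) =>
        val idx = Alphabet.indexOf(c)
        require(idx >= 0, s"not a nucleotide: $c")
        (acc << 2) | idx
      }
      value.toByte
    }.toArray
    new String(bytes, "UTF-8")
  }
}
```

With this table, `CompactDna.encode("A")` gives `"CAAC"`, since 'A' is 01000001 in binary, and `CompactDna.decode` reverses it. Working on UTF-8 bytes rather than on `Char` values means non-ASCII input round-trips as well, which the table-based encoder above does not handle.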