Convert an RNA secondary structure (in dot-bracket notation) to a forest in pgf/tikz using Haskell
Data for dot-bracket notation (a.k.a. Vienna notation)
> type Viennachar = Char
> type RNAchar = Char
Data for tree structure
> data Tree a = N a (Forest a) deriving (Show,Eq,Ord)
> type Forest a = [Tree a]
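To make the representation concrete, here is a small hand-built forest. It is only a sketch: the names `example` and `leaves` are hypothetical and not part of the original program, and the 'P' label follows the convention build' uses for a base pair.

```haskell
data Tree a = N a (Forest a) deriving (Show, Eq, Ord)
type Forest a = [Tree a]

-- Hand-built forest for the sequence "AGUC" with structure ".(.)":
-- one unpaired base, then a pair node 'P' holding the opening base,
-- the enclosed base and the closing base.
example :: Forest Char
example = [ N 'A' [], N 'P' [N 'G' [], N 'U' [], N 'C' []] ]

-- Collect leaf labels left to right. For a forest shaped like the one
-- above, this recovers the original sequence.
leaves :: Forest a -> [a]
leaves = concatMap go
  where
    go (N x []) = [x]
    go (N _ ts) = concatMap go ts
```

Loaded in GHCi, `leaves example` gives back the sequence the forest was built from.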
Parse dot-bracket notation and build an RNA forest
> build' :: [(Viennachar, RNAchar)] -> Forest RNAchar
> build' [] = []
> build' [('.',a)] = [N a []]
> build' [(')',a)] = error $ "structure not well parenthesized"
> build' [('(',a)] = error $ "structure not well parenthesized"
> build' [( x ,a)] = error $ show x++" symbol not allowed in vienna notation"
> build' (('(',a1):(')',a2):ps) = N 'P' [open, close] : build' ps
>   where (open, close) = (N a1 [], N a2 [])
> build' (('.',a):ps) = (N a []) : build' ps
> build' (('(',o):ps) = N 'P' ((open : inbracketsrecursion) ++ [close]) : build' trest
>   where (open, close) = (N o [], N (snd hrest) [])
>         inbracketsrecursion = build' inbrackets
>         (hrest, trest) = (head rest, tail rest)
>         -- inbrackets: symbols enclosed by this pair; rest: starts at the matching ')'
>         (inbrackets, rest) = if null l then error "closing bracket missing" else head l
>         -- candidate splits: every prefix of ps whose nesting level drops below zero
>         l = [splitAt k ps | k <- [1 .. length ps], level (take (k+1) (map fst ps)) < 0]
> build' ((')',a):ps) = error "no closing bracket expected here"
> build' (( x ,a):ps) = error $ show x++" symbol not allowed in vienna notation"
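The bracket-matching step in build' relies on a function `level` that is not defined in this section. A minimal sketch follows, assuming it returns the net nesting depth of a structure string (each '(' counts +1, each ')' counts -1, anything else 0). Under that assumption, the first prefix of the remaining input whose level is negative ends exactly at the bracket that closes the '(' just consumed.

```haskell
type Viennachar = Char

-- Assumed definition: net nesting depth of a dot-bracket string.
level :: [Viennachar] -> Int
level = sum . map weight
  where
    weight '(' = 1
    weight ')' = -1
    weight _   = 0
```

For example, after reading the opening bracket of "(..)", the remaining input is "..)". Its prefixes ".." and "..)" have levels 0 and -1, so the split happens in front of the final ')'.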
Print an RNA forest as a list of lines in pgf/tikz tree format
> pprint :: [Tree RNAchar] -> [String]
> pprint ((N x (a:as)):bs) = ("\\begin{scope}[xshift=1em]"):("\\node {"++ [x] ++ "}"):(pprint' (a:as)) ++ ["; "] ++ pprint bs ++ ["\\end{scope}"]
> pprint ((N x []):bs) = ("\\begin{scope}[xshift=1em]"):("\\node {"++ [x] ++ "};") : pprint bs ++ ["\\end{scope}"]
> pprint [] = []
Print the subtrees of a node as nested tikz child entries
> pprint' :: [Tree RNAchar] -> [String]
> pprint' ((N x (a:as)):bs) = ("child { node {"++ [x] ++ "} "):(pprint' (a:as)) ++ ["} "] ++ pprint' bs
> pprint' ((N x []):bs) = ("child { node {"++ [x] ++ "} "):"} ": pprint' bs
> pprint' [] = []
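The lines produced by pprint are \node and scope commands, which are meant to sit inside a tikzpicture environment. The helper below is a hypothetical addition, not part of the original program: a sketch of how the emitted lines could be wrapped so they can be pasted directly into a LaTeX document.

```haskell
-- Hypothetical helper: surround tikz lines with a tikzpicture
-- environment and indent the body by two spaces.
wrapTikz :: [String] -> [String]
wrapTikz body =
  ["\\begin{tikzpicture}"] ++ map ("  " ++) body ++ ["\\end{tikzpicture}"]
```

It could be applied between pprint and unlines, for instance `putStr (unlines (wrapTikz (pprint forest)))` for some forest value.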
Test data as (sequence, structure) pairs
> test0, test :: (String, String)
> test0 = ("GGGGGGCCGCCGGGGAAAAA", "..((..)).((..)).....")
> test = ("GGGGGCGCCGGGG", "..(..).((..))")
Main function: takes a (sequence, structure) pair and prints the pgf/tikz code
> dotBracketToTikz s = putStr $ unlines $ pprint (build (snd s) (fst s))
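The main function calls `build`, which is not defined in this section. Judging from the call, it takes the structure first and the sequence second and presumably pairs them up for build'. The sketch below shows only that pairing step; the name `pairUp` and the length check are assumptions added for illustration, since a plain zip would silently drop the surplus of the longer string.

```haskell
type Viennachar = Char
type RNAchar = Char

-- Hypothetical helper: pair each structure symbol with its base,
-- rejecting inputs whose lengths differ.
pairUp :: [Viennachar] -> [RNAchar] -> Either String [(Viennachar, RNAchar)]
pairUp vs rs
  | length vs /= length rs = Left "structure and sequence differ in length"
  | otherwise              = Right (zip vs rs)
```

With this helper, one possible definition would be `build vs rs = either error build' (pairUp vs rs)`.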
Example call in GHCi or Hugs
dotBracketToTikz test0
