@ejconlon
Last active December 15, 2016 04:32
Simple data science operations in Haskell

Simple tasks

  1. Read a table of data from disk

Cassava: exploratory CSV parsing

Frames: known, typed CSV parsing

Binary: ad hoc decoders for more specific binary formats
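As a rough illustration of the exploratory Cassava route, the sketch below decodes every row into a list of raw fields without declaring a schema. The file name `sample.csv` and the helper name `parseRaw` are made up for this example; it assumes the `cassava`, `bytestring`, and `vector` packages.

```haskell
{-# LANGUAGE OverloadedStrings #-}

import qualified Data.ByteString as B
import qualified Data.ByteString.Lazy.Char8 as BL
import qualified Data.Csv as C
import qualified Data.Vector as V

-- Exploratory parse: every row (including the header line) becomes a
-- list of raw, untyped fields. `parseRaw` is a hypothetical helper name.
parseRaw :: BL.ByteString -> Either String (V.Vector [B.ByteString])
parseRaw = C.decode C.NoHeader

main :: IO ()
main = do
  -- Write a tiny sample file first so the example is self-contained.
  BL.writeFile "sample.csv" "name,age\nada,36\nalan,41\n"
  contents <- BL.readFile "sample.csv"
  case parseRaw contents of
    Left err   -> putStrLn ("parse error: " ++ err)
    Right rows -> mapM_ print (V.toList (V.take 5 rows))
```

Once the columns are understood, the same file could be decoded into a typed record with `C.decodeByName` instead.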

  2. Show a few rows of that table (HTML rendering)

Cassava: something like

import qualified Lucid as L
import qualified Data.Csv as C
render :: (C.DefaultOrdered a, C.ToNamedRecord a) => [a] -> L.Html ()

Frames: probably something analogous for Vinyl records
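One possible body for the Cassava `render` signature above, as a sketch only. The `Person` record, the helper names, and the cap of 10 rows are assumptions made for illustration; it relies on `cassava`, `lucid`, `unordered-containers`, `text`, and `vector`.

```haskell
{-# LANGUAGE DeriveGeneric #-}
{-# LANGUAGE ScopedTypeVariables #-}

import qualified Data.ByteString as B
import qualified Data.Csv as C
import qualified Data.HashMap.Strict as HM
import Data.Maybe (fromMaybe)
import qualified Data.Text.Encoding as TE
import qualified Data.Text.Lazy as TL
import qualified Data.Text.Lazy.IO as TLIO
import qualified Data.Vector as V
import GHC.Generics (Generic)
import qualified Lucid as L

-- Hypothetical row type, only here to exercise the renderer.
data Person = Person { name :: String, age :: Int } deriving (Generic)

instance C.ToNamedRecord Person
instance C.DefaultOrdered Person

-- Render the first few rows as an HTML table, columns in header order.
render :: forall a. (C.DefaultOrdered a, C.ToNamedRecord a) => [a] -> L.Html ()
render rows = L.table_ $ do
  L.tr_ (mapM_ (L.th_ . cell) names)
  mapM_ renderRow (take 10 rows)  -- the cap of 10 rows is arbitrary
  where
    -- The argument to headerOrder is only used for its type.
    names :: [B.ByteString]
    names = V.toList (C.headerOrder (undefined :: a))

    -- Cassava fields are bytes; assume they are UTF-8 encoded.
    cell :: B.ByteString -> L.Html ()
    cell = L.toHtml . TE.decodeUtf8

    renderRow :: a -> L.Html ()
    renderRow row =
      let fields = C.toNamedRecord row
          field n = fromMaybe B.empty (HM.lookup n fields)
      in L.tr_ (mapM_ (L.td_ . cell . field) names)

main :: IO ()
main = TLIO.putStrLn (L.renderText (render [Person "ada" 36, Person "alan" 41]))
```

`L.renderText` produces lazy `Text`, which can be written to a file or handed to whatever is displaying the HTML.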

  3. Examine types and distributions of columns

statistics: mean, variance, etc
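A minimal sketch of summarizing one numeric column with the `statistics` package, assuming the column has already been loaded into an unboxed vector. The `Summary` record and `summarize` are hypothetical names for this example.

```haskell
import qualified Data.Vector.Unboxed as U
import qualified Statistics.Sample as S

-- Hypothetical summary record for a single numeric column.
data Summary = Summary
  { sMean     :: Double
  , sVariance :: Double  -- maximum-likelihood variance (divides by n)
  , sUnbiased :: Double  -- unbiased variance (divides by n - 1)
  , sMin      :: Double
  , sMax      :: Double
  } deriving (Show)

-- Assumes a non-empty column: U.minimum and U.maximum fail on empty input.
summarize :: U.Vector Double -> Summary
summarize xs = Summary
  { sMean     = S.mean xs
  , sVariance = S.variance xs
  , sUnbiased = S.varianceUnbiased xs
  , sMin      = U.minimum xs
  , sMax      = U.maximum xs
  }

main :: IO ()
main = print (summarize (U.fromList [2, 4, 4, 4, 5, 5, 7, 9]))
```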

  4. Map, filter, and project to a new table
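The notes do not name a library for this step. As a sketch, treating a table as a plain `Data.Vector` of records is often enough; the `Person` type and the `adultsNextYear` function are invented for illustration.

```haskell
import qualified Data.Vector as V

-- Hypothetical row type standing in for a parsed table.
data Person = Person
  { name :: String
  , age  :: Int
  , city :: String
  } deriving (Eq, Show)

-- Filter to adults, then project to (name, age next year),
-- dropping the city column.
adultsNextYear :: V.Vector Person -> V.Vector (String, Int)
adultsNextYear =
  V.map (\p -> (name p, age p + 1)) . V.filter ((>= 18) . age)

main :: IO ()
main = mapM_ print (V.toList (adultsNextYear people))
  where
    people = V.fromList
      [ Person "ada" 36 "London"
      , Person "tim" 12 "Leeds"
      , Person "alan" 41 "Manchester"
      ]
```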

  5. Plot column values: histograms, lines, etc

Chart, Diagrams can render to PNG, SVG
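For example, a line plot rendered to SVG through Chart's diagrams backend might look like the sketch below. It assumes the `Chart` and `Chart-diagrams` packages; the data, title, and output file name are placeholders.

```haskell
import Graphics.Rendering.Chart.Backend.Diagrams (toFile)
import Graphics.Rendering.Chart.Easy

-- Placeholder data standing in for a column of the table.
points :: [(Double, Double)]
points = [(x, sin x) | x <- [0, 0.1 .. 2 * pi]]

main :: IO ()
main = toFile def "column.svg" $ do
  layout_title .= "Column values"
  plot (line "sin x" [points])
```

Swapping the import for `Graphics.Rendering.Chart.Backend.Cairo` (package `Chart-cairo`) should allow PNG output with the same plotting code, at the cost of a native Cairo dependency.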

  6. Show images if data is visual

JuicyPixels can interpret array data as images
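A sketch of that idea: treat a row-major vector of values in [0, 1] as a grayscale image and write it as a PNG. The `toGray` helper, the layout convention, and the output file name are assumptions for this example.

```haskell
import Codec.Picture (Image, Pixel8, generateImage, imageWidth, pixelAt, writePng)
import qualified Data.Vector.Unboxed as U

-- Interpret a row-major vector of values in [0, 1] as an 8-bit
-- grayscale image. Assumes the vector holds at least w * h values.
toGray :: Int -> Int -> U.Vector Double -> Image Pixel8
toGray w h values = generateImage pixel w h
  where
    pixel :: Int -> Int -> Pixel8
    pixel x y = round (255 * clamp (values U.! (y * w + x)))

    -- Keep out-of-range values from wrapping around in Word8.
    clamp :: Double -> Double
    clamp = max 0 . min 1

main :: IO ()
main = writePng "gradient.png" (toGray 16 16 gradient)
  where
    gradient = U.generate 256 (\i -> fromIntegral i / 255)
```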

Haskell4Mac Playgrounds

Haskell for Mac (hfm) helps with a lot of this. Here are the positives and negatives:

(+) Can render visual results in playground

(-) Horizontal presentation (term type result) is more difficult to read than vertical (term \n type \n result)

(-) Need to write results to file in addition to displaying in the playground (HTML or PDF export of playground would be nice)

(+) Has Lucid/Blaze HTML rendering

(-) Embedded views cannot be resized manually, making text flow weirdly for compressed tables

(+) Has image rendering w/ JuicyPixels

(-) Not straightforward to render charts. The pure-Haskell diagrams backend is slow; are the Rasterific or Cairo backends better?
