Skip to content

Instantly share code, notes, and snippets.

@rntz
Last active August 7, 2017 01:14
Show Gist options
  • Star 0 You must be signed in to star a gist
  • Fork 0 You must be signed in to fork a gist
  • Save rntz/4af5d185663180188c640526bbbf25cd to your computer and use it in GitHub Desktop.
Save rntz/4af5d185663180188c640526bbbf25cd to your computer and use it in GitHub Desktop.
-- https://stackoverflow.com/questions/38831961/what-declarative-language-is-good-at-analysis-of-tree-like-data
-- This is my attempt to implement your examples in a suitably extended
-- hypothetical version of Datafun.
-- 1. Currently Datafun only has sets, while SQL has bags/multisets. Some of your
-- queries would be nicer if Datafun supported bags; for example, with bags we
-- can express a `mean` function's type as:
--
-- mean : Bag Float -> Float
--
-- With only sets, we must instead use (where {a} is the type for "a set of
-- values of type a"):
--
-- mean : (a -> Float) -> {a} -> Float
--
-- where (mean f s) returns the mean of the value of `f` across every value in
-- the set `s`.
--
-- Bags make a lot of aggregations nicer, so we want to add them Datafun
-- eventually, but that's a few papers down the line yet.
-- 2. Datafun also has no way to sort sets (which your third example uses to pick
-- the maximum two elements). As with aggregations and bags, we'd like to add
-- sorting and lists to add to Datafun but haven't yet.
--
-- Taking the first two elements of a list also poses a problem: what if the
-- list has < 2 elements? Datafun has no runtime errors, and I don't see this
-- changing. We'd probably use a Maybe type to handle this.
-- 3. Finally, there's an even deeper problem with sorting and finding maxima,
-- which is determinism: when sorting a set by a certain key, how do we order
-- elements with the same key? When finding a maximum (or two) by a certain key,
-- what if there are multiple maxima with the same value? I mostly ignore this
-- problem.
-- 4. So, for these examples, I'll pretend Datafun has the following types &
-- aggregation functions:
--
-- type Maybe a
-- type Float
-- type Int
-- max : (a -> Float) -> {a} ->+ {a}
-- mean : (a -> Float) -> {a} -> Float
-- maxTwo : (a -> Float) -> {a} -> Maybe (a, a)
--
-- `max` will return a set, in case there are multiple maxima (or none).
-- curly braces describe sets: {Int} is a set of Ints
-- square braces describe records/structs: [field1: type1, field2: type2, ...]
type Event = [ timestamp: Int
, missingEnergy: Float
, tracks: { [ momentum: Float
, theta: Float
, hits: { [ detector: Int
, charge: Float
, time: Float
, ... ] }
] }
, showers: { [ ... ] }
]
-- example1 : Event -> {Float}
fun example1 event =
{ x.momentum
| x in max (\t -> len(t.hits))
{ t | t in event.tracks, abs(t.theta) < 2.4 } }
-- example2 : Event -> Float
fun example2 event =
mean (\x -> x.charge)
{ h | track in event.tracks, track.momentum > 10
, hit in track.hits, hit.time < 10 }
-- example3 : Event -> Maybe Float
fun example3 event =
case maxTwo (\t -> t.momentum) event.tracks
| None -> None
| Some (one, two) ->
Some (one.theta * one.momentum + two.theta * two.momentum)
/ (one.momentum + two.momentum)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment