zudov/haskellx.org

## haskellx.org

      
    Raw
  

              haskellx.org
            
          
    Alois Cochard :: Welcome to the Machines

History of data streaming

Problem

We want:

  interleaving effects
  constant space/time
  strong compasability/reusability
  We could try the List, but it doesn’t allow effects
  We could interleave effects via monads (mapM etc), but no constant space/time then and/or lazy IO, meh
  We can write specialized loops, but that’s not reusable

Existing implementations


  eternal
lazy IO
  2008
Iteratees and Generators
    
      producers/consumers/processors are separate types
    
  
  2011
Conduit
    
      strong emphysasis on exception and resources handling
    
    
      lots of combinators and ecosystem
      used a lot, especially in the “haskell web stack”
    
  
  ???
pipes
    
      a lot like conduit in terms of functionality, but a bit more “principled”
    
  
  ???
machines
    
      doesn’t handle resource management
    
    
      constraints to simple input topology
      best asymptotic complexity (compared to mentioned above at least)
    
  
Tradeoffs


  except for resource hnadling, machines does everything that other thing do


  conduit mixes lots of things together
    
      there are purescript-machines!!!!
      ???
    
  
The Kmett’s Machines

???
Matthew Pickering :: A look inside GHC’s optimiser


  TODO: see last year talk on Core by SPJ
  Source is desuraged into Core
  Various operations are applied to the Core to optimize it:
    
      float in
      float out
      simplification
      inlining
      specialisation
    
  
  let
allocation -> more memory
  case
evaluation -> more time


  TODO: Read ‘A transformation-based optimiser for Haskell’
    
      shorter
      read first
    
  
  TODO: Read ‘Compilation by Transformation in Non-Strict Functional Languages’
    
      good bed reading
    
    
      many sections
    
  
Inlining

???

  recursive things don’t get inlined

Simplification


  small local transformation
    
      case of case
the shallow (top) branching of nested cases gets pushed into the deeper branches
      case of known constructor
if the body of `case` is known the whole case expression can be removed
    
  
Specializing


  the desugaring of type classes happens by passing their dictionaries as arguments
    
      which can be simplified further, especially by inlining!!!!
    
  
  specialized versions of the generic functions can be created
    
      but when it’s imported from another module, it won’t be specialized
        
          unless you tell it via INLINABLE pragma
          but this potentially call’s a lot of specialized versions all other the place
          so instead you could use SPECIALISE pragma to specialize it for specific types only
        
      
Float Out


  move computations which shouldn’t be repeated to the top-level

Float In


  enables further optimisation opportunity by pushing lets closer to use sites

Join points


  encode the info about your program into the optimizer??
  see Simon’s talk tomorrow

{#- INLINE #-} Hammer


  makes compiler VERY keen to inline the annotated thing at the call sites
  inlines precise RHS
  specialisation works across modules

Ritesh Ragavender :: Algebraic Design of DSLs

Big goal: flexible nice programs, that are yet well type-checked and safe
Phantom types


  keep value-thing isomorphic, but yet distinguish different types of things on the type level

Use cases


  type-level permissions
    
      e.g. read/write-only handles
    
  
Free monad DSLs


  clear to define and extend
  the logic of domain code is separated from the primitives
    (e.g. graph search vs graph access operations)

David Luposhainsky :: Change in an Immutable World


  Side effects are:
    
      evil
      hard
      unmaintanable
      unfunctional
      THE point of running the program
    
  
  Can mitigate the bad parts, allow side effects, but allow making it
    
      still easy to reason about
      stay maintainable
      stay functional
      still do IO
    
  
  Map to effectful primitives, maintain functonal API. Have the compiler do the dirty work

Problems

Ordered?

Normally compiler can reorder things, but with side-effects it’s not a good idea.
Solution

A thingy like state monad with ‘token’


  passes around a ‘token’ that directs side effects and keeps the state
  let requires a couple of heap allocations
    
      one way to mitigate is move it to case matches, that usually allocate on stack
    
  
  extra tuple is allocated
    
      use onboxed tuples inside the case match
    
  
Get rid of a token’s value


  make a ‘Discard’/’State#’ phantom type with no constructors
  apply it to a token type
  pass it around in the state-like thingy
  no values
  no allocations
  ordered
  RealWorld is that ‘token’
  the ‘token’ can’t really be a value, but ‘main’ provides it to the program, as an act of magic

Hierarchy of unsafety


  ‘unsafePerformIO’ :: relatively safe
    
      has implicit mutex
    
    
      will just perform the action
        
          maybe multiple times
          but won’t perform it concurrently multiple times
        
      
  ‘unsafeDupablePerformIO’ :: relattivelyyyyy sssaaffe
    
      like ‘unsafePerformIO’, but also might end up acting concurrently
    
  
  ‘inlinePerformIO’ :: quite not safe
    
      performs IO, but also might inline the action
      might result into weird arbitrary things
    
  
Conclusion


  Order computations via token
  Make token symbolic and don’t have a value for it
  RealWorld is a token and never exists therefore
  Deligate all the work to STG primitives

Julie Moronuki :: A Monoid for all Season

Logic


  ancient
propositions, Aristotelian logic
  middle ages
syllogistic mnemonics

Monoids


  binary operation
  associative
  has unit value
  commutative

Disjunction


  can be inclusive or exclusive
  analogous for booleans and binary numbers
  distributive

Sums and products


  distrbutive property is analogous to disjunction

Union and intersection

Set operations

Disjoint union vs Cartesian product

Arnauld Spiwack :: Distributed Programming with Linear Types

Garbage collection, latency, synchronisation


  GC pauses are costly. Especially in distributed settings

Generational hypothesis


  the heap is split in ‘old’ and ‘young’ generation
  ‘young’ is cheaper, so short-lived data stay there
  ‘young’ data dies sooner (hypothesis)
  ‘old’ is more expensive to collect, and causes real pauses
    
      we might want to manage this pauses manually
    
    
      but we want to do it safely
      linear types can help
    
  
Linear functions


  the functions that would consume all the arguments exactly once, however the branching goes

Malloc

v1


  malloc
Storable a => IO (tr a)
  free
Ptr a ⊸ IO ()


  do
    
      x <- malloc
      x’ <- doSomething x
    
    
      free x’
      doMore
    
  
  ^^ linear, which enforces valid sequence of malloc/free

v2


  malloc
Storable a => a -> (Ptr a ⊸ ()) ⊸ ()
  free
Ptr a ⊸ ()


  malloc init $ λx -> let x’ = doSomething x in free x’ `seq` doMore

Fusion?


  allocations in the fused code aren’t so obvious
  who is allocating?
  what if we write to mmap
  FFI?
  consumer not inlined?
  another computer?
  or maybe we know the size upfront and actually want to allocate

Destination-passing style

Andrey Mokhov :: Algebraic Graphs


  https://www.staff.ncl.ac.uk/andrey.mokhov/algebra.pdf
  https://github.com/snowleopard/alga
  http://hackage.haskell.org/package/algebraic-graphs

Problem


  graphs are quite easy to express in haskell
    
      see Data.Graph
    
  
  but enforcing the validity of the graphs on the type levnel isn’t that simple

Algebraic Graphs


  data Graph a where
    
      Empty
Graph a
      Vertex
a -> Graph a
      Overlay
Graph a -> Graph a -> Graph a
        
          union of vertices + union of edges
        
        
          (v1, e1) + (v2, e2) = (v1 ∪ v2, e1 ∪ e2)
        
      
      Connect
Graph a -> Graph a -> Graph a
        
          a union of vertices + a union of edges + new edges
        
        
          new edges are created by connecting the edges of both graphs
        
      
  this gives us a way to construct any valid graph
  overlay is like addition, commutative and associative
  connect is like multiplication, associative
  intuition: any graph expression can be broken down into an overlay of vertices and edges
  unexpectedly, graph isn’t a semiring because ‘identity’ is the same for buth ‘multiply’ and ‘add’

How is it useful


  fmap
allows to do interesting transformations, mapping one-or-more vertices to another vertice (e.g. for joining them)
  bind
allows to split the vertices
  mfilter
allows to actually filter nodes out

Cartesian graph product


  fairly easy via fmap and msum

Might be even more useful via an associated type family


  class Graph h where
    
      type Vertex g
    
    
      empty
g
      vertex
Vertex g -> g
      overlay
g -> g -> g
      connect
g -> g -> g
    
  
  this makes it even easier to define a bunch of stuff like:
    
      vertices
      clique
      edge
      star
      biclique
      isSubgraphOf
      hasEdge
    
  
Jasper Van der Jugt :: Getting Things Done in Haskell

Types module

Problems


  instances must go there too, but instances need functions, so everything gets pulled in there
  defining ad-hoc data types (to avoid algebraic blindness) might be awkward if they aren’t next
    to the actual usage

Solutions


  use domain terminology when organizing modules
  use internal modules for stuff that doesn’t have a good place to be put into

Utils module


  Just use the ‘Extended’ modules instead

Naming and qualified imports


  design for qualified imports

Impure things


  Just use sensible OOP practices
  `bracket` usage might look a bit row, so wrapping it in e.g. ‘withHandle’ function is a good idea