philandstuff/euroclojure2014.org

    EuroClojure 2014, Krakow

Fergal Byrne, Clortex: Machine Intelligence based on Jeff Hawkins’ HTM Theory


  @fergbyrne
  HTM = Hierarchical Temporal Memory
  Slides

big data


  big data is like teenage sex
    
      no one knows how to do it
      everyone thinks everyone else is doing it
      so everyone claims to be doing it
      (Dan Ariely)
    
  
machine learning is important


  people don’t trust other people
    
      they have their own agendas
    
  
  so they place too much trust in machines

asimov’s take


  we gain knowledge faster than we gain wisdom
    
      applies to human knowledge
      applies to data: gathering data is easy, drawing conclusions is
        not
    
  
a problem in neuroscience


  rate of papers published is growing exponentially
  2013: 1 every 32 minutes
  2014 so far: 1 every 17 minutes

can AI learn from neuroscience?

Jeff Hawkins’ goals in HTM


  Study the neocortex and establish its principles
  open sourced NuPIC in 2013

neocortex


  the wrinkly part at the surface of the brain
    
      grey matter: processing
      white matter: wiring
    
  
  about 2mm thick, 10cm^2 in area
  30-50MM neurons
  1G connections
  hierarchical
  uniform
    
      ie all looks physically the same
      all regions have the same algorithm
    
  
6 key principles

on-line learning from streaming data


  up to 10 million senses feed the brain
  we don’t (can’t) store this data
  we build models from live data
  models constantly updated

hierarchy of regions


  sensory data enters at the bottom
  models are built in every region
  things change more slowly as you go up
  hierarchy enables sequences of sequences
    
      seq of waves
      seq of phonemes
      seq of words
      seq of sentences
    
  
  hierarchy works upwards and downwards

sequence memory


  all sensory data involves time
  sequence memory allows predictions
  structure in data elaborated over time
  sequences can be c

sparse distributed representations


  in each region, many neurons, few active
  SDRs represent spatial patterns
  fault-tolerant, semantic ops, high-capacity
  key to understanding & building intelligent systems
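
A rough sketch of the idea in plain Clojure, assuming an SDR is modelled as the set of indices of its few active cells. The names `sdr-a`, `sdr-b`, `overlap` and `sdr-union` are made up for illustration and are not taken from Clortex or NuPIC.

```clojure
(require '[clojure.set :as set])

;; Hypothetical example data: each SDR lists the indices of the few
;; active cells out of a much larger population.
(def sdr-a #{3 17 42 99 256})
(def sdr-b #{3 17 42 100 512})

(defn overlap
  "Number of active cells two SDRs share; a crude similarity score."
  [a b]
  (count (set/intersection a b)))

(defn sdr-union
  "Union of several SDRs: one representation standing for all of them."
  [& sdrs]
  (apply set/union sdrs))

(overlap sdr-a sdr-b)            ; => 3
(count (sdr-union sdr-a sdr-b))  ; => 7
```

Losing one or two active cells only lowers the overlap slightly, which is one way to picture the fault tolerance mentioned above.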

all regions are both sensory and motor


  behaviour provides context for sensory data
  structure in model navigated via behaviour

attention


  use attention to manage the neocortex
  planning and previsualisation
  whole subhierarchies can be switched on and off

layers of neocortex


  from molecular upwards
  around 5 or 6

neurons


  distal dendrites detect coincidence of incoming activity from
    neighbouring cells
  you don’t just see what you’re seeing now, you predict what
    you’re going to see next
  (reality is much more complicated, but this algorithm is
    sufficient to explain a lot)

clortex

background: numenta’s nupic


  in dev since 2005
  partially implements HTM/CLA
  python/c++
  open source

strengths


  skilled dev team
  eat their own dog food (grok uses nupic)
  operates on subset of HTM/CLA principles
  tunable using swarming on your data
  works well on streaming scalar data (eg machine-generated)
  great community – http://numenta.org

limitations


  codebase has evolved as theory has developed
  difficult/scary to rewrite for flexibility
  OO with large, coupled, classes (~1500 LoC per class)
  need to swarm to find parameters, no real-time control
  not easy to extend beyond streaming scalar use case

clortex requirements


  directly analogous to HTM/CLA theory
  transparently understandable source code
    
      a neuroscientist should be able to read & review code
    
  
  directly observable data
  sufficiently performant
  useful metrics
  appropriate platform
    
      portability
      scalability
    
  
architectural simplicity


  first role: be useful!
  best software is that which is not needed at all
  human comprehension is king
    
      if people can’t understand your code, your code is not
        finished
      unit tests are not sufficient in themselves
    
  
  machine sympathy is queen
  software is a process of R&D
  software development is challenging & intellectual
    
      more science than engineering
        
          engineering: you have a good model already, you just have to
            plug in the particular parameters
          science: there are a bunch of unknowns which you have to
            learn & understand
        
      
#1: Just use data!


  maps, vectors, sets
  all done in a one-page datomic schema

#2: Clojure & its ecosystem


  clojure data not domain objects

#3: russ miles’ life preserver


  everything either “core” or “integration”
  core: a datomic database for the neocortex
  core: each “patch” of neurons is a graph
  integration: algorithms, encoders, classifiers, SDRs

key clj libs & tools


  datomic (+adi)
  quil/processing
  incanter
  lein-midje-doc for literate documentation
  hoplon-reveal-js for presentations
  lighttable

review


  Big Data isn’t just a Machine Intelligence problem
  HTM is exciting

links


  http://numenta.org
  http://inbits.com
  https://github.com/fergalbyrne/clortex
  writing a leanpub book

Logan Campbell, Clojure at a Post Office

history:


  was at clojure user group
  a guy turns up and says he’s hiring a team of clojure developers
  he was at Australia Post
    
      a million lines of Java worked on by a team in India
      wanted to bring it back in-house
    
  
project: digital mailbox


  big companies spend a lot of money sending out bills & junk mail
  product to seamlessly replace that workflow
  switch from physical mail to cheaper model
  consumer can sign up to receive water bill online
  I was brought on as the “clojure expert”
    
      (I’d been playing with it for a couple of years)
    
  
  drama:
    
      the people they could hire:
        
          really experienced java devs
          keen on FP
        
      
      they said as they were hiring “you might be doing clojure or
        you might be doing scala”
      first few people were scala fans
      scala v clojure battles
        
          “we need static typing”
          “we need OO for domain modelling”
          “clojure is slow” (?)
          “what framework do you use?”
        
      
  “we need static typing? okay, we’ll use core.typed”
  domain modelling:
    
      when people are used to domain modelling in OO, telling them to
        just use maps feels like a cop-out
      records + protocols kind of feel like classes
      it wasn’t until I showed them code I’d written and compared it
        with their code that they realized that you can just use maps
    
  
  online scala course
    
      we did it as a team
      I also did the exercises in clojure
      did one exercise three different ways in clojure
        
          conditional
          match
          stream processing
        
      
      showed them my solutions
        
          they already understood the problems because they’d solved
            them themselves
        
      
  clojure performance was a surprise, because I’d come from ruby (!)
    
      clojure is fast
      there was an underlying feeling that “we need scala for
        performance”
    
  
  I’m a consultant, so was happy for the team to make the language
    decisions
    
      “if you’re keen on scala, let’s find out a way to pitch it to
        management”
    
  
  web stack: kept hearing “async async async”
    
      felt like premature optimization
      but still we used http-kit
        
          benchmark started to allay fears that clojure was slow
        
      
feature: make a payment on a bill


  not necessarily a full payment

    POST /bills/:bill-id/payments
    Session: user-id
    Post Data: amount


  GET credit card token for user
    
      POST request to payment gateway
    
  
  GET how much left to be paid
  if payment succeeds: display amount remaining
  if payment fails: display error

candidate solutions


  synchronous promises
  promise monad
  lamina
  etc etc

solution 0: synchronous


  http-kit’s requests return a promise
    
      just @deref the promise (blocks the thread)
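
A minimal sketch of the synchronous shape in plain Clojure. `fake-request` is a hypothetical stand-in for an HTTP client call that returns a promise immediately; the tokens, amounts and response maps are invented for illustration.

```clojure
;; Hypothetical stand-in for an HTTP call: returns a promise straight
;; away and delivers the response from a background thread.
(defn fake-request [response]
  (let [p (promise)]
    (doto (Thread. ^Runnable (fn []
                               (Thread/sleep 50) ; pretend network latency
                               (deliver p response)))
      (.setDaemon true)
      (.start))
    p))

(defn pay-bill-sync
  "Each @ blocks the calling thread until that response has arrived,
  so the three requests run strictly one after another."
  [amount]
  (let [token   @(fake-request {:token "tok-123"})
        payment @(fake-request {:status :ok :token (:token token)})
        balance @(fake-request {:remaining (- 100 amount)})]
    (if (= :ok (:status payment))
      {:remaining (:remaining balance)}
      {:error "payment failed"})))

(pay-bill-sync 30) ; => {:remaining 70}
```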
    
  
solution 1.1: promise monad


  do is aware of promises
    
      doesn’t block thread, but waits for promise to be executed
        before continuing
      felt natural way to write with promises
      but incorrect: too much waiting, no concurrency
    
  
solution 1.2: promise monad let/do


  let to define promises
    
      do to pseudo-block on them
      introduces correctness but reduces readability
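
The same "define first, wait later" shape can be sketched with plain promises rather than a promise monad. This is not the code from the talk: `fake-request` is a hypothetical stand-in for an HTTP call, and which requests are independent is an assumption made for the example.

```clojure
;; Hypothetical stand-in for an HTTP call that returns a promise.
(defn fake-request [response]
  (let [p (promise)]
    (doto (Thread. ^Runnable (fn []
                               (Thread/sleep 50) ; pretend network latency
                               (deliver p response)))
      (.setDaemon true)
      (.start))
    p))

(defn pay-bill
  "Starts the two independent requests in the let, then waits only
  where a value is actually needed."
  [amount]
  (let [token-p   (fake-request {:token "tok-123"})   ; both requests are
        balance-p (fake-request {:outstanding 100})]  ; in flight from here
    ;; the payment needs the token, so this is the first place we wait
    (let [payment @(fake-request {:status :ok :token (:token @token-p)})]
      (if (= :ok (:status payment))
        {:remaining (- (:outstanding @balance-p) amount)}
        {:error "payment failed"}))))

(pay-bill 30) ; => {:remaining 70}
```

The balance lookup overlaps with the token lookup and the payment, so the total wait is roughly two round trips instead of three.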
    
  
solution 1.3: let/do/do


  okay, let’s step away from monads

solution 2: (?)

solution 3: raw promises


  when to explicitly wait for a particular promise

solution 4: raw callbacks


  not viable
  would have just written a hacky little promise library

solution 5: core.async:


  great! same shape as synchronous code, but correct concurrency

solution 6: lamina


  didn’t feel totally suited to the situation

solution 7: meltdown (LMAX disruptor based)


  not appropriate

solution 8: pulsar promises


  looks exactly the same as the synchronous code, except for one
    character
  pulsar rearranges your code at the bytecode level
    
      uses JVM agents (normally used for tracing/debugging)
    
  
  pass a fn to one of pulsar’s functions
    
      turns synchronous code to async code
    
  
solution 9: pulsar actors


  not appropriate

winners


  0: synchronous
  5: core.async
  8: pulsar

scala solution, for comparison


  scala futures (basically promises)
  all monadic
  I don’t understand it entirely
  concise
  battle of the benchmarks, fastest first
    
      pulsar-async
      pulsar-sync
      core-async
      raw-callback
      scala-play-future (significantly less than others)
    
  
CQRS (command-query responsibility segregation)


  want fast reads
  reduce number of queries
  don’t want to have to update write code every time we add a new
    reader

structure


  service A → cassandra → service B
  custom triggers in cassandra in clojure (just drop in the .jar!)
    
      publish to rabbitmq
      notify index maintainer
      write index to cassandra
      service B reads from cassandra
    
  
cassandra triggers


  can just throw the clojure jar in there
  everything is byte buffers
    
      you need to know the type of all the fields out-of-band
      not self-describing data at all
    
  
microservices


  I thought we would have a user service and a provider service and
    a mail service
    
      but this gets tricky when you want data about users and providers
    
  
  you need to split things much more fine grained
  user service →
    
      authentication
      multi-factor auth
      authorization
      user profile
      password reset
        
          does it belong in user profile?
          there’s a bit of workflow here
            
              send out email
              get user to click link
              enough to warrant its own service
            
          
  drama: needed to talk to systems team to deploy
    
      I did things badly
      I didn’t get anything into production in my 6 months there
      systems team: we need monitoring and config and stuff
        
          if we’d had something early on which had gone through these
            barriers, we would have had much less stress
          benchmarks end petty arguments
        
      
Q&A

can you share some experience with monitoring & resilience?


  appdynamics
  classnames are expected to be java-style class names
    
      clojure ones are close enough
    
  
  clj-metrics to expose more high-level metrics
    
      requests/second from ring
      number of bills paid
      appdynamics could pick it up from jmx
    
  
  nomad for configuration

with http-kit+core.async, what happens when server dies and there’s loads of threads?


  bottleneck was amount of memory
  when server runs out, it slows down a lot
  way to get around that is to monitor resources on your machine
    and ideally have autoscaling

were the scala guys finally writing clojure in the end?


  we have one person still hardcore for scala, but sees the merits
    of clojure
  a few who did the online scala courses are clojure folks now
  people who come from the java world of static typing feel they
    need that
  but now they’ve written code that actually works, they’re more
    comfortable with that now

Tom Hall, Escaping DSL Hell by having parens all the way down


  @thattommyhall

DSLs


  languages made for specific purposes
    
      config mgmt
      science
      learning
    
  
  distinction between:
    
      internal DSLs: embedded in another language
      external DSLs: implemented in another language
    
  
problems with puppet


  zen of python:
    
      namespaces are a honking great idea, let’s do more of them!
    
  
puppet namespaces


  Exec[‘install’] in two different modules will result in a
    naming collision
  fail :(
  end up with Exec[‘tom::install’] but this is a hack

iteration


  file type lets you pass in an array
  nagios_host doesn’t
  iteration is responsibility of type, not language
    
      as far as I know
    
  
but you need to know ruby anyway


  if you want to extend puppet, you need ruby
  if you need to know ruby, why do we bother with the puppet DSL
    in the first place?

experimental features: lambdas and iteration


  any language where lambdas arrive late is not a good language

ansible


  just YAML
    
      oh wait, I might want to iterate
      oh wait, I’ve got embedded Jinja templates in my YAML strings
        
          what’s the scope of names in my templates?
        
      
if you give people a “language” they will expect loops


  maybe lambdas
  probably namespaces
  this has been done before

chef gets it right


  it’s embedded in ruby
  you get iteration and namespaces from ruby

teaching people to program


  if you design a language:
    
      you need a parser, which is hard
      you need an interpreter/compiler, which is hard
    
  
  if you embed it, you get that stuff for free

geomlab


  minimal language for teaching
  talks about pictures
  intro to FP
  gets you into recursion early on
  man $ woman - “next to”
  man & man - “on top of”
  (man $ woman) $ tree = man $ (woman $ tree)
  man $ (woman & tree) – scales nicely to get a nice aspect ratio
  learn about operator precedence
  de morgan’s laws
    
      although not always held, due to scale
    
  
  define functions

   define manrow(n) = manrow(n-1) $ man when n>1
                    ~ manrow(1) = man


  builds up to an escher tiling
  but once you’ve done that, where do we go?
    
      only exists in this sim
      if you want to extend it, you need java
      “I’m really excited about FP now, but I’ve got nowhere to go”
    
  
what if we did it in clojurescript?


  let’s use ‘below and ‘beside instead of $ and &
  (below man woman)
  (beside tree star)
  http://cljsfiddle.net/fiddle/thattommyhall.geomlab.demo
  let’s say I want to change man – what does it mean?
    
      it’s implemented in the same sort of language
      I can see there’s a url in there where I fetch an image from
        the internet
      I know recursion, because I learned that from the geomlab
        exercises
      I can extend the language itself
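
A small sketch of why embedding helps, with pictures modelled as plain data. `beside`, `below` and `manrow` here are illustrative definitions, not the API of the cljsfiddle demo, and nothing is drawn.

```clojure
;; Pictures as nested vectors; a renderer could walk this structure.
(defn beside [left right] [:beside left right])
(defn below  [top bottom] [:below top bottom])

(defn manrow
  "n copies of :man side by side, built recursively like the geomlab
  manrow exercise."
  [n]
  (if (> n 1)
    (beside (manrow (dec n)) :man)
    :man))

(manrow 3)
;; => [:beside [:beside :man :man] :man]

(below (manrow 2) :tree)
;; => [:below [:beside :man :man] :tree]
```

Because the combinators are ordinary functions, a learner can add a new one with `defn` instead of touching a parser or interpreter.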
    
  
science languages


  R
  wolfram alpha
  maple
  matlab
  these things just aren’t very good languages, even if they are
    good at their domain

another problem with DSLs


  netlogo
    
      http://ccl.northwestern.edu/tortoise/2013-10-25/Ants.html
    
  
  If you’re based on applets, and Oracle drops applet support, you
    find you need to port your whole language to a new platform (in
    this case javascript)
  again, reimplement in clojurescript?
    
      anyone interested in hacking on this with me?
    
  
conclusion


  you probably don’t need to make a new language
  if you do it will probably be rubbish
    
      at least for a while
    
  
  think about power and reach
  you should embed /deeply/ into clojure

links


  http://twitter.com/otfrom
  http://cljsfiddle.net/fiddle/thattommyhall.geomlab.core
  http://cljsfiddle.net/fiddle/thattommyhall.geomlab.demo
  http://cljsfiddle.net/fiddle/thattommyhall.geomlab.bruce
  http://www.complexityexplorer.org/
  http://cljsfiddle.net/fiddle/thattommyhall.ants.core
  http://ccl.northwestern.edu/tortoise/2013-10-25/Ants.html

Q&A

what makes a good first language?


  clojure needs a better day 0 story
  at some coder dojos where I’ve taught kids, some don’t even know
    about files and folders
    
      so if you say “open a terminal, cd into a directory” you’ve
        lost them
        
          and it’s not their fault
        
      
have you had any kids look at your examples here?


  I’ve done the geomlab example
  otherwise this is all a recent exploration
  errors in cljsfiddle are not reported well
    
      again problematic for day zero
    
  
Mathieu Gauthron, JVM-breakglass


  using a clojure REPL to troubleshoot live java/JVM processes
  http://slides-euroclojure2014.matlux.net
  when you see fire, you break glass
  when your jvm process is on fire, you use JVM-breakglass

troubleshooting a java application


  debugger
    
      only powerful when you can narrow down the problem to a series
        of breakpoints
      when the problem is a race condition, it will change the nature
        of the problem you’re studying
    
  
  log/print statements
    
      you need to plan before compilation
      when the problem is in production, it might be too late
    
  
  jmx
    
      again, you need to plan for it in advance
    
  
  ad-hoc interactive mechanism

what is jvm-breakglass


  open source
  integrates with any jvm process
  console onto a jvm process

main features


  interactive prompt
  see inside private members
  call arbitrary methods
  create new object instances
  create new classes
  monitor object state
  no need to use clojure to develop the app

how does it work?


  jvm-breakglass runs inside the JVM and starts an nrepl server
  you can then connect using an nrepl client (eg lein)

how to use it?


  add it to your maven dependencies
  add an entry point (as a <bean> or in java code)
  connect with lein repl :connect localhost:1112

demo (enterprise application)


  tomcat JVM
  employee/dept data structure
  report generation
  java/spring mvc webapp
  jvm-breakglass
  spring data
    
      in XML, naturally
    
  
homepage


  oh no! one of the reports isn’t working?
  “list employees in london” is empty
    
      but we know that employee Mick Jagger lives in london
      what’s going on?
    
  
breakglass to the rescue


  view environment:
    
      current directory, System/getProperties
      view conf directory
    
  
  list all loaded Spring beans
  introspect into object private members
    
      bean builtin fn
      to-tree to do so recursively
    
  
  view methods or fields for a given object
  redefine a class
    
      in this case, (proxy [Address] ["1 Mayfair", "SW1", "London"]
        (getCity [] "London")) to define the new version, overriding
        a method
      (.setAddress (:Mick employees) address) to inject it into
        the live data
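
A minimal sketch of the two Clojure built-ins mentioned above, runnable in any REPL without jvm-breakglass. `java.net.URI` is only a stand-in for an application object, and `patched` is a made-up name. Note that `bean` on its own reads public getters only.

```clojure
;; Stand-in for some object living inside the application.
(def uri (java.net.URI. "http://example.com/reports?city=London"))

;; `bean` exposes an object's JavaBean getters as a read-only map.
(select-keys (bean uri) [:host :path :query])
;; => {:host "example.com", :path "/reports", :query "city=London"}

;; `proxy` creates an instance of an anonymous subclass that overrides
;; the listed methods; here toString on a plain Object.
(def patched
  (proxy [Object] []
    (toString [] "London")))

(str patched) ; => "London"
```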
    
  
take a step back


  remember what it’s like to be a java programmer?
  working with jmx beans and suchlike to try to understand why
    production is down
  this stuff looks like magic

Q: how do you convince production people to put nrepl server in place?


  short answer: impossible
  that’s not how you present it
  either you do it sneakily (that’s bad), and only pull the trump
    card when the team is desperate
  or you convince the team that it would be useful in the UAT
    environment, and “of course it’s never going to be used in
    production”

Q: have you considered a high-level switch that would prevent you mutating anything in the host application?


  don’t know how you’d be able to do that
  have been thinking about it
  maybe using clojail
  kind of defeats the point

Q: have you tested this with a scala app?


  haven’t tried
  I’ve reverse-engineered the java bytecode, and it’s readable
  as long as you know how it compiles, it seems reasonable

Q: you were using methods like get-obj and passing string name. how does breakglass know which object to get?


  eg if you have multiple instances of Department, how does it know which department?
    
      in Spring it’s a Spring bean which is named
      if you’re not using Spring, what’s your entry point?
        
          when you create your NreplServer to enable jvm-breakglass,
            you can add your entry points there
          new NreplServer(port).put("department", myObject);
          static methods & fields can be used too
        
      
Gary Crawford, Using Clojure for Sentiment Analysis of the Twittersphere


  slides: http://www.slideshare.net/garycrawford/using-clojure-for-sentiment-analysis-of-the-twittersphere-euroclojur
  leiningen versus the ants, carl stephenson
  leiningen versus apache ant?
  clojure versus java?
  FP versus OO?

stratified medicine


  determine the best treatment for someone based on their genetic
    makeup to manage their chronic disease

sentiment analysis


  Paper: “Twitter mood predicts the stock market”
    
      predicted Dow Jones average through monitoring tweets
    
  
  people who suffer chronic disease tend to be immunocompromised
    
      what would normally be a minor illness can prove fatal
    
  
  can we use twitter to predict spread of disease?

so we tried


  score tweets for flu symptoms
  the data science wasn’t very difficult
    
      but scaling it was
    
  
  30 million geo-tagged tweets sent from UK
  couldn’t scale, even with
    
      HDFS/hadoop
      mongo/aggregation
      mongo/mapreduce
      postgres
    
  
how can we do fast, real-time analytics of social media?


  application: how do people feel about Scotland’s independence
    referendum?
  data increases in value as we analyse it
    
      tweets
      analytically prepared data
      analysis
      insight
      predictions
    
  
  the raw data isn’t what you care about
  don’t store the raw tweets, only store the analytically prepared
    data
  stored in redis using ptaoussanis/carmine
    
      it has great support for bitmaps
    
  
example


  (car/setbit sentiment tweet-id 1)
  (car/bitcount "SCOTLAND") – tells me how many tweets have
    mentioned Scotland
  how many people in england are happy?

(wcar*
 (car/bitop "AND" "ENGLAND&JOVIALITY" "ENGLAND" "JOVIALITY")
 (car/expire "ENGLAND&JOVIALITY" 10) ;; don't keep the data longer than 10 seconds
 (car/bitcount "ENGLAND&JOVIALITY"))

  further: “how many people in Scotland are tired or grumpy?”
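
To make the bitmap idea concrete without a Redis server, here is a sketch using `java.util.BitSet` as an in-memory stand-in: one bit per tweet id, one bitmap per attribute. The helper names and tweet ids are invented; this is not carmine's API.

```clojure
(import 'java.util.BitSet)

(defn bitmap
  "A BitSet with the given tweet ids set; stands in for one Redis key."
  [tweet-ids]
  (let [bs (BitSet.)]
    (doseq [id tweet-ids]
      (.set bs (int id)))
    bs))

(defn bit-and-count
  "Roughly BITOP AND followed by BITCOUNT, leaving the inputs untouched."
  [^BitSet a ^BitSet b]
  (let [^BitSet result (.clone a)]
    (.and result b)
    (.cardinality result)))

;; Hypothetical data: tweet ids tagged as from England / as jovial.
(def england   (bitmap [1 2 3 5 8]))
(def joviality (bitmap [2 3 4 8 9]))

(bit-and-count england joviality) ; => 3 (tweets 2, 3 and 8)
```

An "or" query such as "tired or grumpy" would follow the same pattern with `.or` in place of `.and`.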

getting the data in


  adamwynne/twitter-api
  you can specify you only want tweets from a certain geographical
    locality with a bounding box
    
      but this is literally a rectangle
      need it around Europe
    
  
  LMAX-Exchange/disruptor to communicate
    
      journaling
      syncing
    
  
  business logic

what sentiment?


  this is hard!
  “I’m loving #EuroClojure! :D”
  Positive Affect: enthusiastic, active, alert
  Negative Affect: subjective distress
  actually two separate dimensions, not opposites
  Watson et al, 1988
  PANAS
  then PANAS-x
  then PANAS-t
    
      accounts for bias on social media
      outlines sanitisation
      validate against 10 real events
    
  
sanitisation


  https://github.com/dakrone/clojure-opennlp
  get rid of spam
  account for text speak
  account for emoticons and emoji
  word stemming (or lemmatisation)
  part of speech tagging

where? reverse geocoding


  don’t want to rely on external services
  don’t want heavy IO
  don’t want round trips to database
  accuracy not too much of a concern
    
      we already lose accuracy in interpreting the sentiment of the
        tweet
    
  
  convert a map of the uk to colours:
    
      look up geocode coords in map
      check colour → get country code
    
  
  problem: the world is a sphere
    
      projecting a sphere onto a rectangle
    
  
  prior art in d3.js
  use JavaFX to exploit it

when?


  there’s a lot of seconds in a day
  and even more seconds in a year
  really not interested in seconds anyway
  want to group tweets by minute
  and also group by hour
  and also group by day, and month, and year
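
One possible way to do this bucketing, sketched with `java.time` from Clojure. `bucket` and the sample timestamps are made up. `Instant` can only be truncated to units up to a day, so month and year buckets would need a calendar-aware type and are left out here.

```clojure
(import '(java.time Instant)
        '(java.time.temporal ChronoUnit))

(defn bucket
  "Truncate an ISO-8601 timestamp string to the start of its
  minute, hour or day (UTC)."
  [^String timestamp ^ChronoUnit unit]
  (str (.truncatedTo (Instant/parse timestamp) unit)))

;; Hypothetical tweet timestamps.
(def tweet-times ["2014-06-26T14:37:52Z"
                  "2014-06-26T14:37:05Z"
                  "2014-06-26T14:38:10Z"])

;; Tweets per minute: the seconds are discarded before counting.
(frequencies (map #(bucket % ChronoUnit/MINUTES) tweet-times))
;; => {"2014-06-26T14:37:00Z" 2, "2014-06-26T14:38:00Z" 1}

(bucket (first tweet-times) ChronoUnit/HOURS) ; => "2014-06-26T14:00:00Z"
```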

why?


  why are we doing this?
  online social media are surveillance
  the line between public and private is becoming blurred
  if we don’t need data, we shouldn’t collect it
    
      in this example:
        
          we’re never more granular than country
          we’re never more granular than overall sentiment
          we’re never more granular than minute
        
      
      hopefully this is enough to prevent anyone being identified
    
  
  Datensparsamkeit

Q: have you used Storm for this?


  no

Q: any preliminary results on the Scotland referendum analysis?


  I’ve had more luck with tech than data science

Q: which way should we vote?


  haha

Q: how do you verify your results?


  it’s very crude at the moment

Paul Ingles, Multi-armed Bandit Optimisation in Clojure


  @pingles

problem statement


  product optimisation cycles are long, complex, and inefficient
  the multi-armed bandit model shows lots of things we’re getting
    wrong
  eg: online newspapers
    
      fundamentally human-led, editorially-led
    
  
  people behave irrationally
  Dan Ariely & Daniel Kahneman
  (@philandstuff suggestion: Stuart Sutherland, Irrationality)
  economist subscription options
    
      online $59
      print $125
      print & online $125
      the ridiculousness of option 2. makes option 3. seem more
        reasonable
    
  
  need machines to optimise at scale; but need humans to provide
    stuff only they can
  running RCTs to optimise sites
    
      doing so on a continuing basis
      measuring big effects works with small numbers of participants
      but measuring small effects requires ever larger numbers
      to the extent that you can only run ~12 experiments a year
      which is not really good enough
    
  
Bandit strategies can help


  a product for procrastinators by a procrastinator
  Product: Notflix!
    
      video website
      http://notflix.herokuapp.com/
      shows 3 different videos
      show good videos at top of page, and less good at bottom
      show best possible thumbnail for each video
    
  
  optimising with multi-armed bandits
    
      optimising order and thumbnails
    
  
multi-armed bandit problem


  slot machine = one-armed bandit
  problem: you have a bunch of money you want to “invest” in a
    casino
    
      you have a number of different machines to play
      each machine has a different probability of reward
      you don’t know what that probability is up front
    
  
  need to balance “exploration” and “exploitation”
    
      ie learning about the world vs using that knowledge to maximise
        income
      analogy: trying new foods out vs sticking to what you like
    
  
bandit model


  number of arms {1, 2, …, K }
  number of trials: 1, 2, …, T
  rewards: {0,1}
  K-headlines
    
      options of different text
    
  
  K-buttons
    
      options of button text, colour, etc
    
  
  K-pages
    
      whole page redesigns
    
  
  explore this space with notflix

bandit strategy

;; choose which arm to pull
(defn select-arm [arms]
  ...)

;; update arm with feedback
(defn pulled [arm]
  ...)
(defn reward [arm x]
  ...)

(defrecord Arm [name pulls value])
ε-greedy


  “hello world” algorithm
  generally exploit
  ε (epsilon) is the rate of exploration
  eg if ε = 0.1, your strategy is:
    
      with probability 10%, try a random arm with equal
        probability
      with probability 90%, try the best arm based on current
        knowledge
    
  
  if ε = 0, always exploit; if ε = 1, always explore
  example with bernoulli-bandit

(bernoulli-bandit {:arm1 0.1 :arm2 0.1 :arm3 0.1 :arm4 0.1 :arm5 0.9})

  with ε=0.2, you converge faster on the best arm
  but ε=0.1, you exploit it more when you find it
  once you’ve found the best arm, you should be able to double down
    
      ie explore more at the beginning (when you have least
        knowledge) and less at the end
      lots of extensions to ε-greedy to factor things like this in
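
A self-contained sketch of ε-greedy in plain Clojure, for illustration only. It reuses the `Arm` record shape from the notes, but the function signatures (an explicit `epsilon` and `java.util.Random`) are assumptions of this example and not the API of the bandit library.

```clojure
(defrecord Arm [name pulls value])

(defn best-arm
  "The arm with the highest estimated value so far."
  [arms]
  (apply max-key :value arms))

(defn select-arm
  "With probability epsilon explore (uniformly random arm),
  otherwise exploit (best arm on current knowledge)."
  [epsilon ^java.util.Random rnd arms]
  (if (< (.nextDouble rnd) epsilon)
    (nth arms (.nextInt rnd (count arms)))
    (best-arm arms)))

(defn reward
  "Record one pull of arm that paid x (0 or 1), keeping a running
  mean of the rewards as the arm's value."
  [arm x]
  (let [pulls (inc (:pulls arm))]
    (assoc arm
           :pulls pulls
           :value (+ (:value arm) (/ (- x (:value arm)) pulls)))))

;; Hypothetical state after some play.
(def arms [(->Arm :a 10 0.1) (->Arm :b 10 0.9)])

(:name (select-arm 0.0 (java.util.Random.) arms)) ; => :b (always exploit)
(:value (reward (->Arm :c 0 0.0) 1))              ; => 1.0
```

Decaying `epsilon` over time is one simple way to get the "explore early, exploit late" behaviour described above.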
    
  
Thompson sampling


  Arm model
    
      Θ_k: Arm k’s hidden true probability of reward (in range
        [0,1])
      can build a distribution for Θ_k based on current knowledge
      small number of pulls means wide distribution; large number
        means narrow distribution
      captures uncertainty in value of Θ_k
    
  
  each iteration, take a random sample from each distribution,
    take the largest sample
    
      algorithm naturally balances exploration/exploitation
        trade-off
      the more it learns, the narrower the distributions get, and so
        the more likely it is to choose an arm with a higher expected
        value
    
  
  incanter example
  Thompson-sampling example with same Bernoulli-bandit from above
    
      compared with ε-greedy, explores much more much earlier, and
        exploits much more later on
      considered optimal convergence
    
  
  we can use it to rank things (not just select)
    
      take a sample from each arm distribution, then order arms by
        that value
      in notflix, can use for ordering the videos we show
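
A sketch of Thompson sampling for 0/1 rewards in plain Clojure, assuming each arm's belief is a Beta(successes + 1, failures + 1) distribution (a uniform prior). The arm maps and function names are invented for this example and are not the bandit library's API. The Beta sampler relies on the fact that the a-th smallest of a + b - 1 uniform draws is Beta(a, b) distributed, which is simple but slow for large counts.

```clojure
(defn sample-beta
  "Draw from Beta(a, b) for positive integers a and b, as the a-th
  smallest of (a + b - 1) independent uniform draws."
  [^java.util.Random rnd a b]
  (let [n (+ a b -1)]
    (nth (sort (repeatedly n #(.nextDouble rnd))) (dec a))))

(defn sample-arm
  "One plausible reward probability for an arm, drawn from its
  current Beta distribution."
  [rnd {:keys [successes failures]}]
  (sample-beta rnd (inc successes) (inc failures)))

(defn rank-arms
  "Sample once per arm, then order arms by that sample, best first."
  [rnd arms]
  (->> arms
       (map (fn [arm] [(sample-arm rnd arm) arm]))
       (sort-by first >)
       (map second)))

(defn select-arm
  "The arm whose sample was largest this round."
  [rnd arms]
  (first (rank-arms rnd arms)))

;; Hypothetical arms: one clearly good, one clearly bad.
(def arms [{:name :good :successes 90 :failures 10}
           {:name :bad  :successes 10 :failures 90}])

(:name (select-arm (java.util.Random.) arms))
```

With few pulls the sampled values vary widely, so any arm can come out on top (exploration); with many pulls they concentrate near the observed rate (exploitation).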
    
  
applied to notflix


  video rank bandit
  for each video, a thumbnail bandit
  at the end, the best video should be at the top
    
      and each video should show the best thumbnail
    
  
results


  videos, worst to best
    
      “hero of the coconut pain”
      “100 Danes eat 1000 chillies”
      “3 year-old with a portal gun”
    
  
  thumbnail bandit data
  “we built a fictional but amazing product”

links


  [bandit/bandit-core “0.2.1-SNAPSHOT”]
  https://github.com/pingles/bandit

Q: this model assumes bandits have the same probability through time


  can it readapt?
  Thompson sampling does adapt
    
      it won’t change back as quickly
    
  
Q: isn’t there an interaction between the two bandits?


  if the thumbnail is crappy, they might not click the video
  made an assumption about this
  in general, if you leave it running over time and let the
    evidence build, it should be fine in the long run
  but that is definitely a flaw

Tommi Reiman, Schema and Swagger to improve your web APIs

super simple web api in clojure


  just using compojure
  “sausage” as example data
  PUT /foo/sausage/:id
  example:
    
      in Java: immutable value object
      in Scala: case class
      in Clojure:
        
          free-form map?
          constructor fn with bunch of validation?
          prismatic/schema!
        
      
prismatic schema


  define structure of sausage
  then call s/validate to validate
  schema can define functions

(s/defn get-sausage :- (s/maybe Sausage) [id :- Long]
  (@sausages id))

(s/defn ^:always-validate get-sausage2 :- Sausage [id :- Long]
  (@sausages id))
schema coercion

(defmodel Pizza {:id Long
                 :name String
                 :price Double
                 :hot Boolean
                 (s/optional-key :description) String
                 :toppings #{(s/enum :cheese :olives :ham :pepperoni :habanero)}})

  allows slurping JSON data, but imposing extra types
  eg above we can slurp toppings from a JSON array into a Clojure
    set rather than a vector

double schema


  loose schema for first input
    
      (def Customer {...})
    
  
  tighter schema for validated input
    
      (def ValidCustomer (merge Customer {...}))
    
  
schema selectors


  accept but remove unrecognised params with select-schema

generative schema


  generate random orders for test data
  davegolland/generative-schema.clj – demonstrates how to convert
    schemas to test.check generators

contribs


  sfx/schema-contrib
  cddr/integrity

swagger


  a specification for describing, producing, consuming, visualising RESTful web services
  https://helloreverb.com/developers/swagger
  existing adapters
  clojure options:
    
      octohipster
      swag
      ring-swagger
        
          compojure-api
          fnhouse-swagger
        
      
  endpoint definitions in JSON
  data models as a JSON Schema
  swagger UI
    
      visualises the API
    
  
  code gen
    
      no clojure support yet (anyone?)
    
  
  swagger-socket
    
      run it all on top of websockets
    
  
ring-swagger


  https://github.com/metosin/ring-swagger
  JSON-Schema has some dates
    
      but prismatic/schema will never support dates, as it’s more
        generic
    
  
  higher level abstractions on top of swagger, but nothing for the
    web developer

compojure-api


  an extendable web api lib on top of compojure
  macros & middleware with good defaults
  schema-based models & coercion
  GET* macro to define input and output schemas

fnhouse-swagger


  prismatic/fnhouse
    
      launched at clojure/west
    
  
  defnk with metadata → annotated handler
  fnhouse-swagger
    
      metosin/fnhouse-swagger
    
  
summary


  schema is an awesome tool
  describe, validate, coerce your data
  building on top of ring-swagger
    
      compojure-api → declarative web apis
      fnhouse-swagger → meta-data done right
      or do your own!
    
  
  kekkonen.io
    
      CQRS-lib
    
  
Renzo Borgatti, The Compiler, the Runtime and other interesting beasts from the clojure codebase


  http://twitter.com/reborg

an amazing growth:


  mar 2006: first commit
  oct 2006: 30k loc (7 months old)
  oct 2007: clojure announced!
  oct 2008: invited to Lisp50 to celebrate 50 years of lisp
  May 2009: 1.0 + book!
  now: almost 90k loc

initial milestones


  apr 06: lisp2java sources
  may 06: boot.clj appears
  may 06: STM first cut
  june 06: first persistent data structure
  sep 06: java2java sources
  aug 07: java2bytecode started
  right after: almost all the rest: refs, lockingtx

drew on lots of sources of knowledge


  collection of papers

high-level view:


  (def lister (fn [& args] args))
  read → analyse → emit/compile → compile
  although the lines between the stages get blurred at times

reader


  takes stream, returns data structures
  PersistentList, Symbol, etc
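The reader stage can be poked at from a REPL. A small sketch using only clojure.core; `form` and `lister-var` are names made up for this example.

```clojure
;; read-string runs the reader over a string and returns plain data.
(def form (read-string "(def lister (fn [& args] args))"))

(class form)                    ; clojure.lang.PersistentList
(class (first form))            ; clojure.lang.Symbol, the symbol def
(class (second (nth form 2)))   ; clojure.lang.PersistentVector, the [& args] vector

;; eval hands that data to the analyser and compiler; evaluating a def
;; returns the var, and a var can be called like the fn it holds.
(def lister-var (eval form))
(lister-var 1 2 3)              ; => (1 2 3)
```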

analyser


  input: data structure
  output: exprs
    
      DefExpr
        
          Var
          FnExpr
            
              Sym
              PersistentList
                
                  FnMethod
                    
                      LocalBinding(Sym(“args”)),
                      BodyExpr
                        
                          PersistentVector
                          LocalBindingExpr
                        
                      
Emission


  bytecode generation for Exprs
  prerequisite for evaluation
  emit() method in Expr interface
  Notable exception: called over ??

Evaluation


  transform Exprs into their “usable form”
  eg
    
      new object
      a var
      namespace
    
  
  FnExpr is just getCompiledClass().newInstance

Compilation


  Usually coordination for emit
  Compiler.compile namespace -> file
  …

Emit


  input: Exprs
  output: bytecode

monsters!

RT


  this is how the RT class gets initialised: the first time it gets
    referenced:

final static private Var REQUIRE = RT.var("clojure.core", "require");

  simply referring to it here causes the static initializers to run
  RT has a lot of behaviour in static initializers
    
      inside it is the doInit(); call
        
          which loads all of clojure.core
        
      
      all just from referring to RT in some otherwise unrelated class!
    
  
Compiler


  inner classes for each Expr type

LispReader


  inner classes for each token you might encounter
  <clinit>
    
      sets up reader macros
        
          macros and dispatchMacros (latter for #{ #( #_ #^ etc)
        
      
analyze()


  not a class, but a family of methods
    
      analyzeSeq
      new ConstantExpr
      MapExpr.parse
    
  
  FnExpr.parse
    
      invokes the compiling phase during parsing phase
    
  
emission


  ASM lib used to generate bytecode
  FnExpr.emitMethods()
    
      generate a method for each of the arities of the function
    
  
other beasts


  LockingTransaction and Ref

DynamicClassLoader


  clojure.lang.DynamicClassLoader.findClass(String)
    
      RT.classForName()
      Compiler$HostExpr.maybeClass()
    
  
  Class.forName() goes up the hierarchy of classloaders and asks
    each what they know
    
      an instance of DynamicClassLoader is created for each namespace
        
          and also for each form
        
      
      (this is true for the bootstrap phase; not always true eg in
        AOT (ahead-of-time) compilation)
    
  
  supporting dynamicity
    
      in defineClass:
        
          classCache.put(name, new SoftReference(c,rq));
        
      
      in findClass:
        
          Reference<Class> cr = classCache.get(name);
        
      
      SoftReferences are used to save PermGen, since if we redef a
        var we don’t want it to keep consuming PermGen
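The soft-reference cache can be modelled in a few lines of Clojure interop. This is a simplified sketch, not the code in DynamicClassLoader: `class-cache`, `ref-queue`, `cache-class!` and `cached-class` are hypothetical names.

```clojure
(import '(java.lang.ref SoftReference ReferenceQueue)
        '(java.util.concurrent ConcurrentHashMap))

;; Hypothetical names; a simplified model of the cache, not Clojure's own code.
(def ^ConcurrentHashMap class-cache (ConcurrentHashMap.))
(def ^ReferenceQueue ref-queue (ReferenceQueue.))

(defn cache-class!
  "Remember c under class-name without pinning it: a SoftReference lets the
  GC reclaim the class once nothing else refers to it and memory is tight."
  [^String class-name ^Class c]
  (.put class-cache class-name (SoftReference. c ref-queue))
  c)

(defn cached-class
  "Return the cached class, or nil if it was never cached or was collected."
  [^String class-name]
  (when-let [^SoftReference r (.get class-cache class-name)]
    (.get r)))

(cache-class! "java.lang.String" String)
(cached-class "java.lang.String")   ; java.lang.String
(cached-class "no.such.Class")      ; nil
```

A caller has to treat a nil result as a cache miss in both cases, since a collected entry and a never-cached one look the same.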
    
  
Bonus: clojure was initially implemented in lisp


  ~1600 loc to implement read, analyse, compile, eval
  although emitting Java code, not bytecode
  was also generating C♯

Q: some things in bytecode can’t be expressed in java


  is there anything which clojure generates which can’t be
    decompiled back to Java?
    
      I’m pretty sure yes, but not sure exactly what
      Rich:
        
          locals-clearing
          constructs which use goto (which exists in bytecode but not
            Java)
        
      
Rich Hickey, the insides of core.async channels

aside: here’s what clojure looks like in a good IDE


  (ie IntelliJ)
  yes, Compiler.java is massive
    
      but if your IDE has a structure editor, you can navigate them
        all easily
      it’s all in one file because I don’t want 300 files
    
  
aside2: the classloader has a cache in a branch


  fast-load branch

warning! implementation details ahead


  subject to change!
  informational only

the problems


  single channel implementation
    
      for use from both dedicated threads and go threads
        
          simultaneously, on same channel
        
      
  alt and atomicity
    
      Java CSP libraries often didn’t support alt well
      it’s tricky to do atomically
    
  
  multi-reader/multi-writer
  concurrency
    
      construct deals with the ick of threads and mutexes
    
  
  (this talk: focus on JVM impl; JS version has fewer of these
    issues)

API


  >! >!! put! alt! → channel → <! <!! take! alt!
  it’s not an RPC mechanism, it’s just a conveyor belt

SPI (service provider interface)


  >! >!! put! alt! → impl/put! [val handler] → channel →
    impl/take! [handler] → <! <!! take! alt!

anatomy


  channel has:
    
      pending puts (fifo)
      a buffer (optional) in the middle
        
          contains data
        
      
      pending takes (fifo)
      flag indicating if channel is closed
    
  
  fifos implemented as linked queues
  important to distinguish queues of operations from buffer of data

invariants


  never pending puts and takes simultaneously
  never takes and anything in buffer
  never puts and room in buffer
  take! and put! use channel mutex
  no global mutex
    
      or even multi-channel mutex
    
  
put! scenarios


  one or more waiting take! operations
    
      the put! gets paired up with the first take!, whose handler gets completed
    
  
  stuff in the buffer, but with room in buffer
    
      puts its stuff in the buffer, succeeds and immediately
        completes
    
  
  buffer full (or no buffer)
    
      enter puts queue, block
        
          results in backpressure
        
      
  full buffer, but windowed
    
      sliding buffer: latest information takes priority, drop head
        of buffer (oldest item in fifo), put! completes immediately
        and enters buffer
      dropping buffer: drop put! on floor, but completes immediately
      could have more sophisticated policies in future
    
  
take! scenarios


  nothing in buffer
    
      enqueued
    
  
  buffer has stuff, but no puts waiting
    
      get data, immediately complete
    
  
  buffer full (or no buffer), puts pending
    
      get something (either head of buffer or get paired with first
        put!)
      first waiting put! completes (either enters buffer or hands
        directly to take!)
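The put! and take! scenarios above can be sketched as a toy, single-threaded model. It is not core.async's implementation: there are no locks, no handlers, no close! and no sliding or dropping buffers, and `chan-model`, `put-model` and `take-model` are hypothetical names. Pending puts hold values, pending takes hold callbacks.

```clojure
(def empty-q clojure.lang.PersistentQueue/EMPTY)

(defn chan-model [buf-size]
  {:puts empty-q :buf empty-q :buf-size buf-size :takes empty-q})

(defn put-model
  "Returns [channel' outcome] for putting v."
  [{:keys [puts buf buf-size takes] :as ch} v]
  (cond
    ;; a take is waiting: pair up, its callback completes with v
    (seq takes) (do ((peek takes) v)
                    [(assoc ch :takes (pop takes)) :handed-to-taker])
    ;; room in the buffer: complete immediately
    (< (count buf) buf-size) [(assoc ch :buf (conj buf v)) :buffered]
    ;; buffer full (or no buffer): wait in the puts queue (backpressure)
    :else [(assoc ch :puts (conj puts v)) :parked]))

(defn take-model
  "Returns [channel' outcome] for a take with callback cb."
  [{:keys [puts buf takes] :as ch} cb]
  (cond
    ;; data in the buffer: take the head; the first pending put enters the buffer
    (seq buf) (let [buf' (pop buf)
                    ch'  (if (seq puts)
                           (assoc ch :buf (conj buf' (peek puts)) :puts (pop puts))
                           (assoc ch :buf buf'))]
                [ch' (peek buf)])
    ;; nothing buffered but a put is waiting: direct hand-off
    (seq puts) [(assoc ch :puts (pop puts)) (peek puts)]
    ;; nothing available: enqueue the take
    :else [(assoc ch :takes (conj takes cb)) :parked]))

(let [[c1 r1] (put-model (chan-model 1) :a)   ; room in the buffer
      [c2 r2] (put-model c1 :b)               ; buffer full, so the put waits
      [c3 r3] (take-model c2 identity)]       ; gets :a, and :b enters the buffer
  [r1 r2 r3 (vec (:buf c3))])
;; => [:buffered :parked :a [:b]]
```

Each branch leaves the model in a state that respects the invariants: takes only queue up when there is neither buffered data nor a pending put, and puts only queue up when the buffer has no room.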
    
  
close! scenario


  all pending takes complete with nil (closed)
  subsequent puts complete with nil (already closed) (relatively
    new)
  subsequent takes consume ordinarily until empty
    
      any pending puts complete with true
      takes then complete with nil
    
  
queue limits


  puts and takes queues are not unbounded either
  1024 pending ops limit
    
      somewhat arbitrary, might change
      will throw if exceeded
        
          if you’re seeing this, it’s an architecture smell
        
      
      most likely if you use put! on the edge of your system
    
  
alt(s!!)


  attempts more than one op
  on more than one channel
  without global mutex
  nor multi-channel locks
  exactly one op can succeed

implications


  registration of handlers is not atomic
  completion might occur before registrations are finished, or any
    time thereafter
  completion of one alternative must ‘disable’ the others
    atomically
  cleanup

handlers


  wrapper around a callback
    
      callbacks are icky, so we want to hide them
    
  
  SPI
    
      active?
      commit → callback-fn
      lock-id → unique-id
      java.util.concurrent.locks.Lock: lock, unlock
    
  
take/put handlers


  simple wrapper on callback
  lock is no-op
  lock-id is 0
  active? always true
  commit → the callback

alt handlers


  each op handler wraps its own callback, but delegates rest to
    shared “flag” handler
  flag handler has lock
    
      a boolean active? flag that starts true and makes one-time
        atomic transition
    
  
  commit transitions shared flag and returns callback
    
      must be called under lock
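A sketch of the shared-flag idea. The notes say the real flag handler is guarded by a lock; this example swaps in an atom with `compare-and-set!` purely to show the one-time transition, and `alt-handlers` plus the handler map shape are hypothetical.

```clojure
;; Sketch only: an atom stands in for the lock-guarded flag described above.
(defn alt-handlers
  "Build one handler per callback, all sharing a single active? flag."
  [callbacks]
  (let [flag (atom true)]
    (mapv (fn [cb]
            {:active? (fn [] @flag)
             ;; commit returns the callback to the winner and nil to the rest
             :commit  (fn [] (when (compare-and-set! flag true false) cb))})
          callbacks)))

(let [[h1 h2] (alt-handlers [(fn [v] [:first v]) (fn [v] [:second v])])
      won  ((:commit h2))    ; h2's channel op completes first
      lost ((:commit h1))]   ; h1 shares the flag, so it is now disabled
  [(won 42) lost ((:active? h1))])
;; => [[:second 42] nil false]
```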
    
  
alt concurrency


  no global or multi-channel locking
  but channel does multi-handler locking
    
      some ops commit both a put and a take
    
  
  lock-ids used to ensure consistent lock acquisition order
    
      (avoids deadlock)
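Ordered lock acquisition can be sketched as follows. The handler shape (a map with `:lock-id` and `:lock`) and the function names are assumptions for illustration, not core.async's types.

```clojure
(import 'java.util.concurrent.locks.ReentrantLock)

;; Hypothetical handler shape: a map with a :lock-id and a :lock.
(defn handler [id] {:lock-id id :lock (ReentrantLock.)})

(defn with-handlers-locked
  "Lock every handler in ascending :lock-id order, run f, unlock in reverse.
  Because every thread uses the same order, two threads locking the same
  pair of handlers cannot end up waiting on each other in a cycle."
  [handlers f]
  (let [ordered (sort-by :lock-id handlers)]
    (doseq [h ordered] (.lock ^ReentrantLock (:lock h)))
    (try
      (f)
      (finally
        (doseq [h (reverse ordered)] (.unlock ^ReentrantLock (:lock h)))))))

(let [put-h (handler 7) take-h (handler 3)]
  ;; take-h (id 3) is locked before put-h (id 7), whatever order they are passed in
  (with-handlers-locked [put-h take-h] (fn [] :committed-both)))
;; => :committed-both
```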
    
  
alt cleanup


  “disabled” handlers will still be in queues
  channel ops purge

SPI revisited


  handler callback only invoked on async completion
    
      only 2 scenarios
    
  
  when not “parked”, op happens immediately
    
      callback is not used
      non-nil return value is op return
    
  
  only time ops park
    
      put! when it gets blocked on full buffer
      take! when it gets blocked on empty buffer
    
  
  only time ops complete asynchronously
    
      take! with pending puts
      put! with pending takes
    
  
wiring !/!!


  blocking ops (!!)
    
      create promise
      callback delivers
      only deref promise on nil return from op
        
          non-nil indicates immediate success (and so callback never
            gets called)
        
      
  parking go ops (!)
    
      IOC state machine code is callback
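The blocking (!!) wiring can be sketched with a plain promise. Here `op` is a hypothetical stand-in for an SPI call such as impl/take!: it receives a callback and is assumed to return a derefable box on immediate completion, or nil when the operation is parked.

```clojure
(defn blocking-op [op]
  (let [p   (promise)
        ret (op (fn [v] (deliver p v)))]
    (if (some? ret)
      @ret    ; immediate success: the callback is never called
      @p)))   ; parked: block this thread until the callback delivers

;; Two hypothetical ops to exercise both paths.
(defn immediate-op [_cb] (atom :ready-now))

(defn parked-op [cb]
  ;; completes later on another thread, so it reports nil ("parked") for now
  (.start (Thread. ^Runnable (fn [] (Thread/sleep 10) (cb :later))))
  nil)

(blocking-op immediate-op)   ; => :ready-now
(blocking-op parked-op)      ; => :later, once the callback has fired
```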
    
  
summary


  you don’t need to know any of this
  but understanding the “machine” can help you make good decisions

Q: why use alt! for putting? what’s rationale?


  taking from multiple channels is like select(2)
  when you have consumers of different capabilities
    
      I want to try to write to everyone, but whenever the first one
        is ready, I give it to them
      Q: what’s the difference between that and having four consumers
        on a single channel?
        
          you might have a priority metric, or a cost metric
          though yes sometimes you can achieve same result two
            different ways
        
      
Q: why is global or multi-channel mutex not good enough?


  well it would be easy! :)
  a global mutex could make registration atomic
  you’d have to make disabling other alts atomic
  you’d have to make rendezvous atomic
  you could have two unrelated sets of channel operations, why
    should they contend?
  people hate global locks
  ruled out by my aesthetic sense :)

Q: David Nolen had an example of 10000 go blocks updating a textarea, did he hit the 1024 limit?


  no I don’t think so, but not sure exactly

Q: are buffer & queue sizes useful metrics to monitor?


  that would be great, and making them monitorable is on the TODO
    list

Q: other possible extensions?


  buffer policies
    
      you might have logic about priority
    
  
  core.async has proven its utility and it’s become important
    
      go macro is a great PoC of what you can do with a macro with
        several kLoC behind it
        
          has its own subcompiler inside it
          kind of implements a subset of clojure
        
      
      maybe build async support into the compiler?
        
          move locals from the stack to fields on the method object
          I don’t need the stack anymore
          I can be paused and resumed on another thread
          declare a fn as async
          comply with this SPI
          could build other things like generators & yield
        
      
      the pride moment of “look you can do this with a macro” is now
        dominated by the desire to make this performant and more solid
    
  
  Q: continuations? how do they differ?
    
      continuations are more general
      this won’t use continuation-passing-style
      it’s related
      it won’t be like call/cc
      it won’t be first-class
      you won’t be able to resume it more than once
      for a specific set of use-cases
      Oleg gave a talk arguing that generators alone are enough to do
        stuff that people think you need a lot more for
    
  
Q: is there something planned for dynamic binding and the go macro?


  there are fns which allow you to do the conveyance
    
      don’t know if go allows all of them to work
    
  
Q: channels on the network?


  it’s easy to have something you call a channel and put over a wire
  pretty hard to have all the semantics of these channels over the
    wire
  already have queues and all sorts of interfaces to do similar
    things
  atomic alt! over more than one wire not going to happen
  maybe semantics for ports
  or limitations on alt!
  the wire has its own semantics, this is the key thing here
    
      failure, queueing, delays
    
  
  really easy to just take something from the wire and call put!

Q: is there a typical way to monitor a go block?


  what kind of monitoring?
  see that it’s still working, still alive?
  if the channels were monitorable, you could see if things were
    producing/consuming properly

Q: what other options did you consider & reject in the design of core.async?


  something other than CSP?
  the generators stuff
  continuations
  I liked what golang did
    
      they made a good choice
      there’s a java csp lib that impls the same kinds of ops
      it’s difficult to get the semantics correct
    
  
  wanted alts! to be a regular fn, not syntax
    
      which feels like an enhancement over go
    
  
  what we’re putting on these channels is immutable
    
      which gives extra robustness
    
  
Meta-eX, conference party


  github: meta-ex
  twitter: meta_ex
  soundcloud: meta-ex
  facebook: meta.ex.live
  website: http://meta-ex.com
  wooo!

David Nolen, Invention, Innovation & ClojureScript


  @swanodette
  recently left NYT for Cognitect

“The future doesn’t have to be incremental”, Alan Kay


  talks about Xerox PARC
  worked there for a decade
  in that decade, inventions!
    
      bitmap screens
      laser printers
      GUI
      PC
      WYSIWYG & DTP
    
  
  innovating is taking inventions and bringing them to a wider
    audience

The Dream Machine, JCR Licklider and the Revolution that made personal computing possible


  M Mitchell Waldrop
  he believed human factors would play an important role
  we would all have a computer
  he helped create the future we live in today
  he helped ARPA finance PARC’s research
  he helped finance John McCarthy & Ed Fredkin

Man-Computer Symbiosis, JCR Licklider, 1960


  talks about the trie data structure
  (clojure’s persistent data structures use these!)

invention is hard


  but innovation is equally important
  Douglas Engelbart’s original mouse wasn’t very usable
    
      a tonne of work went into making it more natural, more durable
      (apple computers reference)
      this is innovation!
    
  
Purely functional data structures, Okasaki


  this book is about “paper complexity” – stuff that looks good on
    paper
  it’s a foundation which people can build variants on
  Rich did this
    
      he doesn’t get credit for inventing the bit-mapped vector trie
    
  
the state of clojurescript


  released 2011-07-20
  a lot has happened since then
  early experiments:
    
      clojurescript one
      himera (from fogus)
        
          “translations from javascript”
          showed what value clojurescript provides over javascript
        
      
has 81 contributors <3


  the reason we no longer have copy-on-write data structures is
    that someone put in the hard work to build persistent ones
  the reason we have source maps, similar

lighttable - ~11,000 lines of clojurescript

also, the world hasn’t stopped


  js hosts have improved
  persistent data structures were a basic performance win
    
      COW doesn’t scale well past (say) 100
    
  
  V8 had a lead when we introduced persistent data structures
  we hoped that others would catch up
    
      javascriptcore
      webkit is trying to get asmjs-level performance with JIT
        compilation
      nashorn has come along
    
  
demo


  mori: library for js devs
    
      here used to demo performance of persistent data structures
    
  
  comparison:
    
      adding 1000000 items to a JS Array
      adding 1000000 items to a persistent vector
      85 ms vs 235 ms (V8)
      this is really good!
    
  
  comparison:
    
      adding 1000000 items to a JS Array
      adding 1000000 items to a persistent vector (using transients)
      85 ms vs 47 ms (!) (v8)
      transients are faster than mutable arrays
      javascriptcore: 28 ms (arrays) vs 30 ms (transient vector)
    
  
  nashorn demo
    
      benchmark: react running at the command line with om
      building a template 100 times
        
          ~13 ms avg with v8
          ~8 ms avg with jsc
          ~14 ms avg with spidermonkey
        
      
      nashorn: slow load time & long warmup time
        
          starts really slow (>1s)
          converges slowly, but:
          approaches ~23 ms
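The persistent-versus-transient comparison from the demo can be reproduced in outline on the JVM. This is Clojure rather than mori on a JS engine, so the timings will not match the figures above; `build-persistent` and `build-transient` are names made up for the sketch.

```clojure
(defn build-persistent [n]
  (reduce conj [] (range n)))

(defn build-transient [n]
  ;; mutate a private transient, then freeze it back into a persistent vector
  (persistent! (reduce conj! (transient []) (range n))))

;; both routes produce the same value
(= (build-persistent 1000000) (build-transient 1000000))   ; => true

;; rough timing only; results depend on the runtime and on warm-up
(time (count (build-persistent 1000000)))
(time (count (build-transient 1000000)))
```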
        
      
now what?


  typescript, dart?
    
      these are of the opinion that we want to build the same broken
        type of stuff
      cljs: we can build things radically simpler
    
  
React


  library from facebook
  other libraries have a deep-seated notion that everything is
    mutable
    
      angular, backbone, …
    
  
  react is different: it has a functional mindset
  the virtual DOM evolves from one value to the next
    
      clojurescript allows fast diffing between these values
      react will do the right thing
    
  
  react has completely taken over the cljs world
    
      Om
      reagent
      quiescent (much thinner)
      reacl
    
  
Om


  Om was an experiment to show that representing app state as a
    single global value was a good idea
    
      this had been done before in other areas:
        
          databases
          server-side
        
      
  we’re not going to make interfaces that people haven’t seen
    before
  prismatic’s blog post about moving to Om
    
      simple components which don’t interact in crazy ways
    
  
Goya


  by Jack Schaedler
    
      ui dev for ableton
    
  
  “we can do real undo”
  Jack saw this and wondered if it would scale
  Goya: pixel editor
    
      surface: immutable vector
    
  
  gets undo without adding complexity to app
  get an almost unlimited number of undo steps, without loss of performance
  github: jackschaedler/goya
  his app is complicated!
    
      the UI is complicated
      but cljs eliminates unnecessary complexity
    
  
  how much memory does his app use? not much
    
      (aside: use google chrome dev tools!)
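A minimal model of value-based undo, assuming the drawing surface is an immutable vector as described above. This is not Goya's code: `app`, `paint!` and `undo!` are hypothetical names and the "canvas" is only four pixels.

```clojure
(def blank-surface (vec (repeat 4 :white)))
(def app (atom {:surface blank-surface :history []}))

(defn paint! [idx colour]
  (swap! app (fn [{:keys [surface history]}]
               {:surface (assoc surface idx colour)
                ;; keep the old value; structural sharing makes this cheap
                :history (conj history surface)})))

(defn undo! []
  (swap! app (fn [{:keys [history] :as state}]
               (if (seq history)
                 {:surface (peek history) :history (pop history)}
                 state))))   ; nothing left to undo

(paint! 0 :red)
(paint! 1 :blue)
(undo!)
(:surface @app)   ; => [:red :white :white :white]
```

Undo is just "put the previous value back", so the drawing code needs no inverse operations.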
    
  
innovate!


  model story needs work
    
      js MVCs backbone/ember/angular
      notion of a model on the client
        
          you can do operations on it
        
      
      nothing particularly compelling for this in the react space
      DataScript
        
          export some elements of the datomic api to the client
          store your data in a flat way
          sensible query api over it (queries on trees aren’t so fun)
          datomic allows you to ask for entities
            
              lift a tree out of the flat database
            
          
  react model can be further improved
    
      addressability
      immutable everything
        
          they have to convert styles and DOM attributes back to
            javascript objects which have to be walked
        
      
      (one benefit of react: it’s facebook’s problem 😺 )
    
  
Q: is it possible to implement Om all in clojurescript using a macro?


  I suppose it could
  you might want to compose things dynamically, and macros are static
  you have to be concerned with the amount of code that a macro generates
  I would not pursue that idea

Q: is there a community place for shared Om components?


  I’m not going to spend much time on it
  if Om needs to be improved to make this happen, I will do that
  you want to be able to use other people’s code without jumping
    through too many hoops
  things get tricky with events & communication between components
    
      need some agreement on how people communicate between components
    
  
Q: what’s your vision for cljs 1.0? how can we help with the yak shaving?


  basic things like sharing code
  shared analysis over clojure and clojurescript
    
      would open up a lot of tooling
        
          eg linter to lint both languages
        
      
      would like infrastructure for tooling to be much better
    
  
  when you go to 1.0, people lock to that version and are slow to
    move off it

Q: are you seeing much evidence of cmd-line or server-side cljs?


  most people doing it are doing node.js

Q: when is cljs going to be self-hosting?


  it’s not that we don’t want it
    
      we’re keen on self-hostability/bootstrappability, if not
        self-hosting
    
  
  nice to remove the JVM dependency
    
      eg lighttable might not want it
    
  
  it’s last-mile stuff at the moment
    
      which isn’t that fun
      and I don’t personally need it so I won’t work on it
    
  
Q: do you foresee a pure cljs version of react?


  if someone wants to shave that yak, that would be awesome
  if the system is immutable all the way down, the optimizability
    explodes

Ali Asad Lotia, Why devops needs Clojure


  @aalotia

background


  was a dev who had helped get stuff to prod
  our ops person left
  they asked me to fill in
  I said “okay, as long as you hire a replacement soon”
  they didn’t arrive
  I missed being able to write code

problem


  we’re exec’ing a jar, and it keeps taking 3s
  I saw an opportunity to write a very simple noir app
  much improved performance
  people were impressed, asked to see the source code
  “what is this clojure thing, and why did you use it?”
    
      I’m not a seasoned Java dev
    
  
  moved to another company:

Beamly


  TV focused social network
    
      smart TV planner
      personalised TV Magazine
    
  
  availability

behind the curtain


  AWS: us-west, us-east, eu-west
  milli-services (ie not quite µservices)
    
      scala
      node.js
    
  
my team


  build/release automation
    
      but we don’t do deploys; we just enable them
    
  
  persistence
  platform performance/metrics/logs
  core libs

deploys in the mutable days


  generate build artefacts
  define config in puppet
  deploy artefacts
  deploy config

phoenix servers


  base server images, with some configuration changes
  relatively short-lived
  didn’t name them or worry when they were switched off

disconnected dev and ops


  zed shaw:
    
      “maybe you use a language like lisp that pretends the computer
        is some purely functional fantasy land with padded walls for
        little babies”
      actually, yes I do
    
  
immutable servers


  kill server for every deploy
  package new server images (AMIs) in order to deploy new version

Requirement: examine server images


  aws console
    
      some config but manual and not all info
    
  
  python + boto:
    
      just got back a list of objects
      we know there’s more available! we saw it in the console!
    
  
  clojure + amazonica
    
      it just gave us data back!
      data trumps objects every time for this kind of use case
    
  
  console → cli → sdk → repl/scripts

repls are awesome


  exploration of APIs
  minimise context switching
  instant feedback
  data rich (or richer, at least, in some cases)

team reactions


  “Soooo many brackets!!!”
    
      I don’t see them anymore – paredit deals with it
    
  
  “How do I iterate over this?”
    
      why do you need to iterate? what are you trying to do?
    
  
  “I want to change this value”
    
      again, what are you trying to do?
    
  
  “Wow, this is really powerful”

Offloading state


  Immutable servers
    
      http://martinfowler.com/bliki/ImmutableServer.html
    
  
  pass the buck to a service someone else manages
    
      application data
      metrics
    
  
  but when you do autoscaling, it takes some problems away but
    gives us other problems
    
      provider defined data model
    
  
  clojure was a great fit for managing autoscaling groups
    
      all the information we needed was made visible by a single clj
        fn
    
  
  in a repl, with ad-hoc tasks, having some clojure code you’ve
    written and evalling it is really powerful

Observing platform performance


  knock-on/trickle down effects
  sensu handlers limited

riemann


  http://riemann.io
  had used graphite
    
      very data-poor
    
  
  riemann gives you a clojure map, which is a much richer model
  embedded REPL
  overridable
  extensible
  responsive primary author

tracking our services


  zion - system knowledge base
    
      who owns which service? what do I do when alert X fires?
    
  
  component details

infrastructure as data


  config
  metrics
  logs
  we had powerful ways to analyse this data without having to
    resort to glomming 500 scripts together
  we have a single language which is superlative
  I sit next to extremely good Scala devs and ask how they would
    do it
    
      “I’d write a case class”
    
  
future work


  analyse logs + metrics
  catch and correct misconfigurations
  scripts with upcoming fastload?
    
      if clojure fastload is fast enough that we don’t have to worry
        about startup time, could it replace some of our python scripts?
    
  
  cyanite to replace graphite
    
      cassandra/clojure
    
  
  “lisp isn’t a language, it’s a building material”
    
      Alan Kay
    
  
clojure summary


  pros
    
      core data structures
      data manipulation
      community
        
          #clojure and #ldnclj on freenode
          people accept PRs, give real feedback
          projects move
        
      
      shared aesthetic
    
  
refs


  martin fowler posts above
  mcohen01/amazonica
  pyr/cyanite

Q: can you share more info about zion?


  we will when it’s in decent shape
  too coupled to our particular environment right now

Q: graphite data poor? can you elaborate, particularly with reference to storage backend?


  how data is stored is poor
  all you get is an arbitrarily long key name (hierarchical)
    
      a timestamp
      a single numerical value
    
  
  with Riemann, you can add arbitrary tags to the events
    
      persisting them – don’t have a great answer
      looking at influxdb
      store time-series data with a richer data model
    
  
Leonardo Borges, Taming Asynchronous Workflows with Functional Reactive Programming


  who has used Reactive extensions?
    
      do you think it’s FRP?
    
  
  currently writing “Clojure Reactive Programming: RAW”
  when people talk about FRP, they mean merely “inspired by FRP”

Naming is hard


  should really be talking about “Compositional Event Systems”
  Conal Elliott tweet
  http://bit.ly/rx-commit
  http://bit.ly/reactive-cocoa-commit

what’s the difference?


  every construct in FRP has a precise mathematical definition
  free of side-effects
    
      kind of like Haskell’s IO monad
    
  
history


  1997: created in haskell
  other haskell libs
    
      reactive-banana, netwire, sodium
    
  
  FRP-inspired:
    
      Rx[.NET/Java/JS], baconjs, reagi (cljs)
    
  
  main abstractions: Behaviours and Events
    
      traditionally:
    
  
type Behavior a = [Time] -> [a]
type Event a    = [Time] -> [Maybe a]

  this talk: compositional event systems

motivating example


  imperative code to iterate over a list
    
      lots of changing state
    
  
  functional code
    
      we describe what, but not how
      no mutating variables
      gain reusable single-purpose functions
    
  
  CES has similar principles
  think of key presses as a list of keys over time
  http://bit.ly/rxjava-github
  http://bit.ly/rxjs-github
  subscribe to event sources, filter/transform them
  map behaviour to event streams
    
      say, by sampling every second
    
  
  flatMap / selectMany

network IO


  rather than events from keyboard, mouse etc
  in javascript: callback hell :(
  on jvm: clojure promises don’t compose
  promises in js are slightly better but have limitations

demo: simple polling app


  partition/zip

quote


  “FRP is about handling time-varying values like they were regular
    values”

why not core.async?


  core.async feels like it’s a lower level of abstraction
  it’s a great foundation for an FRP-inspired framework
  reagi is built on top of core.async ( http://bit.ly/reagi )

bonus example: reactive API to AWS


  retrieve list of resources from a stack
  for each EC2 instance, get status
  same for each RDS instance

Q: when do you jump from handling things manually to observables?


  my rule of thumb is if I need anything more than a single
    callback, I’ll use this (or core.async)

Q: have you used RxJava from clj? How nice is it?


  works great, so does RxJS

Stuart Sierra, Components: Just enough structure

architecture


  software architecture is very simple(!)
    
      presentation
      business logic
      DB
    
  
  actually, much more complex
    
      config
      connections to external resources
        
          monitoring
          queues
          sessions/connections in pools
        
      
      process state
        
          thread pools
          caches
          schedulers
        
      
Java: structure built in

clojure: not much structure


  clojure namespaces aren’t classes
    
      they’re not instantiable
    
  
  def creates a singleton
  (def foo (atom ..)) creates global mutable state
  bootstrapping

(defn start-all! []
  (database/connect!)
  (create-queues!)
  (start-thread-pool!)
  ....
  (start-web-server!))
component


  immutable data structure (map or record)
  public api
  management lifecycle
  relationships to other components
  It’s an object (shh!)
    
      not using it to represent data
    
  
State wrapper component


  (defrecord DB [host conn] ....)
  opaque to most consumers (by convention)

Public API


  fns take component as an argument

Lifecycle: Constructor


  set up initial state
  no side effects

Lifecycle: Transitions


  side effects happen here:

(defprotocol Lifecycle
  (start [component])
  (stop [component]))

  start and stop return an updated version of the component
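  a sketch of a state wrapper component whose start and stop return updated copies; the protocol is redefined locally so the example stands alone, and the "connection" is a plain map standing in for a real driver (both are assumptions, not the library's code):

```clojure
;; Local stand-in for the library's protocol, so the sketch is self-contained.
(defprotocol Lifecycle
  (start [component])
  (stop [component]))

;; Hypothetical DB component; :conn holds a fake connection.
(defrecord DB [host conn]
  Lifecycle
  (start [this]
    (if conn
      this                                       ; already started
      (assoc this :conn {:connected-to host})))  ; side effects would go here
  (stop [this]
    (assoc this :conn nil)))                     ; assoc keeps it a record

(defn new-db
  "Constructor: sets up initial state, no side effects."
  [host]
  (map->DB {:host host}))

(def started (start (new-db "localhost")))
(:conn started)        ;; => {:connected-to "localhost"}
(:conn (stop started)) ;; => nil
```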

Service provider component


  (defrecord Email [endpoint api-key ...] ...)

Domain model?


  traditionally intermingle data and behaviour:

public class Customer {
    private String name;
    private Address address;
    public void notify() {...}
    //...
}

  Let data be data
  just use a map

domain model component


  represent aggregate operations
  (defrecord Customers [db email])
  db and email are other components, used by the Customers component
  Customers interacts with them entirely through their public APIs
  to construct a Customers instance, need to get its dependencies
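  a sketch of how this might look, with customer data as a plain map and the Customers component holding only its dependencies; StubEmail, send-email and notify-customer are hypothetical names for illustration:

```clojure
;; Hypothetical stub email component: records messages instead of sending.
(defrecord StubEmail [sent])

(defn send-email
  "Public API of the email component: takes the component as an argument."
  [email to body]
  (swap! (:sent email) conj {:to to :body body}))

;; Domain model component: holds dependencies, not customer data.
(defrecord Customers [db email])

(defn notify-customer
  "Aggregate operation; the customer itself is just a map."
  [customers customer message]
  (send-email (:email customers) (:email-address customer) message))

(def email (->StubEmail (atom [])))
(def customers (->Customers nil email))   ; db left nil: unused in this sketch

(notify-customer customers
                 {:name "Ada" :email-address "ada@example.com"}
                 "hello")
@(:sent email) ;; => [{:to "ada@example.com", :body "hello"}]
```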

system map


  takes created but unstarted components:

(defn system [...]
  (component/system-map
   :customers (customers)
   :db (db ...)
   :email (email...)))

  to start the system, understands dependencies and works out
    correct dependency order to start each component
  then wires each component up to the correct (started) dependency
  stopping the system is similar but in reverse dependency order
  Before start, dependencies not filled in yet (just nil)
  after start, fill in dependencies
  the system is just a map
    
      so if I want to inject a test stub, I can just assoc it in:
    
  
(defn test-system [...]
  (assoc (system ...)
    :email (stub-email)
    :db    (stub-db)))

  works as long as I do it before starting any services
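  the dependency ordering described above can be sketched in a few lines; this is an illustrative ordering over a hand-written deps map (a hypothetical shape), not the algorithm the component library actually uses:

```clojure
(defn start-order
  "Returns component keys ordered so that each key follows its dependencies.
  deps maps each key to a collection of the keys it depends on."
  [deps]
  (loop [order [] remaining (set (keys deps))]
    (if (empty? remaining)
      order
      (let [started (set order)
            ;; a component is ready once all of its dependencies are started
            ready   (sort (filter #(every? started (deps %)) remaining))]
        (when (empty? ready)
          (throw (ex-info "Dependency cycle or missing component"
                          {:remaining remaining})))
        (recur (into order ready) (apply disj remaining ready))))))

(def deps {:customers [:db :email] :db [] :email []})

(start-order deps)           ;; => [:db :email :customers]
(reverse (start-order deps)) ;; => (:customers :email :db), the stop order
```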

DB for testing


  fixtures to inject into database
  mocking the db is too hard unless you use datomic 😏

var substitution & asynchrony


  with-redefs and binding are delimited in time
    
      problems if you dispatch to another thread
      potential race conditions
      tightly coupled to implementation
      wrong level of granularity
    
  
Entry point: main


  exactly one mutable global for the whole system
  (def sys (atom nil))
    
      use reset! not swap! because start and stop are
        side-effecting and swap! might call multiple times
      (@samaaron ed: uses agents for this sort of thing)
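  a small sketch of the reset! advice; start-system and start-count are hypothetical, with the counter standing in for real side effects such as opening connections:

```clojure
(def start-count (atom 0))   ; counts side effects, for illustration only

(defn start-system
  "Hypothetical side-effecting start: pretend it opens connections."
  [system]
  (swap! start-count inc)
  (assoc system :started? true))

(def sys (atom nil))

(defn go! []
  ;; reset! stores a value computed exactly once; a function handed to
  ;; swap! may be retried under contention, repeating its side effects.
  (reset! sys (start-system {:db nil :email nil})))

(go!)
@start-count      ;; => 1
(:started? @sys)  ;; => true
```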
    
  
Web app: static routes


  defroutes considered harmful

renaming dependencies


  you can merge systems
  name common components with shared keys:

{:a/web-app ..
 :a/server ...
 :db ...
 :email ...}

{:b/web-app ..
 :b/server ...
 :db ...
 :email ...}

(merge system-a system-b)
core.async


  components take channels as state
  decouples components from one another
  system creation can create the channels you want and wire them up

summary


  advantages
    
      once you’re used to the patterns of clear dependencies and
        boundaries, you maybe don’t even need the library anymore
      isolation, decoupling
      testing, refactoring
      automatic ordering of start/stop
      easy to swap in alternate implementations
      everything at most one map lookup away
    
  
  disadvantages
    
      requires whole-app buy-in
        
          won’t get a lot of the benefits without this
          porting an existing system can be tedious
        
      
      system map is too big to inspect visually
      cannot start/stop only part of a system
        
          may try to fix someday but don’t really understand how yet
        
      
  possible future
    
      “init” acquires resources but doesn’t start?
      “close”/“stop” separation – close acquired resources and
        discard dependencies so they can be GC’d
      (the “stop” method doesn’t dissoc anything)
        
          dissoc stops a record being a record
          you might want to use that state again
        
      
      handle mutable containers for systems
        
          currently, library code doesn’t care – you can use an atom
            or a var or whatever
          allow individual components to start, stop, or change at
            runtime
          deref container and get “current” component with latest deps
          catch errors, mark component as “failed”
            
              this is the tricky part
            
          
Philip Potter, Generative testing with clojure.test.check


  sorry, I was too busy to take notes for this for some reason
  preliminary notes:
    
      davegolland/generative-schema.clj – demonstrates how to convert
        schemas to test.check generators
      (post-hoc addition: herbert is like schema and comes with
        test.check integration; thanks to @minimal for pointing this
        out)
    
  
  slides: http://www.philandstuff.com/slides/2014/euroclojure.html

Chris Ford, the hitchhiker’s guide to the Curry-Howard correspondence


  number of papers published today by the foremost expert on the
    Curry-Howard correspondence…
    
      1
    
  
  Don’t panic!
  Gödel’s incompleteness theorem

introduction


  a → a
    
      this is a proposition in logic
      but it’s also a type
        
          the type of the identity function
        
      
our heroes


  Haskell Curry
    
      1958: textbook on combinatory logic
        
          didn’t necessarily understand how revolutionary this idea was
        
      
  William A. Howard (1969)
    
      not only does a type correspond to a proposition, but:
        
          a function with a type corresponds to a proof of a
            proposition
        
      
      “The formulae-as-types notion of construction” - finally
        published in 1980
    
  
modus ponens


  (a → b) → a → b
    
      modus ponens
      type of apply (haskell or idris):
    
  
apply :: (a -> b) -> a -> b
apply f x = f x

  the implementation here corresponds to a proof of modus ponens
  apply works with any types a and b
  modus ponens works with any propositions a and b
  view the type “Integer” as the proposition that integers exist
    
      any example – say, 65 – counts as a proof of this proposition
    
  
  here, we use (3==) to prove that Integer -> Bool is populated

(3==) :: Integer -> Bool

apply (3==) 4
False: Bool
composition


  (a → b) → (b → c) → (a → c)
    
      type of function composition
    
  
  length : List a -> Integer
  (3==) : Integer -> Bool
    
      comp length (3==) : List a -> Bool
      if we accept that List a exists, we now prove that Bool
        exists
    
  
axioms?


  a → a
  (a → b) → a → b
  a → b → (a,b)
    
      if I can build an a, and I can build a b, then I can build
        an (a,b)
    
  
  (a,b) → b

bottom type


  a → b
  (a → b) → (b → a)
    
      neither of these is true in general
      the bottom type: ⊥ is guaranteed to have nothing in it
        
          represents falsity in the Curry-Howard correspondence
          represents something it’s impossible to prove, because it’s
            not true
        
      
AnythingGoes : Type
AnythingGoes = (a : Type) -> a

cantProveItAll : AnythingGoes -> _|_
cantProveItAll f = f _|_

  cantProveItAll shows that AnythingGoes is uninhabited (because if
    it weren’t, it would imply that ⊥ was inhabited)

harmless(?)


  types prove our program correct?
  types only get us so far
    
      can still get runtime errors if the types check out
    
  
mostly harmless.


  are types defective?
  haskell will crash at runtime despite an advanced type system
    
      head [] isn’t defined
    
  
enter Idris


  Edwin Brady, creator of Idris
    
      (and whitespace)
    
  
  killer feature of Idris:
    
      allows you to make condescending remarks about the Haskell type
        system
      although it’s really a dialect of Haskell
    
  
  example:

data Nat = Z | S Nat

data List a = Nil | (::) a (List a)

data Vect : Nat -> Type -> Type where
  Nil  : Vect Z a
  (::) : (x : a) ->
         (xs : Vect n a) ->
         Vect (S n) a

  in Haskell, types can be parameterized on other types
  in Idris, they can be parameterized on values as well as
    types
    
      Vect 2 Integer is the type of vectors which contain exactly 2
        Integers
        
          or rather, Vect (S (S Z)) Integer
        
      
head : Vect (S n) a -> a
head (x::_) = x

head []
Can't unify Vect 0 a
with Vect (S n) iType

  trying to take the head of an empty vector is a compile-time
    error

concatenation


  signature: Vect m a -> Vect n a -> Vect (m+n) a
  sort: Ord a => Vect m a -> Vect m a
    
      would have caught Phil’s my-sort which dropped duplicates(!)
      didn’t manage to get this implemented in the lunch break
        
          it’s not theorems for free 😉
        
      
even number family of types

data Even : Nat -> Type where
  Zero : Even Z
  Next : Even n -> Even (S (S n))

Zero : Even Z
Next (Next (Next Zero)) : Even 6

  can now show that even numbers sum to even numbers:

add : Even m -> Even n -> Even (m + n)
add Zero y = y
add (Next x) y = Next (add x y)

  Although we’ve really proved that:
    
      there exists an operation which takes Even n and Even m and
        returns Even (n+m)
      we chose add but could have chosen any other name
    
  
  can prove that 42 is even
  Even 3 is a valid type
    
      Even 3 -> _|_
      proof in slides
    
  
the unit type


  () represents truth
    
      you don’t need anything else to prove this
      you can construct it without context
      (you could use Even 42 to represent truth too)
      so LifeTheUniverseAndEverything -> Even 42 :)
    
  
References


  http://idris-lang.org
  http://brianmckenna.org
  http://wadler.blogspot.co.uk

Q: is there a way to specify that sort’s return value is sorted?


  @bodil thinks it’s true :)

Q: is the type-checker guaranteed to terminate?


  it’s equivalent to the halting problem

Anna Pawlicka, Reactive data visualisations with Om


  slides: http://www.slideshare.net/annapawlicka/reactive-data-visualisations-with-om
  code: https://github.com/apawlicka/om-data-vis
  @AnnaPawlicka
  Data Engineer, Mastodon C
  “Big Data” startup
  (or Big “Data startup”?)
  “I work for Bruce”
  it’s been fantastic learning Clojure over the last years

Technologies

D3 (data-driven documents)


  to visualise data
    
      table of numbers, bar chart, whatever
    
  
  data bound to DOM
  interactive - transformations driven by data
  huge community
    
      huge number of plugins and extensions
    
  
  Higher level libs available
    
      hide the complexity of d3
      but if you need to tweak the underlying d3 it’s still
        available
    
  
leaflet.js


  layer on top of d3
  mapping data
    
      tile layers, vector layers
    
  
  user interaction

dimple.js


  charting library on d3
  bar charts

react


  (interface components)
  solves one problem: complex UI rendering
  just the V of MVC
    
      say no to “two-way data binding”
    
  
  re-renders the entire UI
    
      sounds like a bad idea
      actually quite performant, due to:
    
  
  virtual DOM
    
      diffs between previous and next renders of a UI
    
  
  less code
  shorter update times

react lifecycle


  IInitState →
  IWillMount →
  IShouldUpdate →
    
      IRenderState
      IRender
    
  
  Om handles most of these for us (particularly IShouldUpdate)

Om


  entire state of the UI in single piece of data
  immutable data structures = reference equality check
    
      shouldComponentUpdate() can be overridden to take advantage
        of this
    
  
  snapshottable, free undo
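  a sketch of why immutable data makes the "did anything change?" check cheap; app-state and its keys are hypothetical, and this shows only the underlying Clojure behaviour, not Om's API:

```clojure
(def app-state
  {:chart {:series [1 2 3]}
   :form  {:username "anna"}})

;; "Updating" builds a new map that shares the untouched :chart branch.
(def next-state (assoc-in app-state [:form :username] "bruce"))

;; A reference check is enough to see which parts need re-rendering.
(identical? (:chart app-state) (:chart next-state)) ;; => true, chart unchanged
(identical? (:form app-state) (:form next-state))   ;; => false, form changed

;; Undo falls out of keeping old snapshots around.
(def history [app-state next-state])
(= app-state (first history)) ;; => true
```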

Liberator & core.async


  component interaction
  liberator: RESTful apis via defresource macro
  core.async
    
      js developers would freak out looking at it
      get blocking calls without browser freezing
    
  
data sources


  for example, local sensors
  may want to perform some sql queries to see patterns in your data
  may wish others to retrieve data through API (if they don’t like
    our chart)

Chart & API demo


  user interacts, triggers API calls to fetch data, updates graph
    in real time
  chart implementation
    
      om/IInitState to construct
      om/IRenderState
        
          to update
        
      
  device-form
    
      om/IWillMount to read shared info to find API endpoint
      om/IRenderState
    
  
  form-row
  chart-figure

last.fm chart


  chart based on last.fm playlist
  input box takes username, calls last.fm to find that user’s
    playlist
    
      chart then shows bands from most popular downwards
    
  
interactive maps


  input box for postcode lookup
    
      uses google geocoding api to get coords
    
  
  can click map to create marker & display coordinates
  app-model stores map location and coordinates panel contents
  nice use of core.async sliding-buffer 😎
  (go (while true ...)) could be (go-loop ...) ? dunno

summary


  fast rendering and interactivity are yours with js + cljs + om
  immutability = efficiency
  sane application structure
  (philandstuff ed: this presentation is very visual, just watch the video!)

algernon, The Face of Inspiration, or how Clojure helps bring Lisp to Python


  github: algernon
  twitter: algernoone
  sorry, I was in the hallway track for this 😦

Malcolm Sparks, Assembling secure clojure applications from re-usable parts


  @malcolmsparks, juxt

warning! research, evolving ideas, alpha quality

juxt/modular


  set of components compatible with stuartsierra/component
  http-kit, bidi router, mustache templating

juxt/cylon


  security components
    
      login form
      session
      user domain
      hashing
      authn and authz
    
  
assertion


  libraries are great
  systems are complex

assertion 2


  a meta-architecture, that can scale to hundreds of diverse
    projects, is useful

architecture


  don’t want to just port Spring MVC to clojure
  components, dependencies, protocols

components


  reusable bits

dependencies


  wiring of components together
  since the system is in a var, I can do a tree-walk on the
    system, and show it
  can visualise it with dagre and svg rendering, and react
  slide deck which shows its own wiring
    
      I’m So Meta, Even This Acronym
    
  
protocols


  integration surface
  necessary for component interchangeability
  example: bayonet light-bulb fitting
    
      can plug a light bulb into it
      light bulb dies – replace it!
      over time, can replace entire system by replacing parts
    
  
  hidden couplings
    
      copied code
      database schemas and sql queries
      URI formation & URI dispatch
      have to change things in multiple places to effect change
    
  
  juxt/bidi
    
      dispatch and forge URIs from the same route data
    
  
component example


  constructor
    
      defaults with merge
      schema/validate
    
  
  components are units of cohesion
    
      implements multiple protocols:
        
          component/Lifecycle
          WebService
          JavaScripts
          TemplateData
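  the constructor conventions noted above (defaults with merge, then validation) might look roughly like this; new-web-server and default-options are hypothetical, and a plain assert stands in for schema/validate so the sketch needs no extra library:

```clojure
(def default-options {:host "localhost" :port 8080})

(defn new-web-server
  "Hypothetical constructor: merges caller options over defaults, then
  validates the result before returning it."
  [options]
  (let [merged (merge default-options options)]
    (assert (and (integer? (:port merged)) (pos? (:port merged)))
            "port must be a positive integer")
    merged))

(new-web-server {:port 3000}) ;; => {:host "localhost", :port 3000}
```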
        
      
intermission


  maze creation in cljs
    
      “drunken walk” algorithm
        
          the first time you visit a space, you break down the wall
        
      
  core.async visualisation and demo of map<

the index pattern

the interceptor pattern


  a component is wired in between two components

shared dependency pattern

security

challenge


  don’t want to re-implement security components
  tried-and-tested security by default
  flexibility of ‘roll-your-own’

example: website


  router routing between sub-websites A and B
  add a login form to the router, which uses:
    
      user-domain
        
          password-algo (eg scrypt)
          user-store (eg cassandra)
        
      
      session-store
      all comes from cylon
    
  
  add authorization component to website B (again from cylon),
    using:
    
      authenticator
      session-store (same dep as above)
    
  
summary


  http://modularity.org
  google group: modularity

demo


  lein new modular myapp
  lein new modular myapp +cljs
  lein new modular myapp +cljs +security
  lein new modular myapp +cljs +security +devtools
  (dev) fn
    
      if your code doesn’t compile on your repl, then you just get
        loads of stack traces
    
  
  secure content, rather than URI routes
    
      there may be multiple routes to the content
      restrict-handler to wrap a response in a RestrictedHandler
        which implements IFn to look like a fn and make it invokable
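  a sketch of a record that implements IFn so it can be invoked like a handler fn; this is a hypothetical illustration for JVM Clojure, not cylon's actual implementation, and the authorized? predicate is an assumption:

```clojure
;; A record that wraps a handler and is itself callable.
(defrecord RestrictedHandler [handler authorized?]
  clojure.lang.IFn
  (invoke [_ request]
    (if (authorized? request)
      (handler request)
      {:status 401 :body "Unauthorized"})))

(defn restrict-handler
  "Wraps handler so it only runs for requests that pass authorized?."
  [handler authorized?]
  (->RestrictedHandler handler authorized?))

(def secret-page
  (restrict-handler (fn [_] {:status 200 :body "secret"})
                    (fn [request] (some? (:user request)))))

(:status (secret-page {:user "malcolm"})) ;; => 200
(:status (secret-page {}))                ;; => 401
```

  since the restriction travels with the handler value, it applies whichever route leads to the content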
    
  
Q: what about hypermedia to decouple URI dispatch & formation?


  we don’t have HATEOAS because it’s quite hard
  want to make it easier

that’s all folks!


  thanks for reading :)