Euroclojure 2014

EuroClojure 2014, Krakow

Fergal Byrne, Clortex: Machine Intelligence based on Jeff Hawkins’ HTM Theory

  • @fergbyrne
  • HTM = Hierarchical Temporal Memory
  • Slides

big data

  • big data is like teenage sex
    • no one knows how to do it
    • everyone thinks everyone else is doing it
    • so everyone claims to be doing it
    • (Dan Ariely)

machine learning is important

  • people don’t trust other people
    • they have their own agendas
  • so they place too much trust in machines

asimov’s take

  • we gain knowledge faster than we gain wisdom
    • applies to human knowledge
    • applies to data: gathering data is easy, drawing conclusions is not

a problem in neuroscience

  • rate of papers published is growing exponentially
  • 2013: 1 every 32 minutes
  • 2014 so far: 1 every 17 minutes

can AI learn from neuroscience?

Jeff Hawkins’ goals in HTM

  • Study the neocortex and establish its principles
  • open sourced NuPIC in 2013

neocortex

  • the wrinkly part at the surface of the brain
    • grey matter: processing
    • white matter: wiring
  • about 2mm thick, 10cm^2 in area
  • 30-50MM neurons
  • 1G connections
  • hierarchical
  • uniform
    • ie all looks physically the same
    • all regions have the same algorithm

6 key principles

on-line learning from streaming data

  • up to 10 million senses feed the brain
  • we don’t (can’t) store this data
  • we build models from live data
  • models constantly updated

hierarchy of regions

  • sensory data enters at the bottom
  • models are built in every region
  • things change more slowly as you go up
  • hierarchy enables sequences of sequences
    • seq of waves
    • seq of phonemes
    • seq of words
    • seq of sentences
  • hierarchy works upwards and downwards

sequence memory

  • all sensory data involves time
  • sequence memory allows predictions
  • structure in data elaborated over time
  • sequences can be c

sparse distributed representations

  • in each region, many neurons, few active
  • SDRs represent spatial patterns
  • fault-tolerant, semantic ops, high-capacity
  • key to understanding & building intelligent systems
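
The "semantic ops" and fault tolerance mentioned above can be sketched with plain Clojure sets. This is only a toy model, assuming an SDR is represented as the set of indices of its active bits; the names `sdr-cat`, `sdr-dog`, `sdr-car`, `overlap` and `noisy` are hypothetical and are not taken from Clortex or NuPIC.

```clojure
(require '[clojure.set :as set])

;; Each SDR is the set of indices of its active bits (toy values,
;; imagined as a handful of active bits out of a few thousand).
(def sdr-cat #{3 17 42 256 901 1500})
(def sdr-dog #{3 17 42 300 901 1999})
(def sdr-car #{5 64 128 512 1024 2000})

(defn overlap
  "Number of active bits two SDRs share; more shared bits means more similar."
  [a b]
  (count (set/intersection a b)))

(defn noisy
  "Simulate faulty cells by dropping the n lowest-numbered active bits."
  [sdr n]
  (set (drop n (sort sdr))))

(overlap sdr-cat sdr-dog)            ; 4
(overlap sdr-cat sdr-car)            ; 0
(overlap sdr-cat (noisy sdr-cat 2))  ; 4
```

Even after losing a third of its active bits, the damaged pattern still overlaps strongly with the original, which is the intuition behind the fault-tolerance claim.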

all regions are both sensory and motor

  • behaviour provides context for sensory data
  • structure in model navigated via behaviour

attention

  • use attention to manage the neocortex
  • planning and previsualisation
  • whole subhierarchies can be switched on and off

layers of neocortex

  • from molecular upwards
  • around 5 or 6

neurons

  • distal dendrites detect coincidence of incoming activity from neighbouring cells
  • you don’t just see what you’re seeing now, you predict what you’re going to see next
  • (reality is much more complicated, but this algorithm is sufficient to explain a lot)

clortex

background: numenta’s nupic

  • in dev since 2005
  • partially implements HTM/CLA
  • python/c++
  • open source

strengths

  • skilled dev team
  • eat their own dog food (grok uses nupic)
  • operates on subset of HTM/CLA principles
  • tunable using swarming on your data
  • works well on streaming scalar data (eg machine-generated)
  • great community – http://numenta.org

limitations

  • codebase has evolved as theory has developed
  • difficult/scary to rewrite for flexibility
  • OO with large, coupled, classes (~1500 LoC per class)
  • need to swarm to find parameters, no real-time control
  • not easy to extend beyond streaming scalar use case

clortex requirements

  • directly analogous to HTM/CLA theory
  • transparently understandable source code
    • a neuroscientist should be able to read & review code
  • directly observable data
  • sufficiently performant
  • useful metrics
  • appropriate platform
    • portability
    • scalability

architectural simplicity

  • first role: be useful!
  • best software is that which is not needed at all
  • human comprehension is king
    • if people can’t understand your code, your code is not finished
    • unit tests are not sufficient in themselves
  • machine sympathy is queen
  • software is a process of R&D
  • software development is challenging & intellectual
    • more science than engineering
      • engineering: you have a good model already, you just have to plug in the particular parameters
      • science: there are a bunch of unknowns which you have to learn & understand

#1: Just use data!

  • maps, vectors, sets
  • all done in a one-page datomic schema

#2: Clojure & its ecosystem

  • clojure data not domain objects

#3: russ miles’ life preserver

  • everything either “core” or “integration”
  • core: a datomic database for the neocortex
  • core: each “patch” of neurons is a graph
  • integration: algorithms, encoders, classifiers, SDRs

key clj libs & tools

  • datomic (+adi)
  • quil/processing
  • incanter
  • lein-midje-doc for literate documentation
  • hoplon-reveal-js for presentations
  • lighttable

review

  • Big Data isn’t just a Machine Intelligence problem
  • HTM is exciting

Logan Campbell, Clojure at a Post Office

history:

  • was at clojure user group
  • a guy turns up and says he’s hiring a team of clojure developers
  • he was at Australia Post
    • a million lines of Java worked on by a team in India
    • wanted to bring it back in-house

project: digital mailbox

  • big companies spend a lot of money sending out bills & junk mail
  • product to seamlessly replace that workflow
  • switch from physical mail to cheaper model
  • consumer can sign up to receive water bill online
  • I was brought on as the “clojure expert”
    • (I’d been playing with it for a couple of years)
  • drama:
    • the people they could hire:
      • really experienced java devs
      • keen on FP
    • they said as they were hiring “you might be doing clojure or you might be doing scala”
    • first few people were scala fans
    • scala v clojure battles
      • “we need static typing”
      • “we need OO for domain modelling”
      • “clojure is slow” (?)
      • “what framework do you use?”
  • “we need static typing? okay, we’ll use core.typed”
  • domain modelling:
    • when people are used to domain modelling in OO, telling them to just use maps feels like a cop-out
    • records + protocols kind of feel like classes
    • it wasn’t until I showed them code I’d written and compared it with their code that they realized that you can just use maps
  • online scala course
    • we did it as a team
    • I also did the exercises in clojure
    • did one exercise three different ways in clojure
      • conditional
      • match
      • stream processing
    • showed them my solutions
      • they already understood the problems because they’d solved them themselves
  • clojure performance was a surprise, because I’d come from ruby (!)
    • clojure is fast
    • there was an underlying feeling that “we need scala for performance”
  • I’m a consultant, so was happy for the team to make the language decisions
    • “if you’re keen on scala, let’s find out a way to pitch it to management”
  • web stack: kept hearing “async async async”
    • felt like premature optimization
    • but still we used http-kit
      • benchmark started to allay fears that clojure was slow

feature: make a payment on a bill

  • not necessarily a full payment
    POST /bills/:bill-id/payments
    Session: user-id
    Post Data: amount
  • GET credit card token for user
    • POST request to payment gateway
  • GET how much left to be paid
  • if payment succeeds: display amount remaining
  • if payment fails: display error

candidate solutions

  • synchronous promises
  • promise monad
  • lamina
  • etc etc

solution 0: synchronous

  • http-kit’s requests return a promise
    • just @deref the promise (blocks the thread)
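
A minimal sketch of the synchronous shape, assuming the three remote calls from the feature description. The functions `fetch-card-token`, `charge-card` and `fetch-balance` are hypothetical stubs that return already-delivered promises, so the example runs without http-kit or a network.

```clojure
(defn delivered
  "A promise that already holds value, standing in for a finished HTTP request."
  [value]
  (doto (promise) (deliver value)))

;; Hypothetical stand-ins for the three remote calls.
(defn fetch-card-token [user-id] (delivered {:token (str "tok-" user-id)}))
(defn charge-card [token amount] (delivered {:status :ok :charged amount}))
(defn fetch-balance [bill-id]    (delivered {:bill-id bill-id :remaining 30}))

(defn pay-bill-sync
  "Each @ blocks the calling thread until that step has a result."
  [user-id bill-id amount]
  (let [{:keys [token]} @(fetch-card-token user-id)
        result          @(charge-card token amount)]
    (if (= :ok (:status result))
      {:remaining (:remaining @(fetch-balance bill-id))}
      {:error "payment failed"})))

(pay-bill-sync 7 42 20) ; {:remaining 30}
```

The code reads top to bottom like ordinary blocking code, which is why this option stayed on the shortlist despite tying up a thread per request.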

solution 1.1: promise monad

  • do is aware of promises
    • doesn’t block thread, but waits for promise to be executed before continuing
    • felt natural way to write with promises
    • but incorrect: too much waiting, no concurrency

solution 1.2: promise monad let/do

  • let to define promises
    • do to pseudo-block on them
    • introduces correctness but reduces readability

solution 1.3: let/do/do

  • okay, let’s step away from monads

solution 2: (?)

solution 3: raw promises

  • when to explicitly wait for a particular promise
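
One way to picture "explicitly waiting" is to start every independent request first and dereference each one only where its value is needed. The sketch below assumes, purely for illustration, that the card-token lookup and a bill lookup are independent; `slow-call` is a hypothetical helper that fakes a remote call with a `future` and a sleep.

```clojure
(defn slow-call
  "Hypothetical remote call: a future that yields value after ms milliseconds."
  [^long ms value]
  (future (Thread/sleep ms) value))

(defn pay-bill-concurrent [user-id bill-id amount]
  (let [token-p (slow-call 100 {:token (str "tok-" user-id)}) ; starts now
        bill-p  (slow-call 100 {:bill-id bill-id :owed 50})   ; starts now, in parallel
        token   (:token @token-p)                             ; wait only when needed
        paid    @(slow-call 100 {:status :ok :token token :charged amount})]
    (if (= :ok (:status paid))
      {:remaining (- (:owed @bill-p) amount)}
      {:error "payment failed"})))

(pay-bill-concurrent 7 42 20) ; {:remaining 30}
```

Because both lookups are already running before the first `@`, the total wait is roughly two call durations rather than three. In a script, `(shutdown-agents)` at the end lets the JVM exit promptly after using `future`.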

solution 4: raw callbacks

  • not viable
  • would have just written a hacky little promise library

solution 5: core.async:

  • great! same shape as synchronous code, but correct concurrency

solution 6: lamina

  • didn’t feel totally suited to the situation

solution 7: meltdown (LMAX disruptor based)

  • not appropriate

solution 8: pulsar promises

  • looks exactly the same as the synchronous code, except for one character
  • pulsar rearranges your code at the bytecode level
    • uses JVM agents (normally used for tracing/debugging)
  • pass a fn to one of pulsar’s functions
    • turns synchronous code to async code

solution 9: pulsar actors

  • not appropriate

winners

  • 0: synchronous
  • 5: core.async
  • 8: pulsar

scala solution, for comparison

  • scala futures (basically promises)
  • all monadic
  • I don’t understand it entirely
  • concise
  • battle of the benchmarks, fastest first
    • pulsar-async
    • pulsar-sync
    • core-async
    • raw-callback
    • scala-play-future (significantly less than others)

CQRS (command-query responsibility segregation)

  • want fast reads
  • reduce number of queries
  • don’t want to have to update write code every time we add a new reader

structure

  • service A → cassandra → service B
  • custom triggers in cassandra in clojure (just drop in the .jar!)
    • publish to rabbitmq
    • notify index maintainer
    • write index to cassandra
    • service B reads from cassandra

cassandra triggers

  • can just throw the clojure jar in there
  • everything is byte buffers
    • you need to know the type of all the fields out-of-band
    • not self-describing data at all
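
The "not self-describing" point can be shown with `java.nio.ByteBuffer` alone, without Cassandra: the same eight bytes decode to different values depending on the type the reader assumes. The helper names here are hypothetical.

```clojure
(import 'java.nio.ByteBuffer)

(defn long->buffer
  "Eight bytes holding n; nothing in the buffer records that it is a long."
  [n]
  (doto (ByteBuffer/allocate 8)
    (.putLong n)
    (.flip)))

(defn read-as-long [^ByteBuffer buf]
  (.getLong (.duplicate buf)))

(defn read-as-ints
  "Decode the same bytes as two 32-bit integers instead."
  [^ByteBuffer buf]
  (let [b (.duplicate buf)]
    [(.getInt b) (.getInt b)]))

(def raw (long->buffer 4294967298)) ; 2^32 + 2

(read-as-long raw) ; 4294967298
(read-as-ints raw) ; [1 2]
```

Both reads succeed, so a wrong guess about the column type produces wrong data rather than an error, which is why the field types have to be known out-of-band.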

microservices

  • I thought we would have a user service and a provider service and a mail service
    • but this gets tricky when you want data about users and providers
  • you need to split things much more fine grained
  • user service →
    • authentication
    • multi-factor auth
    • authorization
    • user profile
    • password reset
      • does it belong in user profile?
      • there’s a bit of workflow here
        • send out email
        • get user to click link
        • enough to warrant its own service
  • drama: needed to talk to systems team to deploy
    • I did things badly
    • I didn’t get anything into production in my 6 months there
    • systems team: we need monitoring and config and stuff
      • if we’d had something early on which had gone through these barriers, we would have had much less stress
      • benchmarks end petty arguments

Q&A

can you share some experience with monitoring & resilience?

  • appdynamics
  • classnames are expected to be java-style class names
    • clojure ones are close enough
  • clj-metrics to expose more high-level metrics
    • requests/second from ring
    • number of bills paid
    • appdynamics could pick it up from jmx
  • nomad for configuration

with http-kit+core.async, what happens when server dies and there’s loads of threads?

  • bottleneck was amount of memory
  • when server runs out, it slows down a lot
  • way to get around that is to monitor resources on your machine and ideally have autoscaling

were the scala guys finally writing clojure in the end?

  • we have one person still hardcore for scala, but sees the merits of clojure
  • a few who did the online scala courses are clojure folks now
  • people who come from the java world of static typing feel they need that
  • but now that they’ve written code that actually works, they’re more comfortable with it

Tom Hall, Escaping DSL Hell by having parens all the way down

  • @thattommyhall

DSLs

  • languages made for specific purposes
    • config mgmt
    • science
    • learning
  • distinction between:
    • internal DSLs: embedded in another language
    • external DSLs: implemented in another language

problems with puppet

  • zen of python:
    • namespaces are a honking great idea, let’s do more of them!

puppet namespaces

  • Exec[‘install’] in two different modules will result in a naming collision
  • fail :(
  • end up with Exec[‘tom::install’] but this is a hack

iteration

  • file type lets you pass in an array
  • nagios_host doesn’t
  • iteration is responsibility of type, not language
    • as far as I know

but you need to know ruby anyway

  • if you want to extend puppet, you need ruby
  • if you need to know ruby, why do we bother with the puppet DSL in the first place?

experimental features: lambdas and iteration

  • any language where lambdas arrive late is not a good language

ansible

  • just YAML
    • oh wait, I might want to iterate
    • oh wait, I’ve got embedded Jinja templates in my YAML strings
      • what’s the scope of names in my templates?

if you give people a “language” they will expect loops

  • maybe lambdas
  • probably namespaces
  • this has been done before

chef gets it right

  • it’s embedded in ruby
  • you get iteration and namespaces from ruby

teaching people to program

  • if you design a language:
    • you need a parser, which is hard
    • you need an interpreter/compiler, which is hard
  • if you embed it, you get that stuff for free

geomlab

  • minimal language for teaching
  • talks about pictures
  • intro to FP
  • gets you into recursion early on
  • man $ woman - “next to”
  • man & man - “on top of”
  • (man $ woman) $ tree = man $ (woman $ tree)
  • man $ (woman & tree) – scales nicely to get a nice aspect ratio
  • learn about operator precedence
  • de morgan’s laws
    • although not always held, due to scale
  • define functions
   define manrow(n) = manrow(n-1) $ man when n>1
                    ~ manrow(1) = man
  • builds up to an escher tiling
  • but once you’ve done that, where do we go?
    • only exists in this sim
    • if you want to extend it, you need java
    • “I’m really excited about FP now, but I’ve got nowhere to go”

what if we did it in clojurescript?

  • let’s use ‘below and ‘beside instead of $ and &
  • (below man woman)
  • (beside tree star)
  • http://cljsfiddle.net/fiddle/thattommyhall.geomlab.demo
  • let’s say I want to change man – what does it mean?
    • it’s implemented in the same sort of language
    • I can see there’s a url in there where I fetch an image from the internet
    • I know recursion, because I learned that from the geomlab exercises
    • I can extend the language itself
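
A rough sketch of what the embedded version buys you, assuming pictures are modelled as plain data rather than drawn images as in the cljsfiddle demo. `manrow` mirrors the GeomLab definition from earlier; `count-pics` is a hypothetical extension of the kind a learner could write once the DSL is just Clojure.

```clojure
;; Pictures as plain data (an assumption for illustration).
(def man  {:pic :man})
(def tree {:pic :tree})

(defn beside [left right] {:op :beside :args [left right]})
(defn below  [top bottom] {:op :below  :args [top bottom]})

(defn manrow
  "n men side by side, following the recursive GeomLab definition."
  [n]
  (if (> n 1)
    (beside (manrow (dec n)) man)
    man))

(defn count-pics
  "Number of primitive pictures inside a composed picture."
  [p]
  (if (:op p)
    (reduce + (map count-pics (:args p)))
    1))

(manrow 3)
;; {:op :beside, :args [{:op :beside, :args [{:pic :man} {:pic :man}]} {:pic :man}]}
(count-pics (below (manrow 3) tree)) ; 4
```

Nothing here needed a parser or an interpreter: recursion, naming and data structures all come from the host language.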

science languages

  • R
  • wolfram alpha
  • maple
  • matlab
  • these things just aren’t very good languages, even if they are good at their domain

another problem with DSLs

  • netlogo
  • If you’re based on applets, and Oracle drops applet support, you find you need to port your whole language to a new platform (in this case javascript)
  • again, reimplement in clojurescript?
    • anyone interested in hacking on this with me?

conclusion

  • you probably don’t need to make a new language
  • if you do it will probably be rubbish
    • at least for a while
  • think about power and reach
  • you should embed /deeply/ into clojure

Q&A

what makes a good first language?

  • clojure needs a better day 0 story
  • at some coder dojos where I’ve taught kids, some don’t even know about files and folders
    • so if you say “open a terminal, cd into a directory” you’ve lost them
      • and it’s not their fault

have you had any kids look at your examples here?

  • I’ve done the geomlab example
  • otherwise this is all a recent exploration
  • errors in cljsfiddle are not reported well
    • again problematic for day zero

Mathieu Gauthron, JVM-breakglass

troubleshooting a java application

  • debugger
    • only powerful when you can narrow down the problem to a series of breakpoints
    • when the problem is a race condition, it will change the nature of the problem you’re studying
  • log/print statements
    • you need to plan before compilation
    • when the problem is in production, it might be too late
  • jmx
    • again, you need to plan for it in advance
  • ad-hoc interactive mechanism

what is jvm-breakglass

  • open source
  • integrates with any jvm process
  • console onto a jvm process

main features

  • interactive prompt
  • see inside private members
  • call arbitrary methods
  • create new object instances
  • create new classes
  • monitor object state
  • no need to use clojure to develop the app

how does it work?

  • jvm-breakglass runs inside the JVM and starts an nrepl server
  • you can then connect using an nrepl client (eg lein)

how to use it?

  • add it to your maven dependencies
  • add an entry point (as a <bean> or in java code)
  • connect with lein repl :connect localhost:1112

demo (enterprise application)

  • tomcat JVM
  • employee/dept data structure
  • report generation
  • java/spring mvc webapp
  • jvm-breakglass
  • spring data
    • in XML, naturally

homepage

  • oh no! one of the reports isn’t working?
  • “list employees in london” is empty
    • but we know that employee Mick Jagger lives in london
    • what’s going on?

breakglass to the rescue

  • view environment:
    • current directory, System/getProperties
    • view conf directory
  • list all loaded Spring beans
  • introspect into object private members
    • bean builtin fn
    • to-tree to do so recursively
  • view methods or fields for a given object
  • redefine a class
    • in this case, (proxy [Address] ["1 Mayfair", "SW1", "London"] (getCity [] "London")) to define the new version, overriding a method
    • (.setAddress (:Mick employees) address) to inject it into the live data
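
The two core functions used in the demo can be tried at any REPL. In this sketch `java.util.Date` stands in for the application's `Address` class, which is not available here. Note that core `bean` reads an object's public JavaBean getters.

```clojure
;; bean exposes an object's getters as a map of properties.
(def props (bean (java.util.Date. 0)))
(:time props) ; 0

;; proxy creates a one-off subclass that overrides a method,
;; in the same way the Address example overrides getCity.
(def patched
  (proxy [java.util.Date] [0]
    (getTime [] 42)))

(.getTime ^java.util.Date patched)  ; 42
(instance? java.util.Date patched)  ; true
```

Since the proxy is still an instance of the original class, it can be handed to any setter that expects that class, which is what makes the live patch in the demo possible.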

take a step back

  • remember what it’s like to be a java programmer?
  • working with jmx beans and suchlike to try to understand why production is down
  • this stuff looks like magic

Q: how do you convince production people to put nrepl server in place?

  • short answer: impossible
  • that’s not how you present it
  • either you do it sneakily (that’s bad), and only pull the trump card when the team is desperate
  • or you convince the team that it would be useful in the UAT environment, and “of course it’s never going to be used in production”

Q: have you considered a high-level switch that would prevent you mutating anything in the host application?

  • don’t know how you’d be able to do that
  • have been thinking about it
  • maybe using clojail
  • kind of defeats the point

Q: have you tested this with a scala app?

  • haven’t tried
  • I’ve reverse-engineered the java bytecode, and it’s readable
  • as long as you know how it compiles, it seems reasonable

Q: you were using methods like get-obj and passing string name. how does breakglass know which object to get?

  • eg if you have multiple instances of Department, how does it know which department?
    • in Spring it’s a Spring bean which is named
    • if you’re not using Spring, what’s your entry point?
      • when you create your NreplServer to enable jvm-breakglass, you can add your entry points there
      • new NreplServer(port).put("department", myObject);
      • static methods & fields can be used too

Gary Crawford, Using Clojure for Sentiment Analysis of the Twittersphere

stratified medicine

  • determine the best treatment for someone based on their genetic makeup to manage their chronic disease

sentiment analysis

  • Paper: “Twitter mood predicts the stock market”
    • predicted Dow Jones average through monitoring tweets
  • people who suffer chronic disease tend to be immunocompromised
    • what would normally be a minor illness can prove fatal
  • can we use twitter to predict spread of disease?

so we tried

  • score tweets for flu symptoms
  • the data science wasn’t very difficult
    • but scaling it was
  • 30 million geo-tagged tweets sent from UK
  • couldn’t scale, even with
    • HDFS/hadoop
    • mongo/aggregation
    • mongo/mapreduce
    • postgres

how can we do fast, real-time analytics of social media?

  • application: how do people feel about Scotland’s independence referendum?
  • data increases in value as we analyse it
    • tweets
    • analytically prepared data
    • analysis
    • insight
    • predictions
  • the raw data isn’t what you care about
  • don’t store the raw tweets, only store the analytically prepared data
  • stored in redis using ptaoussanis/carmine
    • it has great support for bitmaps

example

  • (car/setbit sentiment tweet-id 1)
  • (car/bitcount "SCOTLAND") – tells me how many tweets have mentioned Scotland
  • how many people in england are happy?
(wcar*
 (car/bitop "AND" "ENGLAND&JOVIALITY" "ENGLAND" "JOVIALITY")
 (car/expire "ENGLAND&JOVIALITY" 10) ;; don't keep the data longer than 10 seconds
 (car/bitcount "ENGLAND&JOVIALITY"))
  • further: “how many people in Scotland are tired or grumpy?”
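
The bitmap idea can be tried without a Redis server. The sketch below uses `java.util.BitSet` as an in-memory stand-in, which is an assumption for illustration and not what the talk used: one bit per tweet id, an AND to combine two attributes, and a count of the set bits. The tweet ids are made up.

```clojure
(import 'java.util.BitSet)

(defn bitmap
  "BitSet with the given tweet ids set; stands in for a Redis bitmap key."
  [tweet-ids]
  (let [bs (BitSet.)]
    (doseq [id tweet-ids] (.set bs (int id)))
    bs))

(defn bit-and-count
  "Like BITOP AND followed by BITCOUNT, leaving both inputs untouched."
  [^BitSet a ^BitSet b]
  (let [result (.clone a)]
    (.and ^BitSet result b)
    (.cardinality ^BitSet result)))

(def england   (bitmap [1 2 3 5 8]))
(def joviality (bitmap [2 3 4 8 9]))

(.cardinality ^BitSet england)     ; 5
(bit-and-count england joviality)  ; 3
```

A question such as "tired or grumpy in Scotland" would follow the same pattern, with an OR of the two moods and then an AND with the country.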

getting the data in

  • adamwynne/twitter-api
  • you can specify you only want tweets from a certain geographical locality with a bounding box
    • but this is literally a rectangle
    • need it around Europe
  • LMAX-Exchange/disruptor to communicate
    • journaling
    • syncing
  • business logic

what sentiment?

  • this is hard!
  • “I’m loving #EuroClojure! :D”
  • Positive Affect: enthusiastic, active, alert
  • Negative Affect: subjective distress
  • actually two separate dimensions, not opposites
  • Watson et al, 1988
  • PANAS
  • then PANAS-x
  • then PANAS-t
    • accounts for bias on social media
    • outlines sanitisation
    • validate against 10 real events

sanitisation

where? reverse geocoding

  • don’t want to rely on external services
  • don’t want heavy IO
  • don’t want round trips to database
  • accuracy not too much of a concern
    • we already lose accuracy in interpreting the sentiment of the tweet
  • convert a map of the uk to colours:
    • look up geocode coords in map
    • check colour → get country code
  • problem: the world is a sphere
    • projecting a sphere onto a rectangle
  • prior art in d3.js
  • use JavaFX to exploit it

when?

  • there’s a lot of seconds in a day
  • and even more seconds in a year
  • really not interested in seconds anyway
  • want to group tweets by minute
  • and also group by hour
  • and also group by day, and month, and year
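
Grouping by minute, hour and day amounts to truncating each timestamp before counting. A small sketch using `java.time`, with a made-up tweet time and a hypothetical `bucket` helper; months and years would need a calendar-aware type and are left out.

```clojure
(import '(java.time Instant)
        '(java.time.temporal ChronoUnit))

(defn bucket
  "Truncate an instant to the start of its minute, hour or day (UTC)."
  [^Instant t unit]
  (str (.truncatedTo t ({:minute ChronoUnit/MINUTES
                         :hour   ChronoUnit/HOURS
                         :day    ChronoUnit/DAYS} unit))))

(def tweet-time (Instant/parse "2014-06-26T14:37:52Z"))

(bucket tweet-time :minute) ; "2014-06-26T14:37:00Z"
(bucket tweet-time :hour)   ; "2014-06-26T14:00:00Z"
(bucket tweet-time :day)    ; "2014-06-26T00:00:00Z"
```

Each bucket string could then serve as part of a storage key, so that the seconds are never recorded at all.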

why?

  • why are we doing this?
  • online social media are surveillance
  • the line between public and private is becoming blurred
  • if we don’t need data, we shouldn’t collect it
    • in this example:
      • we’re never more granular than country
      • we’re never more granular than overall sentiment
      • we’re never more granular than minute
    • hopefully this is enough to prevent anyone being identified
  • Datensparsamkeit

Q: have you used Storm for this?

  • no

Q: any preliminary results on the Scotland referendum analysis?

  • I’ve had more luck with tech than data science?

Q: which way should we vote?

  • haha

Q: how do you verify your results?

  • it’s very crude at the moment?

Paul Ingles, Multi-armed Bandit Optimisation in Clojure

  • @pingles

problem statement

  • product optimisation cycles are long, complex, and inefficient
  • the multi-armed bandit model shows lots of things we’re getting wrong
  • eg: online newspapers
    • fundamentally human-led, editorially-led
  • people behave irrationally
  • Dan Ariely & Daniel Kahneman
  • (@philandstuff suggestion: Stuart Sutherland, Irrationality)
  • economist subscription options
    1. online $59
    2. print $125
    3. print & online $125
    • the ridiculousness of option 2 makes option 3 seem more reasonable
  • need machines to optimise at scale; but need humans to provide stuff only they can
  • running RCTs to optimise sites
    • doing so on a continuing basis
    • measuring big effects works with small numbers of participants
    • but measuring small effects requires ever larger numbers
    • to the extent that you can only run ~12 experiments a year
    • which is not really good enough

Bandit strategies can help

  • a product for procrastinators by a procrastinator
  • Product: Notflix!
    • video website
    • http://notflix.herokuapp.com/
    • shows 3 different videos
    • show good videos at top of page, and less good at bottom
    • show best possible thumbnail for each video
  • optimising with multi-armed bandits
    • optimising order and thumbnails

multi-armed bandit problem

  • slot machine = one-armed bandit
  • problem: you have a bunch of money you want to “invest” in a casino
    • you have a number of different machines to play
    • each machine has a different probability of reward
    • you don’t know what that probability is up front
  • need to balance “exploration” and “exploitation”
    • ie learning about the world vs using that knowledge to maximise income
    • analogy: trying new foods out vs sticking to what you like

bandit model

  • number of arms {1, 2, …, K }
  • number of trials: 1, 2, …, T
  • rewards: {0,1}
  • K-headlines
    • options of different text
  • K-buttons
    • options of button text, colour, etc
  • K-pages
    • whole page redesigns
  • explore this space with notflix

bandit strategy

;; choose which arm to pull
(defn select-arm [arms]
  ...)

;; update arm with feedback
(defn pulled [arm]
  ...)
(defn reward [arm x]
  ...)

(defrecord Arm [name pulls value])

ε-greedy

  • “hello world” algorithm
  • generally exploit
  • ε (epsilon) is the rate of exploration
  • eg if ε = 0.1, your strategy is:
    • with probability 10%, try a random arm with equal probability
    • with probability 90%, try the best arm based on current knowledge
  • if ε = 0, always exploit; if ε = 1, always explore
  • example with bernoulli-bandit
(bernoulli-bandit {:arm1 0.1 :arm2 0.1 :arm3 0.1 :arm4 0.1 :arm5 0.9})
  • with ε=0.2, you converge faster on the best arm
  • but ε=0.1, you exploit it more when you find it
  • once you’ve found the best arm, you should be able to double down
    • ie explore more at the beginning (when you have least knowledge) and less at the end
    • lots of extensions to ε-greedy to factor things like this in
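
A minimal ε-greedy sketch built on the `Arm` record shown earlier. It is not the speaker's implementation: the argument order, the running-mean update and the `simulate` helper are assumptions, and a seeded `java.util.Random` is passed in so runs are repeatable.

```clojure
(defrecord Arm [name pulls value])

(defn select-arm
  "With probability epsilon pick a random arm, otherwise the best arm so far."
  [epsilon arms ^java.util.Random rng]
  (if (< (.nextDouble rng) epsilon)
    (nth arms (.nextInt rng (count arms)))
    (apply max-key :value arms)))

(defn reward
  "Record one pull with payoff x (0 or 1), keeping :value as the running mean."
  [arm x]
  (let [n (inc (:pulls arm))]
    (assoc arm
           :pulls n
           :value (+ (:value arm) (/ (- x (:value arm)) n)))))

(defn simulate
  "Play trials pulls against hidden payout probabilities; returns arms by name."
  [epsilon probs trials seed]
  (let [rng (java.util.Random. seed)]
    (loop [arms (into {} (for [k (keys probs)] [k (->Arm k 0 0.0)]))
           t    0]
      (if (= t trials)
        arms
        (let [arm (select-arm epsilon (vec (vals arms)) rng)
              x   (if (< (.nextDouble rng) (probs (:name arm))) 1 0)]
          (recur (assoc arms (:name arm) (reward arm x)) (inc t)))))))

(simulate 0.1 {:arm1 0.1 :arm2 0.1 :arm3 0.1 :arm4 0.1 :arm5 0.9} 1000 42)
```

Comparing the `:pulls` counts for different values of ε gives a feel for the trade-off described above between finding the best arm quickly and exploiting it once found.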

Thompson sampling

  • Arm model
    • Θ_k: Arm k’s hidden true probability of reward (in range [0,1])
    • can build a distribution for Θ_k based on current knowledge
    • small number of pulls means wide distribution; large number means narrow distribution
    • captures uncertainty in value of Θ_k
  • each iteration, take a random sample from each distribution, take the largest sample
    • algorithm naturally balances exploration/exploitation trade-off
    • the more it learns, the narrower the distributions get, and so the more likely it is to choose an arm with a higher expected value
  • incanter example
  • Thompson-sampling example with same Bernoulli-bandit from above
    • compared with ε-greedy, explores much more much earlier, and exploits much more later on
    • considered optimal convergence
  • we can use it to rank things (not just select)
    • take a sample from each arm distribution, then order arms by that value
    • in notflix, can use for ordering the videos we show
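
Thompson sampling for 0/1 rewards can be sketched in plain Clojure, assuming each arm's uncertainty is modelled as a Beta distribution over its wins and losses (a common choice, though the notes do not say which distribution the talk used). To stay free of dependencies, `sample-beta` relies on the fact that the a-th smallest of a+b-1 uniform draws follows Beta(a, b) for integer a and b; a statistics library such as Incanter would normally do this job. All names are hypothetical.

```clojure
(defn sample-beta
  "Draw from Beta(a, b) for positive integers a and b, as the a-th smallest
  of a+b-1 uniform random numbers."
  [a b ^java.util.Random rng]
  (nth (sort (repeatedly (+ a b -1) #(.nextDouble rng))) (dec a)))

(defn sample-arm
  "One plausible reward probability for an arm, given its record so far."
  [{:keys [wins losses]} rng]
  (sample-beta (inc wins) (inc losses) rng))

(defn rank-arms
  "Arm names ordered by one sample each, best first."
  [arms rng]
  (->> arms
       (map (fn [[k arm]] [k (sample-arm arm rng)]))
       (sort-by second >)
       (mapv first)))

(defn select-arm [arms rng]
  (first (rank-arms arms rng)))

(defn update-arm
  "Record the outcome of pulling arm k."
  [arms k rewarded?]
  (update-in arms [k (if rewarded? :wins :losses)] inc))

(def rng (java.util.Random. 7))
(def arms {:video-a {:wins 0 :losses 0}
           :video-b {:wins 0 :losses 0}})

(rank-arms (update-arm arms :video-a true) rng)
```

With few pulls the samples are spread widely, so the ranking changes often (exploration); as wins and losses accumulate the samples concentrate and the ranking settles (exploitation).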

applied to notflix

  • video rank bandit
  • for each video, a thumbnail bandit
  • at the end, the best video should be at the top
    • and each video should show the best thumbnail

results

  • videos, worst to best
    • “hero of the coconut pain”
    • “100 Danes eat 1000 chillies”
    • “3 year-old with a portal gun”
  • thumbnail bandit data
  • “we built a fictional but amazing product”

Q: this model assumes bandits have the same probability through time

  • can it readapt?
  • Thompson sampling does adapt
    • it won’t change back as quickly

Q: isn’t there an interaction between the two bandits?

  • if the thumbnail is crappy, they might not click the video
  • made an assumption about this
  • in general, if you leave it running over time and let the evidence build, it should be fine in the long run
  • but that is definitely a flaw

Tommi Reiman, Schema and Swagger to improve your web APIs

super simple web api in clojure

  • just using compojure
  • “sausage” as example data
  • PUT /foo/sausage/:id
  • example:
    • in Java: immutable value object
    • in Scala: case class
    • in Clojure:
      • free-form map?
      • constructor fn with bunch of validation?
      • prismatic/schema!

prismatic schema

  • define structure of sausage
  • then call s/validate to validate
  • schema can define functions
(s/defn get-sausage :- (s/maybe Sausage) [id :- Long]
  (@sausages id))

(s/defn ^:always-validate get-sausage2 :- Sausage [id :- Long]
  (@sausages id))

schema coercion

(defmodel Pizza {:id Long
                 :name String
                 :price Double
                 :hot Boolean
                 (s/optional-key :description) String
                 :toppings #{(s/enum :cheese :olives :ham :pepperoni :habanero)}})
  • allows slurping JSON data, but imposing extra types
  • eg above we can slurp toppings from a JSON array into a Clojure set rather than a vector
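
To make the coercion idea concrete without the library, here is a hand-rolled sketch of the one step described above: turning a JSON array of strings into a Clojure set of keywords. It does not use prismatic/schema's API; `parsed-json`, `allowed-toppings` and `coerce-pizza` are hypothetical names.

```clojure
;; What a JSON parser might return (made-up data): toppings arrive as a
;; vector of strings, because JSON has neither sets nor keywords.
(def parsed-json
  {:id 1 :name "Quattro" :price 9.5 :hot false
   :toppings ["cheese" "olives" "cheese"]})

(def allowed-toppings #{:cheese :olives :ham :pepperoni :habanero})

(defn coerce-pizza
  "Turn :toppings into a set of keywords, rejecting unknown toppings."
  [pizza]
  (let [toppings (set (map keyword (:toppings pizza)))]
    (when-let [bad (seq (remove allowed-toppings toppings))]
      (throw (ex-info "unknown toppings" {:bad (set bad)})))
    (assoc pizza :toppings toppings)))

(:toppings (coerce-pizza parsed-json)) ; #{:cheese :olives}
```

With a schema, this per-field logic is derived from the model definition instead of being written by hand for every endpoint.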

double schema

  • loose schema for first input
    • (def Customer {...})
  • tighter schema for validated input
    • (def ValidCustomer (merge Customer {...}))

schema selectors

  • accept but remove unrecognised params with select-schema

generative schema

contribs

  • sfx/schema-contrib
  • cddr/integrity

swagger

  • a specification for describing, producing, consuming, visualising RESTful web services
  • https://helloreverb.com/developers/swagger
  • existing adapters
  • clojure options:
    • octohipster
    • swag
    • ring-swagger
      • compojure-api
      • fnhouse-swagger
  • endpoint definitions in JSON
  • data models as a JSON Schema
  • swagger UI
    • visualises the API
  • code gen
    • no clojure support yet (anyone?)
  • swagger-socket
    • run it all on top of websockets

ring-swagger

  • https://github.com/metosin/ring-swagger
  • JSON-Schema has some dates
    • but prismatic/schema will never support dates, as it’s more generic
  • higher level abstractions on top of swagger, but nothing for the web developer

compojure-api

  • an extendable web api lib on top of compojure
  • macros & middleware with good defaults
  • schema-based models & coercion
  • GET* macro to define input and output schemas

fnhouse-swagger

  • prismatic/fnhouse
    • launched at clojure/west
  • defnk with metadata → annotated handler
  • fnhouse-swagger
    • metosin/fnhouse-swagger

summary

  • schema is an awesome tool
  • describe, validate, coerce your data
  • building on top of ring-swagger
    • compojure-api → declarative web apis
    • fnhouse-swagger → meta-data done right
    • or do your own!
  • kekkonen.io
    • CQRS-lib

Renzo Borgatti, The Compiler, the Runtime and other interesting beasts from the clojure codebase

an amazing growth:

  • mar 2006: first commit
  • oct 2006: 30k loc (7 months old)
  • oct 2007: clojure announced!
  • oct 2008: invited to Lisp50 to celebrate 50 years of lisp
  • May 2009: 1.0 + book!
  • now: almost 90k loc

initial milestones

  • apr 06: lisp2java sources
  • may 06: boot.clj appears
  • may 06: STM first cut
  • june 06: first persistent data structure
  • sep 06: java2java sources
  • aug 07: java2bytecode started
  • right after: almost all the rest: refs, lockingtx

drew on lots of sources of knowledge

  • collection of papers

high-level view:

  • (def lister (fn [& args] args))
  • read → analyse → emit/compile → compile
  • although the lines between the stages get blurred at times

reader

  • takes stream, returns data structures
  • PersistentList, Symbol, etc
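
The reader stage can be observed from any REPL with `read-string`, using the `lister` example from above. `eval` then takes the resulting data structure through the remaining stages; `lister-var` is a name introduced only for this sketch.

```clojure
;; Reading turns text into ordinary Clojure data; nothing is evaluated yet.
(def form (read-string "(def lister (fn [& args] args))"))

(class form)          ; clojure.lang.PersistentList
(first form)          ; the symbol def
(class (second form)) ; clojure.lang.Symbol

;; eval analyses the form, generates bytecode and loads it.
;; Evaluating a def returns the var, and vars can be called like functions.
(def lister-var (eval form))
(lister-var 1 2 3)    ; (1 2 3)
```

Because the reader's output is plain lists and symbols, macros and tools can inspect or rewrite it before the analyser ever sees it.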

analyser

  • input: data structure
  • output: exprs
    • DefExpr
      • Var
      • FnExpr
        • Sym
        • PersistentList
          • FnMethod
            • LocalBinding(Sym(“args”)),
            • BodyExpr
              • PersistentVector
              • LocalBindingExpr

Emission

  • bytecode generation for Exprs
  • prerequisite for evaluation
  • emit() method in Expr interface
  • Notable exception: called over ??

Evaluation

  • transform Exprs into their “usable form”
  • eg
    • new object
    • a var
    • namespace
  • FnExpr is just getCompiledClass().newInstance

Compilation

  • Usually coordination for emit
  • Compiler.compile namespace -> file

Emit

  • input: Exprs
  • output: bytecode

monsters!

RT

  • this is how the RT class gets initialised: the first time it gets referenced:
final static private Var REQUIRE = RT.var("clojure.core", "require");
  • simply referring to it here causes the static initializers to run
  • RT has a lot of behaviour in static initializers
    • inside it is the doInit(); call
      • which loads all of clojure.core
    • all just from referring to RT in some otherwise unrelated class!

Compiler

  • inner classes for each Expr type

LispReader

  • inner classes for each token you might encounter
  • <clinit>
    • sets up reader macros
      • macros and dispatchMacros (latter for #{ #( #_ #^ etc)
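
The effect of these reader macros can be observed from Clojure itself with read-string. A small sketch, using only clojure.core:

```clojure
;; dispatch macros (the forms that start with #)
(read-string "#{1 2}")            ; => #{1 2}, a set
(read-string "[1 #_2 3]")         ; => [1 3], #_ discards the next form
(first (read-string "#(inc %)"))  ; => fn*, the anonymous fn is expanded by the reader

;; an ordinary reader macro: ' expands to (quote ...)
(read-string "'x")                ; => (quote x)
```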

analyze()

  • not a class, but a family of methods
    • analyzeSeq
    • new ConstantExpr
    • MapExpr.parse
  • FnExpr.parse
    • invokes the compiling phase during parsing phase

emission

  • ASM lib used to generate bytecode
  • FnExpr.emitMethods()
    • generate a method for each of the arities of the function
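
One way to see the per-arity methods is to reflect on the class compiled for a multi-arity fn. This is a sketch that assumes a stock JVM Clojure; invoke-arities is a helper name made up for this example.

```clojure
(def f (fn ([] :zero) ([x] :one) ([x y] :two)))

(defn invoke-arities
  "Parameter counts of the invoke methods declared on the class compiled for a fn."
  [a-fn]
  (->> (.getDeclaredMethods (class a-fn))
       (filter #(= "invoke" (.getName ^java.lang.reflect.Method %)))
       (map #(count (.getParameterTypes ^java.lang.reflect.Method %)))
       sort))

;; one invoke method per declared arity is expected here
(invoke-arities f)
```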

other beasts

  • LockingTransaction and Ref

DynamicClassLoader

  • clojure.lang.DynamicClassLoader.findClass(String)
    • RT.classForName()
    • Compiler$HostExpr.maybeClass()
  • Class.forName() goes up the hierarchy of classloaders and asks each what they know
    • an instance of DynamicClassLoader is created for each namespace
      • and also for each form
    • (this is true for the bootstrap phase; not always true eg in AOT (ahead-of-time) compilation)
  • supporting dynamicity
    • in defineClass:
      • classCache.put(name, new SoftReference(c,rq));
    • in findClass:
      • Reference<Class> cr = classCache.get(name);
    • SoftReferences are used to save PermGen, since if we redef a var we don’t want it to keep consuming PermGen
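
The caching idea can be sketched in Clojure with the same JDK classes. This is not the DynamicClassLoader source, only an illustration of a name-to-SoftReference cache; cache-class! and cached-class are hypothetical names.

```clojure
(import '(java.lang.ref ReferenceQueue SoftReference)
        '(java.util.concurrent ConcurrentHashMap))

(def ^ConcurrentHashMap class-cache (ConcurrentHashMap.))
(def ^ReferenceQueue rq (ReferenceQueue.))

(defn cache-class!
  "Remembers class c under class-name via a SoftReference, so the GC may reclaim it."
  [^String class-name ^Class c]
  (.put class-cache class-name (SoftReference. c rq))
  c)

(defn cached-class
  "Returns the cached class, or nil if it was never cached or has been cleared."
  [^String class-name]
  (when-let [^SoftReference r (.get class-cache class-name)]
    (.get r)))

(cache-class! "java.lang.String" String)
(cached-class "java.lang.String") ; => java.lang.String
```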

Bonus: clojure was initially implemented in lisp

  • ~1600 loc to implement read, analyse, compile, eval
  • although emitting Java code, not bytecode
  • was also generating C♯

Q: some things in bytecode can’t be expressed in java

  • is there anything which clojure generates which can’t be decompiled back to Java?
    • I’m pretty sure yes, but not sure exactly what
    • Rich:
      • locals-clearing
      • constructs which use goto (which exists in bytecode but not Java)

Rich Hickey, the insides of core.async channels

aside: here’s what clojure looks like in a good IDE

  • (ie IntelliJ)
  • yes, Compiler.java is massive
    • but if your IDE has a structure editor, you can navigate them all easily
    • it’s all in one file because I don’t want 300 files

aside2: the classloader has a cache in a branch

  • fast-load branch

warning! implementation details ahead

  • subject to change!
  • informational only

the problems

  • single channel implementation
    • for use from both dedicated threads and go threads
      • simultaneously, on same channel
  • alt and atomicity
    • Java CSP libraries often didn’t support alt well
    • it’s tricky to do atomically
  • multi-reader/multi-writer
  • concurrency
    • construct deals with the ick of threads and mutexes
  • (this talk: focus on JVM impl; JS version has fewer of these issues)

API

  • >! >!! put! alt! → channel → <! <!! take! alt!
  • it’s not an RPC mechanism, it’s just a conveyor belt

SPI (service provider interface)

  • >! >!! put! alt! → impl/put! [val handler] → channel → impl/take! [handler] → <! <!! take! alt!

anatomy

  • channel has:
    • pending puts (fifo)
    • a buffer (optional) in the middle
      • contains data
    • pending takes (fifo)
    • flag indicating if channel is closed
  • fifos implemented as linked queues
  • important to distinguish queues of operations from buffer of data

invariants

  • never pending puts and takes simultaneously
  • never takes and anything in buffer
  • never puts and room in buffer
  • take! and put! use channel mutex
  • no global mutex
    • or even multi-channel mutex

put! scenarios

  1. one or more waiting take! operations
    • gets paired up, takes handler gets completed
  2. stuff in the buffer, but with room in buffer
    • puts its stuff in the buffer, succeeds and immediately completes
  3. buffer full (or no buffer)
    • enter puts queue, block
      • results in backpressure
  4. full buffer, but windowed
    • sliding buffer: latest information takes priority, drop head of buffer (oldest item in fifo), put! completes immediately and enters buffer
    • dropping buffer: drop put! on floor, but completes immediately
    • could have more sophisticated policies in future

take! scenarios

  1. nothing in buffer
    • enqueued
  2. buffer has stuff, but no puts waiting
    • get data, immediately complete
  3. buffer full (or no buffer), puts pending
    • get something (either head of buffer or get paired with first put!)
    • first waiting put! completes (either enters buffer or hands directly to take!)
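
The put! and take! scenarios for a fixed-size buffer can be modelled as pure functions over plain data. This is a toy model for reasoning about the invariants, not the core.async implementation: it has no locks, handlers, close! or windowed buffers, and every name in it (chan-model, put-model, take-model, invariants-hold?) is hypothetical.

```clojure
(def empty-q clojure.lang.PersistentQueue/EMPTY)

(defn chan-model
  "A channel as plain data: buffer capacity plus three FIFO queues."
  [cap]
  {:cap cap :buf empty-q :puts empty-q :takes empty-q})

(defn put-model
  "Returns [new-channel outcome] for putting v."
  [{:keys [cap buf takes] :as ch} v]
  (cond
    ;; put! scenario 1: a take is waiting, so pair up with it
    (seq takes) [(update ch :takes pop) [:handed-to (peek takes) v]]
    ;; put! scenario 2: room in the buffer, complete immediately
    (< (count buf) cap) [(update ch :buf conj v) :buffered]
    ;; put! scenario 3: buffer full (or cap 0), join the puts queue
    :else [(update ch :puts conj v) :pending]))

(defn take-model
  "Returns [new-channel outcome] for a take by taker-id."
  [{:keys [buf puts] :as ch} taker-id]
  (cond
    ;; data buffered: take the head; the first pending put (if any)
    ;; moves into the freed slot
    (seq buf) [(cond-> (update ch :buf pop)
                 (seq puts) (-> (update :buf conj (peek puts))
                                (update :puts pop)))
               [:took (peek buf)]]
    ;; nothing buffered but a put is pending: pair with it directly
    (seq puts) [(update ch :puts pop) [:took (peek puts)]]
    ;; take! scenario 1: nothing available, join the takes queue
    :else [(update ch :takes conj taker-id) :pending]))

(defn invariants-hold?
  "Checks the three queue/buffer invariants from the notes."
  [{:keys [cap buf puts takes]}]
  (not (or (and (seq puts) (seq takes))
           (and (seq takes) (seq buf))
           (and (seq puts) (< (count buf) cap)))))

(let [[ch r1] (put-model (chan-model 1) :a)
      [ch r2] (put-model ch :b)
      [ch r3] (take-model ch :t1)]
  [r1 r2 r3 (invariants-hold? ch)])
;; => [:buffered :pending [:took :a] true]
```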

close! scenario

  • all pending takes complete with nil (closed)
  • subsequent puts complete with nil (already closed) (relatively new)
  • subsequent takes consume ordinarily until empty
    • any pending puts complete with true
    • takes then complete with nil

queue limits

  • puts and takes queues are not unbounded either
  • 1024 pending ops limit
    • somewhat arbitrary, might change
    • will throw if exceeded
      • if you’re seeing this, it’s an architecture smell
    • most likely if you use put! on the edge of your system

alt(s!!)

  • attempts more than one op
  • on more than one channel
  • without global mutex
  • nor multi-channel locks
  • exactly one op can succeed

implications

  • registration of handlers is not atomic
  • completion might occur before registrations are finished, or any time thereafter
  • completion of one alternative must ‘disable’ the others atomically
  • cleanup

handlers

  • wrapper around a callback
    • callbacks are icky, so we want to hide them
  • SPI
    • active?
    • commit → callback-fn
    • lock-id → unique-id
    • java.util.concurrent.locks.Lock: lock, unlock

take/put handlers

  • simple wrapper on callback
  • lock is no-op
  • lock-id is 0
  • active? always true
  • commit → the callback

alt handlers

  • each op handler wraps its own callback, but delegates rest to shared “flag” handler
  • flag handler has lock
    • a boolean active? flag that starts true and makes one-time atomic transition
  • commit transitions shared flag and returns callback
    • must be called under lock
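
The "exactly one alternative wins" behaviour can be illustrated with a shared flag. Note the swap: the notes describe a flag handler guarded by a lock, whereas this sketch uses an atom with compare-and-set! for the one-time transition. alt-flag and alt-handler are names invented for this example.

```clojure
(defn alt-flag
  "Shared state for one alt: true until some alternative commits."
  []
  (atom true))

(defn alt-handler
  "Wraps callback; active? and commit delegate to the shared flag."
  [flag callback]
  {:active? (fn [] @flag)
   ;; returns the callback to the single winner, nil to everyone else
   :commit  (fn [] (when (compare-and-set! flag true false) callback))})

(let [flag (alt-flag)
      h1   (alt-handler flag (fn [v] [:h1 v]))
      h2   (alt-handler flag (fn [v] [:h2 v]))]
  [((:commit h1)) ((:commit h2)) ((:active? h2))])
;; the first element is h1's callback, the second is nil, the third is false
```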

alt concurrency

  • no global or multi-channel locking
  • but channel does multi-handler locking
    • some ops commit both a put and a take
  • lock-ids used to ensure consistent lock acquisition order
    • (avoids deadlock)
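
Consistent acquisition order is a general deadlock-avoidance technique, sketched here with java.util.concurrent locks. handler-model and with-both-locked are hypothetical names, and the handlers are reduced to a lock-id plus a lock.

```clojure
(import 'java.util.concurrent.locks.ReentrantLock)

(defn handler-model
  "Hypothetical handler: just a unique lock-id and its own lock."
  [lock-id]
  {:lock-id lock-id :lock (ReentrantLock.)})

(defn with-both-locked
  "Runs f while holding both handlers' locks, always taking the lower lock-id first."
  [h1 h2 f]
  (let [[a b] (sort-by :lock-id [h1 h2])
        ^ReentrantLock la (:lock a)
        ^ReentrantLock lb (:lock b)]
    (.lock la)
    (try
      (.lock lb)
      (try
        (f)
        (finally (.unlock lb)))
      (finally (.unlock la)))))

;; argument order does not matter: both calls lock id 1 before id 2
(let [h1 (handler-model 1)
      h2 (handler-model 2)]
  [(with-both-locked h1 h2 (fn [] :done))
   (with-both-locked h2 h1 (fn [] :done))])
;; => [:done :done]
```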

alt cleanup

  • “disabled” handlers will still be in queues
  • channel ops purge

SPI revisited

  • handler callback only invoked on async completion
    • only 2 scenarios
  • when not “parked”, op happens immediately
    • callback is not used
    • non-nil return value is op return
  • only time ops park
    • put! when it gets blocked on full buffer
    • take! when it gets blocked on empty buffer
  • only time ops complete asynchronously
    • take! with pending puts
    • put! with pending takes

wiring !/!!

  • blocking ops (!!)
    • create promise
    • callback delivers
    • only deref promise on nil return from op
      • non-nil indicates immediate success (and so callback never gets called)
  • parking go ops (!)
    • IOC state machine code is callback
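
The blocking wiring can be sketched without core.async. Here async-op stands in for an SPI call: a function of one callback that either returns a non-nil result immediately or returns nil and invokes the callback later. blocking-op is a hypothetical name, and the real library wraps immediate results differently.

```clojure
(defn blocking-op
  "Runs async-op. A non-nil return means the op completed immediately;
  nil means the callback will deliver the result later."
  [async-op]
  (let [p   (promise)
        ret (async-op (fn [v] (deliver p v)))]
    (if (some? ret)
      ret     ; immediate success: the callback is never called
      @p)))   ; parked: block this thread until the callback delivers

(blocking-op (fn [_cb] :now))            ; => :now
(blocking-op (fn [cb] (cb :later) nil))  ; => :later
```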

summary

  • you don’t need to know any of this
  • but understanding the “machine” can help you make good decisions

Q: why use alt! for putting? what’s rationale?

  • taking multiple channels is like a select(2)
  • when you have consumers of different capabilities
    • I want to try to write to everyone, but whenever the first one is ready, I give it to them
    • Q: what’s the difference between that and having four consumers on a single channel?
      • you might have a priority metric, or a cost metric
      • though yes sometimes you can achieve same result two different ways

Q: why is global or multi-channel mutex not good enough?

  • well it would be easy! :)
  • a global mutex could make registration atomic
  • you’d have to make disabling other alts atomic
  • you’d have to make rendezvous atomic
  • you could have two unrelated sets of channel operations, why should they contend?
  • people hate global locks
  • ruled out by my aesthetic sense :)

Q: David Nolen had an example of 10000 go blocks updating a textarea, did he hit the 1024 limit?

  • no I don’t think so, but not sure exactly

Q: are buffer & queue sizes useful metrics to monitor?

  • that would be great, and making them monitorable is on the TODO list

Q: other possible extensions?

  • buffer policies
    • you might have logic about priority
  • core.async has proven its utility and it’s become important
    • go macro is a great PoC of what you can do with a macro with several kLoC behind it
      • has its own subcompiler inside it
      • kind of implements a subset of clojure
    • maybe build async support into the compiler?
      • move locals from the stack to fields on the method object
      • I don’t need the stack anymore
      • I can be paused and resumed on another thread
      • declare a fn as async
      • comply with this SPI
      • could build other things like generators & yield
    • the pride moment of “look you can do this with a macro” is now dominated by the desire to make this performant and more solid
  • Q: continuations? how do they differ?
    • continuations are more general
    • this won’t use continuation-passing-style
    • it’s related
    • it won’t be like call/cc
    • it won’t be first-class
    • you won’t be able to resume it more than once
    • for a specific set of use-cases
    • Oleg gave a talk arguing that generators alone are enough for things people think need a lot more

Q: is there something planned for dynamic binding and the go macro?

  • there are fns which allow you to do the conveyance
    • don’t know if go allows all of them to work

Q: channels on the network?

  • it’s easy to have something you call a channel and put over a wire
  • pretty hard to have all the semantics of these channels over the wire
  • already have queues and all sorts of interfaces to do similar things
  • atomic alt! over more than one wire not going to happen
  • maybe semantics for ports
  • or limitations on alt!
  • the wire has its own semantics, this is the key thing here
    • failure, queueing, delays
  • really easy to just take something from the wire and call put!

Q: is there a typical way to monitor a go block?

  • what kind of monitoring?
  • see that it’s still working, still alive?
  • if the channels were monitorable, you could see if things were producing/consuming properly

Q: what other options did you consider & reject in the design of core.async

  • something other than CSP?
  • the generators stuff
  • continuations
  • I liked what golang did
    • they made a good choice
    • there’s a java csp lib that impls the same kinds of ops
    • it’s difficult to get the semantics correct
  • wanted alts! to be a regular fn, not syntax
    • which feels like an enhancement over go
  • what we’re putting on these channels is immutable
    • which gives extra robustness

Meta-eX, conference party

  • github: meta-ex
  • twitter: meta_ex
  • soundcloud: meta-ex
  • facebook: meta.ex.live
  • website: http://meta-ex.com
  • wooo!

David Nolen, Invention, Innovation & ClojureScript

  • @swannodette
  • recently left NYT for Cognitect

“The future doesn’t have to be incremental”, Alan Kay

  • talks about Xerox PARC
  • worked there for a decade
  • in that decade, inventions!
    • bitmap screens
    • laser printers
    • GUI
    • PC
    • WYSIWYG & DTP
  • innovating is taking inventions and bringing them to a wider audience

The Dream Machine, JCR Licklider and the Revolution that made personal computing possible

  • M Mitchell Waldrop
  • he believed human factors would play an important role
  • we would all have a computer
  • he helped create the future we live in today
  • he helped ARPA finance PARC’s research
  • he helped finance John McCarthy & Ed Fredkin

Man-Computer Symbiosis, JCR Licklider, 1960

  • talks about the trie data structure
  • (clojure’s persistent data structures use these!)

invention is hard

  • but innovation is equally important
  • Douglas Engelbart’s original mouse wasn’t very usable
    • a tonne of work went into making it more natural, more durable
    • (apple computers reference)
    • this is innovation!

Purely functional data structures, Okasaki

  • this book is about “paper complexity” – stuff that looks good on paper
  • it’s a foundation which people can build variants on
  • Rich did this
    • he doesn’t get credit for inventing the bit-mapped vector trie

the state of clojurescript

  • released 2011-07-20
  • a lot has happened since then
  • early experiments:
    • clojurescript one
    • himera (from fogus)
      • “translations from javascript”
      • showed what value clojurescript provides over javascript

has 81 contributors <3

  • the reason we no longer have copy-on-write data structures is that someone put in the hard work to make persistent ones
  • the reason we have source maps, similar

lighttable - ~11,000 lines of clojurescript

also, the world hasn’t stopped

  • js hosts have improved
  • persistent data structures were a basic performance win
    • COW doesn’t scale well past (say) 100
  • V8 had a lead when we introduced persistent data structures
  • we hoped that others would catch up
    • javascriptcore
    • webkit is trying to get asmjs-level performance with JIT compilation
    • nashorn has come along

demo

  • mori: library for js devs
    • here used to demo performance of persistent data structures
  • comparison:
    • adding 1000000 items to a JS Array
    • adding 1000000 items to a persistent vector
    • 85 ms vs 235 ms (V8)
    • this is really good!
  • comparison:
    • adding 1000000 items to a JS Array
    • adding 1000000 items to a persistent vector (using transients)
    • 85 ms vs 47 ms (!) (v8)
    • transients are faster than mutable arrays
    • javascriptcore: 28 ms (arrays) vs 30 ms (transient vector)
  • nashorn demo
    • benchmark: react running at the command line with om
    • building a template 100 times
      • ~13 ms avg with v8
      • ~8 ms avg with jsc
      • ~14 ms avg with spidermonkey
    • nashorn: slow load time & long warmup time
      • starts really slow (>1s)
      • converges slowly, but:
      • approaches ~23 ms

now what?

  • typescript, dart?
    • these assume we want to keep building the same broken type of stuff
    • cljs: we can build things radically simpler

React

  • library from facebook
  • other libraries have a deep-seated notion that everything is mutable
    • angular, backbone, …
  • react is different: it has a functional mindset
  • the virtual DOM evolves from one value to the next
    • clojurescript allows fast diffing between these values
    • react will do the right thing
  • react has completely taken over the cljs world
    • Om
    • reagent
    • quiescent (much thinner)
    • reacl

Om

  • Om was an experiment to show that representing app state as a single global value was a good idea
    • this had been done before in other areas:
      • databases
      • server-side
  • we’re not going to make interfaces that people haven’t seen before
  • prismatic’s blog post about moving to Om
    • simple components which don’t interact in crazy ways

Goya

  • by Jack Schaedler
    • ui dev for ableton
  • “we can do real undo”
  • Jack saw this and wondered if it would scale
  • Goya: pixel editor
    • surface: immutable vector
  • gets undo without adding complexity to app
  • get almost unlimited number, without loss of performance
  • github: jackschaedler/goya
  • his app is complicated!
    • the UI is complicated
    • but cljs eliminates unnecessary complexity
  • how much memory does his app use? not much
    • (aside: use google chrome dev tools!)
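
Undo over an immutable surface can be sketched as a history of values. This is not Goya's code, just a minimal illustration with a four-pixel canvas; blank-canvas, paint and undo are made-up names. Each canvas shares structure with the previous one, which is why keeping many of them is cheap.

```clojure
(def blank-canvas (vec (repeat 4 :white)))   ; a tiny 4-pixel "surface"

(defn paint
  "Adds a new canvas to the history; older canvases are untouched."
  [history idx colour]
  (conj history (assoc (peek history) idx colour)))

(defn undo
  "Drops the latest canvas, keeping at least the initial one."
  [history]
  (if (> (count history) 1) (pop history) history))

(-> [blank-canvas]
    (paint 0 :red)
    (paint 1 :blue)
    undo
    peek)
;; => [:red :white :white :white]
```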

innovate!

  • model story needs work
    • js MVCs backbone/ember/angular
    • notion of a model on the client
      • you can do operations on it
    • nothing particularly compelling for this in the react space
    • DataScript
      • export some elements of the datomic api to the client
      • store your data in a flat way
      • sensible query api over it (queries on trees aren’t so fun)
      • datomic allows you to ask for entities
        • lift a tree out of the flat database
  • react model can be further improved
    • addressability
    • immutable everything
      • they have to convert styles and DOM attributes back to javascript objects which have to be walked
    • (one benefit of react: it’s facebook’s problem 😺 )

Q: is it possible to implement Om all in clojurescript using a macro?

  • I suppose it could
  • you might want to compose things dynamically, and macros are static
  • you have to be concerned with the amount of code that a macro generates
  • I would not pursue that idea

Q: is there a community place for shared Om components?

  • I’m not going to spend much time on it
  • if Om needs to be improved to make this happen, I will do that
  • you want to be able to use other people’s code without jumping through too many hoops
  • things get tricky with events & communication between components
    • need some agreement on how people communicate between components

Q: what’s your vision for cljs 1.0? how can we help with the yak shaving?

  • basic things like sharing code
  • shared analysis over clojure and clojurescript
    • would open up a lot of tooling
      • eg linter to lint both languages
    • would like infrastructure for tooling to be much better
  • when you go to 1.0, people lock to that version and are slow to move off it

Q: are you seeing much evidence of cmd-line or server-side cljs?

  • most people doing it are doing node.js

Q: when is cljs going to be self-hosting?

  • it’s not that we don’t want it
    • we’re keen on self-hostability/bootstrappability, if not self-hosting
  • nice to remove the JVM dependency
    • eg lighttable might not want it
  • it’s last-mile stuff at the moment
    • which isn’t that fun
    • and I don’t personally need it so I won’t work on it

Q: do you foresee a pure cljs version of react?

  • if someone wants to shave that yak, that would be awesome
  • if the system is immutable all the way down, the optimizability explodes

Ali Asad Lotia, Why devops needs Clojure

  • @aalotia

background

  • was a dev who had helped get stuff to prod
  • our ops person left
  • they asked me to fill in
  • I said “okay, as long as you hire a replacement soon”
  • they didn’t arrive
  • I missed being able to write code

problem

  • we’re exec’ing a jar, and it keeps taking 3s
  • I saw an opportunity to write a very simple noir app
  • much improved performance
  • people were impressed, asked to see the source code
  • “what is this clojure thing, and why did you use it?”
    • I’m not a seasoned Java dev
  • moved to another company:

Beamly

  • TV focused social network
    • smart TV planner
    • personalised TV Magazine
  • availability

behind the curtain

  • AWS: us-west, us-east, eu-west
  • milli-services (ie not quite µservices)
    • scala
    • node.js

my team

  • build/release automation
    • but we don’t do deploys; we just enable them
  • persistence
  • platform performance/metrics/logs
  • core libs

deploys in the mutable days

  • generate build artefacts
  • define config in puppet
  • deploy artefacts
  • deploy config

phoenix servers

  • base server images, with some configuration changes
  • relatively short-lived
  • didn’t name them or worry when they were switched off

disconnected dev and ops

  • zed shaw:
    • “maybe you use a language like lisp that pretends the computer is some purely functional fantasy land with padded walls for little babies”
    • actually, yes I do

immutable servers

  • kill server for every deploy
  • package new server images (AMIs) in order to deploy new version

Requirement: examine server images

  • aws console
    • some config but manual and not all info
  • python + boto:
    • just got back a list of objects
    • we know there’s more available! we saw it in the console!
  • clojure + amazonica
    • it just gave us data back!
    • data trumps objects every time for this kind of use case
  • console → cli → sdk → repl/scripts

repls are awesome

  • exploration of APIs
  • minimise context switching
  • instant feedback
  • data rich (or richer, at least, in some cases)

team reactions

  • “Soooo many brackets!!!”
    • I don’t see them anymore – paredit deals with it
  • “How do I iterate over this?”
    • why do you need to iterate? what are you trying to do?
  • “I want to change this value”
    • again, what are you trying to do?
  • “Wow, this is really powerful”

Offloading state

  • Immutable servers
  • pass the buck to a service someone else manages
    • application data
    • metrics
  • but when you do autoscaling, it takes some problems away but gives us other problems
    • provider defined data model
  • clojure was a great fit for managing autoscaling groups
    • all the information we needed was made visible by a single clj fn
  • in a repl, with ad-hoc tasks, having some clojure code you’ve written and evalling it is really powerful

Observing platform performance

  • knock-on/trickle down effects
  • sensu handlers limited

riemann

  • http://riemann.io
  • had used graphite
    • very data-poor
  • riemann gives you a clojure map, which is a much richer model
  • embedded REPL
  • overridable
  • extensible
  • responsive primary author

tracking our services

  • zion - system knowledge base
    • who owns which service? what do I do when alert X fires?
  • component details

infrastructure as data

  • config
  • metrics
  • logs
  • we had powerful ways to analyse this data without having to resort to glomming 500 scripts together
  • we have a single language which is superlative
  • I sit next to extremely good Scala devs and ask how they would do it
    • “I’d write a case class”

future work

  • analyse logs + metrics
  • catch and correct misconfigurations
  • scripts with upcoming fastload?
    • if clojure fastload is fast enough that we don’t have to worry about startup time, could it replace some of our python scripts?
  • cyanite to replace graphite
    • cassandra/clojure
  • “lisp isn’t a language, it’s a building material”
    • Alan Kay

clojure summary

  • pros
    • core data structures
    • data manipulation
    • community
      • #clojure and #ldnclj on freenode
      • people accept PRs, give real feedback
      • projects move
    • shared aesthetic

refs

  • martin fowler posts above
  • mcohen01/amazonica
  • pyr/cyanite

Q: can you share more info about zion?

  • we will when it’s in decent shape
  • too coupled to our particular environment right now

Q: graphite data poor? can you elaborate, particularly with reference to storage backend?

  • how data is stored is poor
  • all you get is an arbitrarily long key name (hierarchical)
    • a timestamp
    • a single numerical value
  • with Riemann, you can add arbitrary tags to the events
    • persisting them – don’t have a great answer
    • looking at influxdb
    • store time-series data with a richer data model

Leonardo Borges, Taming Asynchronous Workflows with Functional Reactive Programming

  • who has used Reactive extensions?
    • do you think it’s FRP?
  • currently writing “Clojure Reactive Programming: RAW”
  • when people talk about FRP, they mean merely “inspired by FRP”

Naming is hard

what’s the difference?

  • every construct in FRP has a precise mathematical definition
  • free of side-effects
    • kind of like Haskell’s IO monad

history

  • 1997: created in haskell
  • other haskell libs
    • reactive-banana, netwire, sodium
  • FRP-inspired:
    • Rx[.NET/Java/JS], baconjs, reagi (cljs)
  • main abstractions: Behaviours and Events
    • traditionally:
type Behavior a = [Time] -> [a]
type Event a    = [Time] -> [Maybe a]
  • this talk: compositional event systems

motivating example

  • imperative code to iterate over a list
    • lots of changing state
  • functional code
    • we describe what, but not how
    • no mutating variables
    • gain reusable single-purpose functions
  • CES has similar principles
  • think of key presses as a list of keys over time
  • http://bit.ly/rxjava-github
  • http://bit.ly/rxjs-github
  • subscribe to event sources, filter/transform them
  • map behaviour to event streams
    • say, by sampling every second
  • flatMap / selectMany

network IO

  • rather than events from keyboard, mouse etc
  • in javascript: callback hell :(
  • on jvm: clojure promises don’t compose
  • promises in js are slightly better but have limitations

demo: simple polling app

  • partition/zip

quote

  • “FRP is about handling time-varying values like they were regular values”

why not core.async?

  • core.async feels like it’s a lower level of abstraction
  • it’s a great foundation for an FRP-inspired framework
  • reagi is built on top of core.async ( http://bit.ly/reagi )

bonus example: reactive API to AWS

  • retrieve list of resources from a stack
  • for each EC2 instance, get status
  • same for each RDS instance

Q: when do you jump from handling manually to observables?

  • my rule of thumb is if I need anything more than a single callback, I’ll use this (or core.async)

Q: have you used RxJava from clj? How nice is it?

  • works great, so does RxJs

Stuart Sierra, Components: Just enough structure

architecture

  • software architecture is very simple(!)
    • presentation
    • business logic
    • DB
  • actually, much more complex
    • config
    • connections to external resources
      • monitoring
      • queues
      • sessions/connections in pools
    • process state
      • thread pools
      • caches
      • schedulers

Java: structure built in

clojure: not much structure

  • clojure namespaces aren’t classes
    • they’re not instantiable
  • def creates a singleton
  • (def foo (atom ..)) creates global mutable state
  • bootstrapping
(defn start-all! []
  (database/connect!)
  (create-queues!)
  (start-thread-pool!)
  ....
  (start-web-server!))

component

  • immutable data structure (map or record)
  • public api
  • management lifecycle
  • relationships to other components
  • It’s an object (shh!)
    • not using it to represent data

State wrapper component

  • (defrecord DB [host conn] ....)
  • opaque to most consumers (by convention)

Public API

  • fns take component as an argument

Lifecycle: Constructor

  • set up initial state
  • no side effects

Lifecycle: Transitions

  • side effects happen here:
(defprotocol Lifecycle
  (start [component])
  (stop [component]))
  • start and stop return an updated version of the component
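
A self-contained sketch of a component that follows this lifecycle, without depending on the component library. The protocol is redefined locally, and the DB record only fakes its connection; new-db and the :connected-to key are assumptions made for this example.

```clojure
(defprotocol Lifecycle
  (start [component])
  (stop [component]))

(defrecord DB [host conn]
  Lifecycle
  ;; a real component would open a connection here; this one fakes it
  (start [this] (assoc this :conn {:connected-to host}))
  ;; assoc nil rather than dissoc, so the value stays a DB record
  (stop [this] (assoc this :conn nil)))

(defn new-db
  "Constructor: sets up initial state only, no side effects."
  [host]
  (->DB host nil))

(:conn (start (new-db "localhost")))        ; => {:connected-to "localhost"}
(:conn (stop (start (new-db "localhost")))) ; => nil
```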

Service provider component

  • (defrecord Email [endpoint api-key ...] ...)

Domain model?

  • traditionally intermingle data and behaviour:
public class Customer {
    private String name;
    private Address address;
    public void notify() {...}
    //...
}
  • Let data be data
  • just use a map

domain model component

  • represent aggregate operations
  • (defrecord Customers [db email])
  • db and email are other components, used by the customer component
  • interacts with them entirely through their public APIs
  • to construct a Customers instance, need to get its dependencies

system map

  • takes created but unstarted components:
(defn system [...]
  (component/system-map
   :customers (customers)
   :db (db ...)
   :email (email...)))
  • to start the system, understands dependencies and works out correct dependency order to start each component
  • then wires each component up to the correct (started) dependency
  • stopping the system is similar but in reverse dependency order
  • Before start, dependencies not filled in yet (just nil)
  • after start, fill in dependencies
  • the system is just a map
    • so if I want to inject a test stub, I can just assoc it in:
(defn test-system [...]
  (assoc (system ...)
    :email (stub-email)
    :db    (stub-db)))
  • works as long as I do it before starting any services
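
Working out a start order from declared dependencies is a topological sort. The sketch below is one simple way to do it in plain Clojure and is not claimed to be how the library does it. The deps map and start-order are hypothetical, and the graph is assumed to be acyclic.

```clojure
(def deps
  "Hypothetical dependency declaration: component key -> keys it depends on."
  {:customers [:db :email]
   :db        []
   :email     []})

(defn start-order
  "Depth-first ordering in which every component comes after its dependencies.
  Assumes the dependency graph has no cycles."
  [deps]
  (letfn [(visit [order k]
            (if (some #{k} order)
              order
              (conj (reduce visit order (get deps k)) k)))]
    (reduce visit [] (sort (keys deps)))))

(start-order deps)            ; => [:db :email :customers]
(reverse (start-order deps))  ; stop order => (:customers :email :db)
```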

DB for testing

  • fixtures to inject into database
  • mocking the db is too hard unless you use datomic 😏

var substitution & asynchrony

  • with-redefs and binding are delimited in time
    • problems if you dispatch to another thread
    • potential race conditions
    • tightly coupled to implementation
    • wrong level of granularity
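
The thread problem with binding can be reproduced in a few lines. A plain java.lang.Thread does not inherit the caller's dynamic bindings, so code dispatched to it sees the root value. *db* and db-seen-on-new-thread are names made up for this sketch.

```clojure
(def ^:dynamic *db* :real-db)

(defn db-seen-on-new-thread
  "Reads *db* from a plain java.lang.Thread and returns what that thread saw."
  []
  (let [seen (promise)]
    (doto (Thread. ^Runnable (fn [] (deliver seen *db*)))
      (.start)
      (.join))
    @seen))

(binding [*db* :stub-db]
  [*db* (db-seen-on-new-thread)])
;; => [:stub-db :real-db]
```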

Entry point: main

  • exactly one mutable global for the whole system
  • (def sys (atom nil))
    • use reset! not swap! because start and stop are side-effecting and swap! might call multiple times
    • (@samaaron ed: uses agents for this sort of thing)

Web app: static routes

  • defroutes considered harmful

renaming dependencies

  • you can merge systems
  • name common components with shared keys:
{:a/web-app ..
 :a/server ...
 :db ...
 :email ...}

{:b/web-app ..
 :b/server ...
 :db ...
 :email ...}

(merge system-a system-b)

core.async

  • components take channels as state
  • decouples components from one another
  • system creation can create the channels you want and wire them up

summary

  • advantages
    • once you’re used to the patterns of clear dependencies and boundaries, you maybe don’t even need the library anymore
    • isolation, decoupling
    • testing, refactoring
    • automatic ordering of start/stop
    • easy to swap in alternate implementations
    • everything at most one map lookup away
  • disadvantages
    • requires whole-app buy-in
      • won’t get a lot of the benefits without this
      • porting an existing system can be tedious
    • system map is too big to inspect visually
    • cannot start/stop only part of a system
      • may try to fix someday but don’t really understand how yet
  • possible future
    • “init” acquires resources but doesn’t start?
    • “close”/“stop” separation – close acquired resources and discard dependencies so they can be GC’d
    • (the “stop” method doesn’t dissoc anything)
      • dissoc stops a record being a record
      • you might want to use that state again
    • handle mutable containers for systems
      • currently, library code doesn’t care – you can use an atom or a var or whatever
      • allow individual components to start, stop, or change at runtime
      • deref container and get “current” component with latest deps
      • catch errors, mark component as “failed”
        • this is the tricky part

Philip Potter, Generative testing with clojure.test.check

Chris Ford, the hitchhiker’s guide to the Curry-Howard correspondence

  • number of papers published today by the foremost expert on the Curry-Howard correspondence…
    • 1
  • Don’t panic!
  • Gödel’s incompleteness theorem

introduction

  • a → a
    • this is a proposition in logic
    • but it’s also a type
      • the type of the identity function

our heroes

  • Haskell Curry
    • 1958: textbook on combinatory logic
      • didn’t necessarily understand how revolutionary this idea was
  • William A. Howard (1969)
    • not only does a type correspond to a proposition, but:
      • a function with a type corresponds to a proof of a proposition
    • “The formulae-as-types notion of construction” - finally published in 1980

modus ponens

  • (a → b) → a → b
    • modus ponens
    • type of apply (haskell or idris):
apply :: (a -> b) -> a -> b
apply f x = f x
  • the implementation here corresponds to a proof of modus ponens
  • apply works with any types a and b
  • modus ponens works with any propositions a and b
  • view the type “Integer” as the proposition that integers exist
    • any example – say, 65 – counts as a proof of this proposition
  • here, we use (3==) to prove that Integer -> Bool is populated
(3==) :: Integer -> Bool

apply (3==) 4
False: Bool

composition

  • (a → b) → (b → c) → (a → c)
    • type of function composition
  • length : List a -> Integer
  • (3==) : Integer -> Bool
    • comp length (3==) : List a -> Bool
    • if we accept that List a exists, we now prove that Bool exists
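
In Clojure terms, the composition above might look like the sketch below. `comp-lr` is a hypothetical helper, introduced only because `clojure.core/comp` applies its arguments right to left, whereas the `comp` in these notes applies `length` first. Clojure does not check these types, so this only illustrates the shape of the composition, not the proof.

```clojure
;; Hypothetical left-to-right composition, matching
;; (a -> b) -> (b -> c) -> (a -> c).
(defn comp-lr [f g]
  (fn [x] (g (f x))))

;; count plays the role of length : List a -> Integer,
;; #(= 3 %) plays the role of (3==) : Integer -> Bool.
(def length-is-3? (comp-lr count #(= 3 %)))

(length-is-3? [:a :b :c]) ;; => true
(length-is-3? [])         ;; => false

;; The same function via clojure.core/comp (note the reversed argument order).
(def length-is-3?* (comp #(= 3 %) count))
```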

axioms?

  • a → a
  • (a → b) → a → b
  • a → b → (a,b)
    • if I can build an a, and I can build a b, then I can build an (a,b)
  • (a,b) → b

bottom type

  • a → b
  • (a → b) → (b → a)
    • neither of these is true in general
    • the bottom type: ⊥ is guaranteed to have nothing in it
      • represents falsity in the Curry-Howard correspondence
      • represents something it’s impossible to prove, because it’s not true
AnythingGoes : Type
AnythingGoes = (a : Type) -> a

cantProveItAll : AnythingGoes -> _|_
cantProveItAll f = f _|_
  • cantProveItAll shows that AnythingGoes is uninhabited (because if it weren’t, it would imply that ⊥ was inhabited)

harmless(?)

  • types prove our program correct?
  • types only get us so far
    • can still get runtime errors if the types check out

mostly harmless.

  • are types defective?
  • haskell will crash at runtime despite an advanced type system
    • head [] isn’t defined

enter Idris

  • Edwin Brady, creator of Idris
    • (and whitespace)
  • killer feature of Idris:
    • allows you to make condescending remarks about the Haskell type system
    • although it’s really a dialect of Haskell
  • example:
data Nat = Z | S Nat

data List a = Nil | (::) a (List a)

data Vect : Nat -> Type -> Type where
  Nil  : Vect Z a
  (::) : (x : a) ->
         (xs : Vect n a) ->
         Vect (S n) a
  • in Haskell, types can be parameterized on other types
  • in Idris, they can be parameterized on values as well as types
    • Vect 2 Integer is the type of vectors which contain exactly 2 Integers
      • or rather, Vect (S (S Z)) Integer
head : Vect (S n) a -> a
head (x::_) = x

head []
Can't unify Vect 0 a
with Vect (S n) iType
  • trying to take the head of an empty vector is a compile-time error

concatenation

  • signature: Vect m a -> Vect n a -> Vect (m+n) a
  • sort: Ord a => Vect m a -> Vect m a
    • would have caught Phil’s my-sort which dropped duplicates(!)
    • didn’t manage to get this implemented in the lunch break
      • it’s not theorems for free 😉

even number family of types

data Even : Nat -> Type where
  Zero : Even Z
  Next : Even n -> Even (S (S n))

Zero : Even Z
Next (Next (Next Zero)) : Even 6
  • can now show that even numbers sum to even numbers:
add : Even m -> Even n -> Even (m + n)
add Zero y = y
add (Next x) y = Next (add x y)
  • Although we’ve really proved that:
    • there exists an operation which takes Even n and Even m and returns Even (n+m)
    • we chose add but could have chosen any other name
  • can prove that 42 is even
  • Even 3 is a valid type
    • Even 3 -> _|_
    • proof in slides

the unit type

  • () represents truth
    • you don’t need anything else to prove this
    • you can construct it without context
    • (you could use Even 42 to represent truth too)
    • so LifeTheUniverseAndEverything -> Even 42 :)

References

Q: is there a way to specify that sort’s return value is sorted?

  • @bodil thinks it’s true :)

Q: is the type-checker guaranteed to terminate?

  • it’s equivalent to the halting problem

Anna Pawlicka, Reactive data visualisations with Om

Technologies

D3 (data-driven documents)

  • to visualise data
    • table of numbers, bar chart, whatever
  • data bound to DOM
  • interactive - transformations driven by data
  • huge community
    • huge number of plugins and extensions
  • Higher level libs available
    • hide the complexity of d3
    • but if you need to tweak the underlying d3 it’s still available

leaflet.js

  • JavaScript library for interactive maps (separate from d3, though the two can be combined)
  • mapping data
    • tile layers, vector layers
  • user interaction

dimple.js

  • charting library on d3
  • bar charts

react

  • (interface components)
  • solves one problem: complex UI rendering
  • just the V of MVC
    • say no to “two-way data binding”
  • re-renders the entire UI
    • sounds like a bad idea
    • actually quite performant, due to:
  • virtual DOM
    • diffs between previous and next renders of a UI
  • less code
  • shorter update times

react lifecycle

  • IInitState →
  • IWillMount →
  • IShouldUpdate →
    • IRenderState
    • IRender
  • Om handles most of these for us (particularly IShouldUpdate)

Om

  • entire state of the UI in single piece of data
  • immutable data structures = reference equality check
    • shouldComponentUpdate() can be overridden to take advantage of this
  • snapshottable, free undo
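
A rough sketch of why immutable state makes the reference-equality check and undo cheap. This uses plain Clojure data rather than Om's API; the sample state, `history` and `undo!` are assumptions made for illustration:

```clojure
;; The whole app state as one immutable map (sample data, assumption).
(def state-v1 {:chart {:points [1 2 3]}
               :form  {:device "sensor-1"}})

;; "Updating" returns a new map; untouched branches are shared, not copied.
(def state-v2 (assoc-in state-v1 [:form :device] "sensor-2"))

;; Reference equality shows which subtrees could have changed.
(identical? (:chart state-v1) (:chart state-v2)) ;; => true  (can skip re-render)
(identical? (:form state-v1) (:form state-v2))   ;; => false (needs re-render)

;; Undo: keep earlier snapshots and step back to the previous one.
(def history (atom [state-v1 state-v2]))

(defn undo! []
  (peek (swap! history #(if (> (count %) 1) (pop %) %))))

(get-in (undo!) [:form :device]) ;; => "sensor-1"
```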

Liberator & core.async

  • component interaction
  • liberator: RESTful apis via defresource macro
  • core.async
    • js developers would freak out looking at it
    • get blocking calls without browser freezing

data sources

  • for example, local sensors
  • may want to perform some sql queries to see patterns in your data
  • may wish others to retrieve data through API (if they don’t like our chart)

Chart & API demo

  • user interacts, triggers API calls to fetch data, updates graph in real time
  • chart implementation
    • om/IInitState to construct
    • om/IRenderState
      • to update
  • device-form
    • om/IWillMount to read shared info to find API endpoint
    • om/IRenderState
  • form-row
  • chart-figure

last.fm chart

  • chart based on last.fm playlist
  • input box takes username, calls last.fm to find that user’s playlist
    • chart then shows bands from most popular downwards

interactive maps

  • input box for postcode lookup
    • uses google geocoding api to get coords
  • can click map to create marker & display coordinates
  • app-model stores map location and coordinates panel contents
  • nice use of core.async sliding-buffer 😎
  • (go (while true ...)) could be (go-loop ...) ? dunno

summary

  • fast rendering and interactivity are yours with js + cljs + om
  • immutability = efficiency
  • sane application structure
  • (philandstuff ed: this presentation is very visual, just watch the video!)

algernon, The Face of Inspiration, or how Clojure helps bring Lisp to Python

  • github: algernon
  • twitter: algernoone
  • sorry, I was in the hallway track for this 😦

Malcolm Sparks, Assembling secure clojure applications from re-usable parts

  • @malcolmsparks, juxt

warning! research, evolving ideas, alpha quality

juxt/modular

  • set of components compatible with stuartsierra/component
  • http-kit, bidi router, mustache templating

juxt/cylon

  • security components
    • login form
    • session
    • user domain
    • hashing
    • authn and authz

assertion

  • libraries are great
  • systems are complex

assertion 2

  • a meta-architecture, that can scale to hundreds of diverse projects, is useful

architecture

  • don’t want to just port Spring MVC to clojure
  • components, dependencies, protocols

components

  • reusable bits

dependencies

  • wiring of components together
  • since the system is in a var, I can do a tree-walk on the system, and show it
  • can visualise it with dagre and svg rendering, and react
  • slide deck which shows its own wiring
    • I’m So Meta, Even This Acronym

protocols

  • integration surface
  • necessary for component interchangeability
  • example: bayonet light-bulb fitting
    • can plug a light bulb into it
    • light bulb dies – replace it!
    • over time, can replace entire system by replacing parts
  • hidden couplings
    • copied code
    • database schemas and sql queries
    • URI formation & URI dispatch
    • have to change things in multiple places to effect change
  • juxt/bidi
    • dispatch and forge URIs from the same route data
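
The idea of driving both dispatch and URI formation from one piece of route data can be sketched as follows. This is not bidi's API; the route table and the names `dispatch-uri` and `forge-uri` are hypothetical, and only exact-match paths are handled:

```clojure
;; One route table, used in both directions (sample routes, assumption).
(def route-table
  {:home  "/"
   :login "/login"
   :chart "/charts/device"})

;; Dispatch: path -> handler key (nil when nothing matches).
(defn dispatch-uri [table path]
  (some (fn [[k p]] (when (= p path) k)) table))

;; Formation: handler key -> path.
(defn forge-uri [table k]
  (get table k))

(dispatch-uri route-table "/login") ;; => :login
(forge-uri route-table :login)      ;; => "/login"
(dispatch-uri route-table "/nope")  ;; => nil
```

Because both functions read the same table, changing a path in one place changes dispatch and link generation together, which removes the hidden coupling listed above.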

component example

  • constructor
    • defaults with merge
    • schema/validate
  • components are units of cohesion
    • implements multiple protocols:
      • component/Lifecycle
      • WebService
      • JavaScripts
      • TemplateData
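
A minimal sketch of such a component, assuming local stand-in protocols so that it runs without any libraries. The real `Lifecycle` protocol lives in stuartsierra/component; `Website`, `new-website`, `web-routes` and the hand-rolled port check (in place of `schema/validate`) are all hypothetical:

```clojure
;; Local stand-ins for the protocols named above (assumptions).
(defprotocol Lifecycle
  (start [this])
  (stop [this]))

(defprotocol WebService
  (web-routes [this]))

;; One record, several protocols: the component is the unit of cohesion.
(defrecord Website [port title started?]
  Lifecycle
  (start [this] (assoc this :started? true))
  (stop [this] (assoc this :started? false))
  WebService
  (web-routes [this] {"/" :index}))

(def defaults {:port 8080 :title "untitled" :started? false})

;; Constructor: merge caller options over defaults, then validate.
(defn new-website [opts]
  (let [{:keys [port] :as conf} (merge defaults opts)]
    (when-not (and (integer? port) (pos? port))
      (throw (ex-info "invalid :port" {:port port})))
    (map->Website conf)))

(def site (start (new-website {:title "demo"})))

(:port site)     ;; => 8080
(:title site)    ;; => "demo"
(:started? site) ;; => true
```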

intermission

  • maze creation in cljs
    • “drunken walk” algorithm
      • the first time you visit a space, you break down the wall
  • core.async visualisation and demo of map<

the index pattern

the interceptor pattern

  • a component is wired in between two components

shared dependency pattern

security

challenge

  • don’t want to re-implement security components
  • tried-and-tested security by default
  • flexibility of ‘roll-your-own’

example: website

  • router routing between sub-websites A and B
  • add a login form to the router, which uses:
    • user-domain
      • password-algo (eg scrypt)
      • user-store (eg cassandra)
    • session-store
    • all comes from cylon
  • add authorization component to website B (again from cylon), using:
    • authenticator
    • session-store (same dep as above)

summary

demo

  • lein new modular myapp
  • lein new modular myapp +cljs
  • lein new modular myapp +cljs +security
  • lein new modular myapp +cljs +security +devtools
  • (dev) fn
    • if your code doesn’t compile on your repl, then you just get loads of stack traces
  • secure content, rather than URI routes
    • there may be multiple routes to the content
    • restrict-handler to wrap a response in a RestrictedHandler which implements IFn to look like a fn and make it invokable
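
One way the last point could look, as a sketch only: cylon's actual `restrict-handler` signature is not given in these notes, so the fields, the `authorized?` predicate and the 401 response below are assumptions. The record implements `clojure.lang.IFn`, so it can be called like an ordinary Ring-style handler:

```clojure
;; Hypothetical record: wraps a handler plus an authorization check.
(defrecord RestrictedHandler [handler authorized?]
  clojure.lang.IFn
  (invoke [_ request]
    (if (authorized? request)
      (handler request)
      {:status 401 :body "unauthorized"}))
  (applyTo [this args]
    (clojure.lang.AFn/applyToHelper this args)))

(defn restrict-handler [handler authorized?]
  (->RestrictedHandler handler authorized?))

;; The content itself is protected, whichever route reaches it.
(def secret-page
  (restrict-handler (fn [_] {:status 200 :body "secret"})
                    (fn [request] (some? (:user request)))))

(:status (secret-page {:user "alice"})) ;; => 200
(:status (secret-page {}))              ;; => 401
```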

Q: what about hypermedia to decouple URI dispatch & formation?

  • we don’t have HATEOAS because it’s quite hard
  • want to make it easier

that’s all folks!

  • thanks for reading :)

Amazing! Link to geomlab is incorrect (but guessable)

Thanks for doing these - really helpful.

Thanks, amazing job!

Thanks for writing these down!

bowika commented Jun 27, 2014

awesome, thanks for doing this

minimal commented Jun 28, 2014

Generative schema gist: https://gist.github.com/davegolland/3bc4277fe109e7b11770

Herbert is like Schema and comes with test.check integration:
https://github.com/miner/herbert#testcheck-integration

beppu commented Jun 29, 2014

These are great notes to what looks to have been a high quality conference. Thank you for sharing your notes.

tzach commented Jun 29, 2014

Thanks for taking the time and effort putting this together.
Much appreciated

Thank you!

kasz commented Jun 29, 2014

Thank you very much. Great supplement where my own notes are lacking.

Anna's slides are here:
http://www.slideshare.net/annapawlicka/reactive-data-visualisations-with-om
and her demo is here:
https://github.com/apawlicka/om-data-vis
And thank you for taking these notes, fantastic job.

See you all next year!!! Thomas

Thanks for these! Regarding React.js re-implementation in pure ClojureScript, take a look at Tesseract: https://github.com/scottrabin/tesseract

Thanks for sharing !
