@stefanjenkner
Forked from jstaffans/clojured_2015.md
Last active August 29, 2015 14:14

Notes from ClojureD 24.1.2015

Albrecht Schmidt: "Start your engine: My Clojure Bot in the Hello World Open 2014"

  • Hello World Open: programming contest organised by Reaktor and Supercell with ~2500 teams, worldwide distribution
    • Client-server car race, cars driven with some parameters such as decelerate/accelerate, change lanes
    • Organisers provided test server with simple testing UI
  • Parameters for car are quite simple: current angle, position etc
  • Clojure works well for processing simple data structures like this - analysis, storing, examining
  • Used Incanter to plot bot data during test (throttles vs. angles)
  • numeric.expresso used to reverse-engineer physics: basically guessing what the formula might look like
  • Other option for solving physics problem: interpolate with lots of data

Code should be on Github by Monday.

My take-home points

  • Incanter is useful for visualising data flowing through a live system - easy to create simple graphs. Doesn't have to be used just for stand-alone analytics applications!
  • Loved the recorded coding in the presentation. https://twitter.com/dl1ely/status/558906547133505536

Paulus Esterhazy and Christian Betz: "Big Data Processing with Spark and Clojure"

Slides, Github

  • Many people in audience use Hadoop, only a few have used Spark
  • Spark brings big data, distributed systems and the JVM together
  • Spark value proposition:
    • Keeps stuff in memory - no I/O needed for intermediate results
    • Can work with many different data sources and data structures
  • Spark vs Hadoop:
    • Hadoop (= MapReduce + HDFS) is the de-facto standard
    • Spark replaces MapReduce, but you can still use HDFS with Spark as well
    • What's wrong with MapReduce? Performance - performs I/O after each step. Bad for tasks with many steps, e.g. machine learning
    • The main innovation of Spark is data sharing between processing steps
  • Resilient Distributed Datasets (RDDs)
    • could be on any node
    • are partitioned into blocks, e.g. 64 MB large, and distributed across nodes
    • resilient = doesn't matter if some blocks get lost e.g. due to node failure, Spark keeps track of lineage and will re-compute
  • Workflow:
    • Get RDD from datasource, e.g. HDFS, JDBC query, Cassandra, text file
    • "Transform" and get new RDDs
    • Perform an "action" and get an end result
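The transform/action split and the lineage idea can be sketched with a toy class. This is plain Python, not the Spark or sparkling API; `ToyRDD` and `load_numbers` are made-up names for illustration. The point is only that transformations build up a recipe and nothing runs until an action asks for a result, so a lost result can always be re-computed from the source.

```python
# Toy model of an RDD: it stores no data, only the recipe (lineage) to compute it.
class ToyRDD:
    def __init__(self, compute):
        self._compute = compute  # zero-argument function returning an iterator

    @classmethod
    def from_source(cls, load):
        # The source is only read when an action runs.
        return cls(lambda: iter(load()))

    def map(self, f):
        # Transformation: returns a new ToyRDD, computes nothing yet.
        return ToyRDD(lambda: (f(x) for x in self._compute()))

    def filter(self, pred):
        # Transformation: returns a new ToyRDD, computes nothing yet.
        return ToyRDD(lambda: (x for x in self._compute() if pred(x)))

    def collect(self):
        # Action: walks the whole lineage and produces an end result.
        return list(self._compute())


loads = []  # records every time the (hypothetical) data source is read


def load_numbers():
    loads.append("read")
    return [1, 2, 3, 4, 5, 6]


evens_squared = (ToyRDD.from_source(load_numbers)
                 .filter(lambda n: n % 2 == 0)
                 .map(lambda n: n * n))

reads_before_action = len(loads)   # transformations alone read nothing
first = evens_squared.collect()    # the action triggers the computation
second = evens_squared.collect()   # re-computed from lineage, source read again
```

Real Spark additionally partitions the data and can cache intermediate RDDs in memory; the sketch leaves both out.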

Now for some live coding - or not!

  • Example: compare access log parsing with plain Clojure and sparkling.
  • Spark code is easier to unit test than Hadoop equivalents
  • Spark solution uses Spark's built-in support for data structures with key-value pairs
    • e.g. reduce-by-key only works with RDD in tuple format
    • important for performance
    • makes code shorter than the pure Clojure version
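To make the key-value point concrete, here is a rough single-machine sketch of what a reduce-by-key step does with access log lines. It is plain Python rather than Spark; the `reduce_by_key` helper, the log format and the sample lines are all assumptions made up for the example.

```python
from collections import defaultdict
from functools import reduce


def reduce_by_key(pairs, f):
    """Combine all values that share a key using the binary function f."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return {key: reduce(f, values) for key, values in grouped.items()}


# Made-up access log lines; the last field is the HTTP status code.
log_lines = [
    '10.0.0.1 "GET /index.html" 200',
    '10.0.0.2 "GET /missing" 404',
    '10.0.0.1 "GET /about.html" 200',
    '10.0.0.3 "GET /index.html" 200',
]


def to_pair(line):
    # The input must be in (key, value) form before reduce-by-key can be applied.
    status = line.rsplit(" ", 1)[-1]
    return (status, 1)


counts = reduce_by_key(map(to_pair, log_lines), lambda a, b: a + b)
```

In Spark the same shape of computation runs per partition, which is where keeping the tuple format (and with it the partitioner) matters for performance.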

Spark and Clojure

  • Clojure is a great fit for Spark's already FP-inspired approach
  • Attaching to nREPL on running node is very handy - GorillaREPL used for Notebook-style interface

Tips

sparkling vs flambo

  • flambo promises to be simple, but breaks the rule of "as simple as possible but not simpler" by converting Scala Tuple2 to Clojure vector and back when necessary -> loses Partitioner information and becomes slow

My take-home points

  • Spark feels like a more modern approach compared to Hadoop
  • I don't really have big data use cases at the moment, but Spark offers a good scaling path for growth scenarios
  • Should check out GorillaREPL for introspection of live systems

Hugo Firth: "Continuous Delivery in Clojure"

What we're talking about: the ability to automatically deploy tested code to production at any time

Testing

  • Test pyramid in Clojure:
    • UI/top level: clj-webdriver (Selenium), kerodon (ring HTML), peridot (any ring app) -> impure, side effects
    • Integration: impure, side effects
    • Component: pure kind of integration tests
    • Unit: clojure.test, midje, with-redefs for simple harnesses. pure
  • Contract tests: useful in today's mobile-centric world where we can't influence update rates of mobile apps
    • Can act as a stub for clients
    • pact, pact-jvm (doesn't have clojure wrapper it seems), janus (not maintained)
  • Component/integration tests can be implemented using the same tools as unit tests, the difference is mostly conceptual. Can e.g. be separated into namespaces (acceptance, integration, core, ...). Midje has good support for wildcards.
  • Open question: where do generative tests fit in the test pyramid?

Deployment

Heroku

  • Does magic behind the scenes, not very transparent but includes lein uberjar
  • -main function, :main key in project.clj, Procfile all needed
  • Breaks the "Only compile binaries once" principle

Own infrastructure

  • Aim for deploying a single JAR
  • Will probably include shell scripting, Ansible or something

Configuration

  • One way: create your own .clj file with configuration values. Read it in, merge with default options
  • (My note: Component, leaven, duct are better options)
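The "read it in, merge with defaults" approach is small enough to sketch. The example below uses Python and JSON instead of a .clj file, purely for illustration; the option names and values are invented.

```python
import json

# Default options shipped with the application (hypothetical values).
DEFAULTS = {"port": 8080, "db-url": "jdbc:h2:mem:dev", "log-level": "info"}


def load_config(text):
    """Parse environment-specific overrides and merge them over the defaults."""
    overrides = json.loads(text)
    # Shallow merge: keys present in the overrides win, everything else keeps its default.
    return {**DEFAULTS, **overrides}


# In a real deployment this text would be read from a file next to the JAR.
production = load_config('{"port": 80, "log-level": "warn"}')
```

A shallow merge like this does not handle nested option maps, which is one reason a dedicated library can be the better choice.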

My take-home points

  • Think about where in the test pyramid your tests fit
  • You can get pretty far with standard Clojure testing tools

Konrad Szydlo: "The power of the Datomic database"

  • Main features of Datomic:
    • No update-in-place, only assertions and retractions (where retraction != removal)
    • Immutable data, database as a value
    • ACID
    • Datalog query language
    • Doesn't do storage by itself, instead uses Riak, RDBMS, DynamoDB ...
  • Datom!
    • Entity, Attribute, Value, Tx, Operation (assertion or retraction)
    • 1234, name, John, "at work", true
  • Time: easy access to past and future states
  • Indexes: EAVT, AEVT, AVET, VAET
  • Value of DB passed to queries and functions
  • Datalog:
    • declarative like SQL, logical language, pattern matching
    • :find, (:in,) :where
    • most logic will be in :where
    • order of statements usually doesn't matter - as a rule of thumb, provide specific datoms first
    • Besides Datalog queries there is the Entity API, and also a new, higher-level Pull API
    • Java methods can be used too, e.g. .startsWith
  • Powerful DB functions in Clojure, use with invoke
  • Filters restrict queries to a part of the database, e.g. as-of (DB value at a certain point in time), since (auditing)
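A toy model helps to see how assertions, retractions and as-of fit together. This is a Python sketch and not Datomic's API: the `Datom` tuple, the integer transaction ids and the `as_of` and `entity` helpers are assumptions for illustration, and every attribute is treated as single-valued.

```python
from collections import namedtuple

# A datom: entity, attribute, value, transaction, and whether it was added or retracted.
Datom = namedtuple("Datom", "e a v tx added")

# Append-only log: nothing is updated in place.
log = [
    Datom(1234, "name", "John", 1, True),
    Datom(1234, "name", "John", 2, False),    # retraction: the old fact stays in the log
    Datom(1234, "name", "Johnny", 2, True),
    Datom(1234, "email", "j@example.com", 3, True),
]


def as_of(datoms, tx):
    """The database value as it was after transaction tx."""
    return [d for d in datoms if d.tx <= tx]


def entity(datoms, e):
    """Replay assertions and retractions to get the attributes of entity e."""
    facts = set()
    for d in sorted(datoms, key=lambda d: d.tx):  # stable sort keeps log order within a tx
        if d.e != e:
            continue
        if d.added:
            facts.add((d.a, d.v))
        else:
            facts.discard((d.a, d.v))
    return dict(facts)


then = entity(as_of(log, 1), 1234)
now = entity(log, 1234)
```

Because `as_of` returns an ordinary value, the same `entity` function works unchanged on past and present states, which is the "database as a value" idea in miniature.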

Transactor

  • Single instance (one standby)
  • Handles writes
  • Ensures ACID

Use cases for Datomic

  • Most useful when going back in time and immutability are important capabilities
  • Not suitable as replacement for e.g. ElasticSearch/Solr full text search, where query is done over lots of data

My take-home points

  • Datomic has a really powerful query language - the fiddly SQL queries that I sometimes use, for example to generate reports, can probably be replaced by a few lines of Datalog
  • So much can be done using validators, transactions, filters.. logic moves to the DB layer
  • Wonder what the biggest win would be - the immutability or the query capabilities

Andreas 'Kungi' Klein: "Frameworkless Web Development in Clojure"

Slides

  • Web development in Clojure is liberating - no dominant frameworks, many ways of doing things
  • ring is the de-facto standard
    • similarities in other languages: Rack, WSGI
    • called "ring" b/c of client -> request -> response -> client feedback loop
  • Other functions handled by small libraries: clout, compojure, hiccup, enlive, friend ..
  • Everything works well together, everyone uses plain Clojure data structures
  • For more complex applications: Component is nice. But start with it! Very hard to retrofit
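The ring model is simple enough to mimic in a few lines: a handler is a function from a request map to a response map, and middleware is a function that wraps a handler. The sketch below uses Python dicts in place of Clojure maps; `handler`, `wrap_server_header` and the header value are made up for the example.

```python
def handler(request):
    """A handler takes a request map and returns a response map."""
    if request["uri"] == "/hello":
        return {"status": 200,
                "headers": {"Content-Type": "text/plain"},
                "body": "Hello"}
    return {"status": 404, "headers": {}, "body": "Not found"}


def wrap_server_header(inner, value):
    """Middleware: takes a handler and returns a new handler that adds a header."""
    def wrapped(request):
        response = inner(request)
        headers = {**response["headers"], "Server": value}
        return {**response, "headers": headers}
    return wrapped


app = wrap_server_header(handler, "toy")
ok = app({"request-method": "get", "uri": "/hello"})
missing = app({"request-method": "get", "uri": "/nope"})
```

Since everything is plain data and plain functions, routing, templating and authentication libraries can be combined freely, and handlers can be tested without starting a server.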

Tobias Bayer: "Clojure Testing with Midje"

Slides

  • Rationale: lisp inside-out syntax not nice for reading example-based tests - left-to-right is nicer
  • Useful add-ons: simple stubbing and mocking, auto-test feature ...
  • future-fact: like JUnit @Ignore annotation, for TODOs
  • Lots of checkers that can be used on RHS instead of values - check the Midje wiki
    • anything/irrelevant: useful for cases when return value is not interesting b/c we are checking for side effects

Stubbing

  • provided: does essentially the same thing as with-redefs

Mocking

  • Testing for interactions - "called 5 times" scenario
  • Also using provided
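For readers who have not used Midje, the difference between stubbing (replace a dependency to control its return value) and mocking (also check how often it was called) can be shown with Python's standard `unittest.mock`. This is an analogy, not Midje syntax; `PriceService` and `total` are hypothetical names.

```python
from unittest import mock


class PriceService:
    def fetch_price(self, item):
        # Stands in for a slow or side-effecting call we do not want in a unit test.
        raise RuntimeError("would hit the network")


def total(service, items):
    return sum(service.fetch_price(item) for item in items)


service = PriceService()

# Stubbing: inside the block fetch_price returns a canned value.
with mock.patch.object(service, "fetch_price", return_value=10) as stub:
    result = total(service, ["a", "b", "c"])
    # Mocking: additionally verify the interaction ("called 3 times").
    call_count = stub.call_count

# Outside the block the original behaviour is restored.
try:
    service.fetch_price("a")
    restored = False
except RuntimeError:
    restored = True
```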

Integration tests - how to separate them

  • Can tag tests with e.g. :slow metadata, run with lein midje :filter -slow
    • Can run into cases with nested facts
    • Separate namespaces another option

My take-home points

  • Testing is a good organisational inroad for Clojure and Midje is a great testing tool
  • Second talk with failed live coding already. Albrecht's approach of having recorded coding paid off!

Jan Stepien: "Generative Testing: Properties, State and Beyond"

Slides (generated with test.check generators!)

  • Converting example-based tests: think about what properties must hold
  • The point is to use only "generators" and "properties" for your test
  • Running loads of iterations is great for finding edge cases
    • Re-use seed to get the same input sequence

Real world examples - www.stylefruits.de

  • Complicated routing - /hosen/lee/farbe-hellblau/...
  • Simplify by mapping paths to descriptors and back. Property: (comp descriptor->path path->descriptor) is the identity function
  • Generator: all valid descriptors
  • Found lots of bugs
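A round-trip property of this kind can be sketched without test.check. The Python below is only an illustration: the descriptor fields, the vocabulary lists and both conversion functions are invented, and a fixed seed plays the role of test.check's reusable seed. It checks that going from a generated descriptor to a path and back yields the original descriptor.

```python
import random

# Hypothetical vocabulary; None means "not specified".
CATEGORIES = ["hosen", "hemden"]
BRANDS = ["lee", "acme", None]
COLOURS = ["hellblau", "rot", None]


def descriptor_to_path(d):
    parts = [d["category"]]
    if d["brand"]:
        parts.append(d["brand"])
    if d["colour"]:
        parts.append("farbe-" + d["colour"])
    return "/" + "/".join(parts)


def path_to_descriptor(path):
    parts = path.strip("/").split("/")
    d = {"category": parts[0], "brand": None, "colour": None}
    for part in parts[1:]:
        if part.startswith("farbe-"):
            d["colour"] = part[len("farbe-"):]
        else:
            d["brand"] = part
    return d


def gen_descriptor(rng):
    """Generator: produces only valid descriptors."""
    return {"category": rng.choice(CATEGORIES),
            "brand": rng.choice(BRANDS),
            "colour": rng.choice(COLOURS)}


def check_roundtrip(seed, iterations=200):
    """Return the first counterexample found, or None if the property held."""
    rng = random.Random(seed)  # same seed gives the same input sequence
    for _ in range(iterations):
        d = gen_descriptor(rng)
        if path_to_descriptor(descriptor_to_path(d)) != d:
            return d
    return None


counterexample = check_roundtrip(seed=42)
```

Unlike this sketch, test.check also shrinks a failing input to a minimal counterexample.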

Stateful things

  • Paper: "Testing Telecoms Software with Quviq QuickCheck"
  • jstepien/states
  • michalmarczyk/ctries.clj
  • Integration tests: could have generators that seed databases and create events as well

Resources

  • QuickCheck paper (Claessen, Hughes)
  • "Testing the hard stuff and staying sane", John Hughes
  • "Generative testing with clojure.test.check", Philip Potter

My take-home points

  • Use sample for testing complicated generators
  • Can use exploratory/generative testing for finding edge cases and implement unit tests for these - very valid use case for generative testing!

Jelle Akkerman: "Clojurescript and user interfaces: Simplicity yields possibilities"

  • 2012: painful experience with AngularJS building a soulseek clone
    • State handling main pain point
  • Since 2014: success with ClojureScript and Om
    • Besides state management lots of other goodies: hotswap (figwheel), time travelling ...
    • Compiler is not a bad thing
    • core.async - "if you do not understand yet, you are probably thinking too hard - it's simple"
  • People have done quite cool samples for others to learn from, e.g. http://shaunlebron.com/t3tr0s-slides
  • bhauman/devcards
  • Om and Chestnut (lein) vs Reagent and Tenzing (boot)

Talks I didn't see but that were probably great

Falko Riemenschneider: "JavaFX GUI architecture with Clojure core.async"

Slides

Martin Klepsch: "Boot - Build Tooling For Clojure"

Slides, template

Meikel Brandmeyer: "Hay - a concatenative language"

Github

Michael Klishin (via Video): "Scalable Way of Doing Open Source: The ClojureWerkz Story"

Slides


Missed the lightning talks unfortunately.
