@bruntonspall
Last active January 3, 2020 11:52
Pycon UK 2014

Keynote

Its successes

10 years ago, Python was still an esoteric language. Paul Graham's "Python paradox": the best language to learn is one that you don't learn for a job, but for the enjoyment of it. Today Python is one of the biggest languages.

CTO magazine says Python is a solid enterprise choice.

Increasingly popular in various places: animation pipelines, etc.

Python is the most common teaching language at universities in the US, and probably similarly common in education here in the UK.

Python has its problems

Threats

Referencing the "why Django sucks" tradition from DjangoCon.

What is it that others are doing well that Python isn't?

Three primary competitors: Java, JavaScript (Node.js), Go.

Why should we care? We don't need to beat other languages, we need to know our strengths. But we worry about people moving from Python to another language, because it shrinks the ecosystem. You can't develop in a vacuum. Sponsors want to sell to us, because there is an underpinning.

These three languages have dead code. Code that will never be executed. Re-use from Java into Python, or Python into Go, is hard. With C/C++/Rust there is a progression: I can interoperate from a Rust program into a C library.

There is a large disconnect between the languages; they force you into their ecosystem.

Java

Venerable, and unusual in being an ecosystem in its own right. Traditionally an enterprise tool. On the "hackerness" scale (when hacking a program together, which language will people use?), Python used to be almost at the top, but Java has been moving up for some time. The JVM as a platform is impressive: lots of corporate investment in making it fast, deployable and correct. Consistent runtime; it used to be "write once, debug everywhere", but it now actually works everywhere. The Apache Software Foundation also seemed to favour Java, which builds a more feature-filled ecosystem. Plus Android, running the Dalvik VM, which supports the Java ecosystem.

Javascript / Node.JS

JavaScript doesn't really have a single community; it is part of the web, but also of things like Node. JavaScript had ubiquity: it could run in any browser. Gmail was the first major JavaScript application, and made JavaScript more acceptable. Chrome and the browser wars made JavaScript VMs significantly faster, and so more useful. Browser requirements also forced a move towards asynchronous programming, and this became a strength. Python has difficulty with this concept. Then the rise of JSON (Note: he's wrong on some of this. JSON is not about eval'd JavaScript, and not about the JavaScript sandbox). XML was big and heavy, JSON was lightweight. Swapping stacks between frontend and backend, between JavaScript and Python, causes mental friction. Hence the rise of Node.js.

Go

Why did it take off so quickly? "Because Rob Pike". (I think this is bobbins too; most developers don't give a flying fancy who designed the language. Ask developers to name language writers?) It is designed for solving some very specific problems. It's a small language, with some good abstractions: goroutines and channels as language concepts, of course. It wasn't just C++ with extra syntax or removed features. It's more like Python; it feels designed and purposeful. Also gofmt (see also PEP-8): it gives rise to readability, but people will always fight.

Really got deployment right, since it's a statically compiled tool. Everything in one place, just drop it.

HTTP/2 - Because the web isn't hard enough

Speaker: Requests core contributor, urllib3 contributor as well. IETF HTTPBis WG. Wrote an early implementation of HTTP/2.

The web is HTTP; it's almost everywhere. It runs over TCP/IP, but most people don't care about the lower levels. HTTP is imperfect, with a couple of pretty substantial flaws. It's quite complicated: it used to be RFC 2616, and now spans RFCs 7230-7235, 305 pages, some portions of which are optional.

It looks quite simple, but toy implementations are problematic because you don't know which optional features they implemented. For example, chunked encoding.

It also uses TCP quite badly, for example no connection re-use: each connection must go through renegotiation and latency controls. Latency is also a consequence of these connections.

HTTP is holding us back, making us invent crazy features like image spriting, image inlining, css sharding. Caching gets more difficult, since you now invalidate larger chunks of the cache.

What's the solution? Let's start a new protocol: the HTTPBis Working Group. IETF standards are done by rough consensus and working code. HTTPBis - the next HTTP, started in 2012. Work started from SPDY as the basis for the new version of HTTP.

If HTTPBis didn't complete soon, people would just use SPDY.

Big changes:

  • Binary protocol, not a text protocol
    • Engineers like text protocols, I can debug it with telnet.
    • But text protocols are unsafe, difficult to parse, inefficient etc.
    • Simple mistakes in the parser lead to very weird bugs that are hard to find and fix.
    • Buffer allocation is easier for binary protocols, since sizes are more standard.
    • However, HTTP/2 is a binary framing of HTTP/1, so there are things we can't fix.
  • Adding efficiency gains
    • Multiplexing - you should use a single connection and multiplexing to avoid the current issues of opening multiple connections
    • Streams - a request/response pair is a stream. The priority and flow control enable low-power devices to prioritise responses that might only be returnable one at a time
    • Flow control - please stop sending more, or send more. Again for embedded devices
    • Header compression - headers from browsers are pretty large, so compression is added. This isn't just GZIP, since apparently compressing the headers that way could open a vulnerability in TLS. Instead they wrote HPACK, which is HTTP-specific header compression. This is more changeset based, so you can now say: remember this header, and reuse it again.
      • Also includes privacy based tools, so you can declare a header as "never compress" which means you can't use it as an attack vector.
    • Early stream termination, until now, only a TCP RST packet can stop the stream. Now inside the multiplex system, you can terminate a stream, without terminating the connection.
    • Server Push, servers are now allowed to pre-emptively send responses to requests you didn't yet make. Can be aborted by the browser if needed.
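
To make the "binary framing" point concrete, here is a minimal sketch of packing and parsing a fixed-size frame header. The 9-byte layout used here (24-bit length, 8-bit type, 8-bit flags, 31-bit stream id) is an assumption taken from the final HTTP/2 specification (RFC 7540) rather than from the talk, and the drafts current in 2014 may have differed. The function names are hypothetical.

```python
import struct

FRAME_HEADER_SIZE = 9  # assumption: RFC 7540 layout, not a 2014 draft


def pack_frame_header(length, frame_type, flags, stream_id):
    """Build a 9-byte header: 24-bit length, type, flags, 31-bit stream id."""
    if not 0 <= length < 2 ** 24:
        raise ValueError("length must fit in 24 bits")
    # struct has no 3-byte integer, so the length is split into 1 + 2 bytes.
    return struct.pack(
        ">BHBBI",
        length >> 16,
        length & 0xFFFF,
        frame_type,
        flags,
        stream_id & 0x7FFFFFFF,  # the top bit is reserved
    )


def parse_frame_header(data):
    """Parse the fixed-size header; no scanning for delimiters is needed."""
    if len(data) < FRAME_HEADER_SIZE:
        raise ValueError("need at least 9 bytes")
    hi, lo, frame_type, flags, stream_id = struct.unpack(
        ">BHBBI", data[:FRAME_HEADER_SIZE]
    )
    return {
        "length": (hi << 16) | lo,
        "type": frame_type,
        "flags": flags,
        "stream_id": stream_id & 0x7FFFFFFF,
    }
```

Because the header has a known size and carries the payload length, a reader knows exactly how many bytes to allocate for each frame, which is the buffer-allocation advantage noted above.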

How is it going to work?

It's not perfect; it was designed very fast. It was based on SPDY, which was designed to solve browser problems that Google had. Non-browser clients have different problems that HTTP/2 won't solve. There's a ton of state to maintain, because of streams inside connections etc. This means debugging it is really hard. tcpdump isn't going to help as much. You won't be able to read headers unless you've read the entire connection. Certain systems, such as Kerberos, send giant data blocks in the headers, which causes issues. It is inherently concurrent, which means synchronous libraries are much harder to write now.

Play with it

  • hyper - Python, client
  • nghttp2 - C, client and server
  • Firefox - Browser
  • curl/libcurl - client
  • wireshark

Existing implementations: Twitter.com uses it, Firefox uses it, Wireshark can view it.

http2-explained.ietf.org

@lukasa @lukasaoz

Rachel Sanders @trustrachel LinkedIn

  • Transitioning from a single company just in Mountain View to being a more multinational company. It's a bigger shift than you might think. Bigger companies work differently. Small companies work ad hoc: you just go talk to Tanya when it's a database problem. That breaks down as soon as you can't just walk down the hallway to talk to someone. You replace people with systems.

This sounds bad; we don't think systems sound good. Big companies do mediocre work, they stagnate.

Can you keep culture, do these systems work? Depends?

It depends on how you build those systems, are they fair and equitable.

It's just engineering: building large, scalable, fault-tolerant systems of people. That's a subject we don't do well as engineers.

The biggest problem in computing is "people".

What can we learn from non-engineers?

Research: what makes people happy at work? Participants kept a daily journal of what happened that day and how they felt. These were analysed and scored. When people had a good day at work, 75% of the time the thing that made it good was "getting work done".

When managers were asked, only 5% thought making progress was the main motivator.

Engineers are makers; people use our work to do their work. We care about performance, complexity of software, big-O notation etc. We don't seem to care about human performance metrics: "how long does it take to get up to speed on our code base? How does that compare?".

Engineering and user interface are treated as separate: let the art majors deal with the people stuff.

Don Norman - The Design of Everyday Things. There's no magic to design; everybody can do it. You can get better at it via one rule. The rule is: nobody wants to use your stuff, they want to have used it.

When someone picks up a new thing. They ask themselves two questions.

  • What is this?
  • What can it do?

Make the answers as obvious as possible.

"If you have to refer to the documentation every time you use the module, then use a new module".

So how do you change culture

Patrick Hudson, Delft University. The evolution of safety culture in oil/gas industry.

Culture change starts:

  • Shit happens. Personal safety is important and we don't like unsafe practices, but you know, bad stuff happens.
  • Approach 1 - Science - make the tools better. But you are not allowed to "science" your employees.
  • People make mistakes, don't follow the rules, and push the envelope, all because employees are pushed.
  • Approach 2 - Bureaucracy - make a manual for everything that might happen.
    • These are really only paper inventions that are only good for covering your ass.
  • Approach 3 - Acceptance - people will screw up. You need leadership, not just management.
  • Crowdsourced to the company: each team is responsible for its own accidents and for managing them.

Culture can be changed. If even the oil and gas industry can do it, then we can do it.

We are all leaders, often of teams and people. We're paid to have opinions.

Make stuff that's loved, not tolerated.

Marc-André Lemburg

Calls and Loops

  • Python function calls are slow, C functions are fast

    • operator.add is faster than def f(a,b): return a+b
    • Python operations are even faster, specifically integer operations.
    • Method calls are even slower (due to the extra lookup)
  • Loops: for in range()

    • for loop is slowest
    • map is next fastest
    • list comprehensions are the fastest
    • If the function is a C function, maps get more inefficient
  • In Python 2, range returns a list; xrange returns a lazy, generator-like object

    • range(1000) uses 32k of memory
    • xrange(1000) uses 24 bytes!
    • in Python 3, range behaves like xrange by default
  • itertools

    • Tends to be much faster, use this if possible
    • Custom libraries, like PIL, NumPy etc.
  • Sorting

    • l.sort vs sorted
    • l.sort is inplace, destructive
    • sorted(l) does a copy.
    • both about as fast.
    • Sorting by 2nd element: decorate (put the sort item first), then sort, then undecorate
    • Alternatively, use l.sort(key=lambda x:x[1])
    • but twice as fast is l.sort(key=operator.itemgetter(1))
  • Join strings

    • You can use +, but ''.join() is much faster
    • String formatting is slower, so don't use it
  • Dictionary lookups

    • integer keys are much faster than string keys
    • Interned strings are faster, but an interned string will never be garbage collected
  • Lookups

    • Global is ever so slightly slower than local lookups
    • You can localise the global variable as a keyword argument default.
    • Inside a tight loop, you can declare a local variable at the beginning of the function as a cache
    • Attribute lookups: instance and class attribute lookups are about the same speed.
    • Old-style classes are slightly faster than new-style classes, but not by a lot
    • __slots__ = optimised instance attributes; saves memory, not hugely faster.
  • Exceptions

    • Exceptions are very slow, so use them only in exceptional circumstances
  • General advice

    • Use the right algorithm and datastructure
    • Refactor and reorganise the code where possible.
    • Move loops to C; profile; repeat
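
A few of the comparisons above can be checked with timeit. This is a minimal sketch, not the speaker's benchmark code (that lives in the repository linked below); the helper names and data sizes are made up for illustration, and the talk's numbers came from Python 2, so the ratios on a current interpreter may well differ.

```python
import operator
import timeit

# Sample data: pairs to be sorted by their second element.
pairs = [(i, (i * 7919) % 1000) for i in range(1000)]
words = [str(i) for i in range(1000)]

# Both key functions give the same ordering; itemgetter avoids a Python-level call.
by_lambda = sorted(pairs, key=lambda x: x[1])
by_itemgetter = sorted(pairs, key=operator.itemgetter(1))


def concat_plus(items):
    """String building with repeated +."""
    out = ""
    for item in items:
        out += item
    return out


def concat_join(items):
    """String building with a single join call."""
    return "".join(items)


def squares_loop(n):
    """Explicit for loop with append."""
    result = []
    for i in range(n):
        result.append(i * i)
    return result


def squares_comprehension(n):
    """The same work as a list comprehension."""
    return [i * i for i in range(n)]


if __name__ == "__main__":
    statements = [
        "sorted(pairs, key=lambda x: x[1])",
        "sorted(pairs, key=operator.itemgetter(1))",
        "concat_plus(words)",
        "concat_join(words)",
        "squares_loop(1000)",
        "squares_comprehension(1000)",
    ]
    for stmt in statements:
        seconds = timeit.timeit(stmt, globals=globals(), number=200)
        print("%-45s %.4fs" % (stmt, seconds))
```

Measuring on your own interpreter and data is the point: the advice in the list is a starting hypothesis, and the profile-and-repeat step decides what is worth changing.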

@egenix - http://github.com/egenix/when-performance-matters for sample code

Try a little randomness Larry Hastings larry at hastings dot org

What are random numbers

A number produced on request, from a specific range, that cannot be predicted in advance. Non-deterministic.

Usages

Simulations - Monte Carlo, weather, physics, economics. Games, like gambling, and video games in general.

How do I get one?

Best way: a random physical process. Computers suck at placing bets on horses, so we need noisy signals. Download high quality numbers from some websites (presuming you trust them). Listening to AM radio and weather static, watching caesium decay, watching lava lamps. These days, new CPUs have a random number generator.

Modern operating systems use something called an entropy pool. The OS itself pays attention to random outside-world events and mixes them into the pool. Things like network packets, key presses etc.

More often, we talk about pseudo-random numbers

Pseudo-random numbers are far more common. They are numbers produced on request, from a specific range, that mostly cannot be predicted in advance. Deterministically produced.

Why use one?

Pseudo-random numbers are plentiful, so we can get as many as we want. Pseudo-random numbers are repeatable, so I can generate the exact same stream again.
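
The repeatability point can be shown with the standard library's random module. A minimal sketch; the function name and the dice-roll range are just for illustration.

```python
import random


def sample_stream(seed, count=5):
    """Return `count` pseudo-random dice rolls for a given seed."""
    # A private Random instance keeps this independent of the module-level
    # generator, so other code calling random.random() cannot disturb it.
    rng = random.Random(seed)
    return [rng.randint(1, 6) for _ in range(count)]


if __name__ == "__main__":
    # The same seed reproduces the same stream on the same Python version.
    print(sample_stream(2014))
    print(sample_stream(2014))
```

This is what makes seeded generators useful for simulations and tests: a failing run can be replayed exactly by reusing its seed.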

How do we get one

Using math, we use a lot of it. PRNG concepts

  • The seed - the internal state, might be one number, might be many more
  • The period - a PRNG will always end up repeating at some point, so we need to know how long before it loops

Good PRNGs have long periods and output that is indistinguishable from noise. Mersenne Twister, from 1997: period 2^19937-1, state is 19937 bits, held in 624 32-bit words. Used as the default PRNG in Python from 2.3 onwards. Very common, used in a lot of software.

Cryptographically secure PRNG

Two tests:

  • Next-bit test: even with as much output from the PRNG as you want, you cannot easily calculate the next bit
  • No state compromise extension: you cannot take some output, work out the state, and run it backwards

Bad PRNG

Bad if they have a short period. Bad if they have a hidden pattern that lets you work out which PRNG is in use. People use this to exploit AI behaviour in games.

Linear congruential generator

Take (seed * M + C) % wordsize and feed the result back in. Fast, small code and small data.

RANDU algorithm: (seed * 65539) % 2^31 - really that bad. Visual C rand() is still a linear congruential algorithm: (seed * 214013 + 2531011) % 2^32
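
The recurrence above is small enough to sketch directly. This is an illustration, not code from the talk; the generator names are hypothetical. The check at the end relies on a known property of RANDU: because 65539 = 2^16 + 3, every output satisfies x[k+2] = 6*x[k+1] - 9*x[k] (mod 2^31), which is the hidden pattern that puts consecutive triples on a small number of planes.

```python
from itertools import islice


def lcg(seed, multiplier, increment, modulus):
    """Linear congruential generator: feed each output back in as the state."""
    state = seed
    while True:
        state = (state * multiplier + increment) % modulus
        yield state


def randu(seed):
    """RANDU: multiplier 65539, no increment, modulus 2**31."""
    return lcg(seed, 65539, 0, 2 ** 31)


def follows_randu_pattern(values):
    """True if every value is determined by the two values before it."""
    return all(
        values[k + 2] == (6 * values[k + 1] - 9 * values[k]) % 2 ** 31
        for k in range(len(values) - 2)
    )


if __name__ == "__main__":
    outputs = list(islice(randu(1), 1000))
    print(outputs[:5])
    print(follows_randu_pattern(outputs))
```

A generator whose next output is a fixed linear combination of the previous two is exactly what the next-bit test is meant to rule out.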

Visualisation of the random numbers helps us see patterns. Good output looks like TV static (Poisson distribution). A 3D visualisation of RANDU looks very bad: the points fall on 15 hyperplanes, so numbers are never generated in much of the space.

Docker that python app

What is it?

It's being adopted and used now. Including Ebay, Google, Youtube, Spotify. And some finance companies talking about using it in production now.

Google trends shows that docker is even better than sliced bread

So therefore Docker must have been around for ages right? (Actually only 18 months old)

What does it do?

"Docker lets you run isolated applications on a single Linux server"

1 app = 1 container

Each DB, web app, message queue etc. is isolated to a single container.

The host operating system can be any, providing it supports LXC (the underlying tech)

That sounds like virtual machines

  • Containers don't virtualise hardware
  • They have the same performance as the host - isolation not emulation
  • Minimises space consumption - VM's everything is isolated, but in containers, much of it is shared

Q: How does the guest share the host OS if they are different?
A: The containers only store the diffs, based on AuFS, so only the bits of the guest OS that differ from the host OS are being run.

Creating a container is very fast - practically instantaneous

Summary: Fast, reduced space, memory is conserved

Use cases

Development

Can run multiple shared services really easily. Can test real-world production architecture easily. You can test infrastructure as code on the Docker images.

Docker with Jenkins - is popular use case.
Tests can fire up containers with app very fast, enabling good integration tests, docker makes it much easier

Demo time

Two Flask sample applications.

Dockerfile

  • from python27
  • some run steps to apply apt-get installs and updates, for the base
  • add . /srv/app
  • run flask --app=flaskr initdb
  • expose 5000
  • CMD["flask", "--app=flaskr", "run", "--host=0.0.0.0"]

We could add anything to the apt-get bits, say salt stack, ansible or something.

Docker command line tool, but there's a bunch of tools over the top to simplify.

  • docker ps to list containers
  • docker build -t flaskr . <- flaskr is the name, . tells it to use the Dockerfile from the local directory

This builds and spins up very fast

[Demo as expected, run the flaskr python app, very simply]

There are PAAS docker solutions out there now, that do this for you, for example adeus.

Qs: What are the other use cases? What should I be using it for? As: Lots, but not many using it in production

Qs: When are the docker commands run? a: Docker uses caching, so everything from before is cached

12 Factor App

Kristian Glass

Presenter has experience of doing this stuff

12 factor methodology

  • A lower case m for methodology - language independent
  • Core concept - clean interface between the app and the underlying system.
  • Allows for migration from platform to platform, makes portability and scalability better

1 - One code repository, many deploys

  • Web devs find this easier.
  • If you have multiple code repositories, then you have multiple apps.
  • Better to go to one code repository

2 - Explicitly declare and isolate dependencies

  • requirements.txt and pip etc.
  • But things like gevent, which requires libevent, not well solved.
  • Things like Docker, vms make it clearer what the requirements are
  • OS Packages do some of this.

3 - Read config from the environment

  • Everything likely to vary between deploys

  • Some web frameworks do this wrong

  • Database config is environment specific

  • List of installed apps (in django) doesn't change per environment

  • Enables separation of responsibility

    • Can be language independent
  • Avoids leakage of secrets

  • Environment not config file as it's language independent
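
A minimal sketch of reading config from the environment. The variable names (DATABASE_URL, DEBUG, PORT) and the defaults are assumptions for illustration; 12-factor does not prescribe them.

```python
import os


def load_config(environ=os.environ):
    """Collect everything that varies between deploys from the environment."""
    return {
        # Required: fail loudly at startup if it is missing.
        "database_url": environ["DATABASE_URL"],
        # Optional settings with defaults suitable for local development.
        "debug": environ.get("DEBUG", "false").lower() in ("1", "true", "yes"),
        "port": int(environ.get("PORT", "5000")),
    }


if __name__ == "__main__":
    example_env = {"DATABASE_URL": "postgres://localhost/app", "DEBUG": "1"}
    print(load_config(example_env))
```

Passing the mapping in as a parameter keeps the function easy to test, and anything that does not change per deploy (such as the list of installed apps) stays in code.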

How to populate the environment

  • Lots of tools that do it

  • Foreman, supervisor, honcho, bash scripts, envdir and many many more

  • Note: 12-factor doesn't say where the environment should be populated from; that can be files, config and so forth.

4 - Treat backing services as attached resources

  • Try to avoid distinction between in memory databases and external databases.

  • Makes it easy to switch later.

  • For example, Django's dj-database-url URL schemes

  • This promotes loose coupling
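
The URL-scheme idea can be sketched with the standard library alone: one string describes the attached resource, so swapping a local database for a hosted one is a config change. This is an illustration of the approach, not the API of any Django package; the function name and the example URL are hypothetical.

```python
from urllib.parse import urlsplit


def parse_database_url(url):
    """Split a resource URL into the pieces a database driver needs."""
    parts = urlsplit(url)
    return {
        "engine": parts.scheme,
        "user": parts.username,
        "password": parts.password,
        "host": parts.hostname,
        "port": parts.port,
        "name": parts.path.lstrip("/"),
    }


if __name__ == "__main__":
    print(parse_database_url("postgres://app:secret@db.example.com:5432/orders"))
```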

5 - Distinct build/release/run stages

  • Building and deploying are different stages.

  • Build does compilation, asset packaging etc.

  • release takes an executable, combines with environment into a deployable artifact

  • New config -> new release

  • New code -> new build and new release

  • Run stage, as simple as possible.

  • Do all the hard work in advance.

  • Running things tends to happen at stressful times, so optimise it.

    • e.g. don't fetch packages at deploy time

6 - Stateless processes

  • Don't persist to disk for long term. Assume nothing about ram or disk.
  • You want to not care about the server
  • You don't want to implement sticky sessions, avoid them

7 - Export services by port binding

  • Your app should offer its service by some protocol on some port.
  • If you can switch server from django to node.js and nobody notices, that's good.
  • Not just HTTP, rabbit vs ActiveMQ, just looks the same to everyone else

8 - Scale out with processes

  • Threads aren't necessarily good.

  • For scaling (rather than multiple workloads), run more processes

  • There's only so much you can scale up.

  • Add background jobs by doing more processes

  • Your app shouldn't daemonise either, let a supervisor/manager do it for you.

  • Every one does it differently

9 - Start quickly, stop nicely

  • They should be disposable.
  • This makes it easy to autoscale

10 - Keep environments as similar as possible

  • Environments differ in time, i.e. prod is 2 weeks behind dev.
  • Solution to time gaps is to deploy more often. Reduce the time lag between feature in dev and feature in prod
  • People gap - i.e. DevOps. People throw code over a wall.
  • Make developers own production
  • Tool gap - "It works on my machine."
  • Use the same tools everywhere. Use the same version everywhere. (Don't use SQLite in dev and PostgreSQL in prod.)

11 - Treat logs as event streams

  • Every app has different ideas about logging.
  • Django apps email on every 404, but java apps haven't been configured with logging.
  • Spit out logs to stdout, and then something pulls the logs together.
  • One event per line.
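
A minimal sketch of "one event per line on stdout" with the standard logging module. Encoding each event as JSON is my assumption, not something the talk prescribed; it has the side effect that a multi-line stack trace still arrives as a single line. The class and function names are hypothetical.

```python
import io
import json
import logging
import sys


class JsonLineFormatter(logging.Formatter):
    """Render each log record as one JSON object on one line."""

    def format(self, record):
        event = {
            "level": record.levelname,
            "logger": record.name,
            "message": record.getMessage(),
        }
        if record.exc_info:
            # Newlines in the traceback are escaped by json.dumps.
            event["traceback"] = self.formatException(record.exc_info)
        return json.dumps(event)


def make_logger(name, stream=sys.stdout):
    """Return a logger that writes events to `stream` and nowhere else."""
    handler = logging.StreamHandler(stream)
    handler.setFormatter(JsonLineFormatter())
    logger = logging.getLogger(name)
    logger.handlers = [handler]
    logger.setLevel(logging.INFO)
    logger.propagate = False
    return logger


if __name__ == "__main__":
    log = make_logger("app")
    log.info("user signed up")
    try:
        1 / 0
    except ZeroDivisionError:
        log.exception("calculation failed")
```

The app only writes to stdout; routing, aggregation and storage are left to whatever runs the process.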

12 - Run admin tasks as part of the app

  • That moment where you find that the code that cleans up old users is a bit of code on one developer's machine that doesn't run any more
  • Bundle your one-off tasks as part of the app.
  • Django management commands are a good example.

Questions

  • q: How do you handle multiline stack traces in log aggregation?

  • a: It depends - how about different line endings?

  • q: What runs the bash script that populates the environment?

  • a: The "DevOps" team? It should live in a repository and be controlled just like code.

High Performance Python Landscape

Ian Ozsvald New book

What is high performance

Often we are talking about making a machine run as fast as possible. Often you have to contort the code. That's a bad thing: it makes the code unmaintainable. We can only drive improvements by knowing where the bottlenecks are. Therefore profiling is the first step.

Who is Ian

Consulting - ModelInsight.io Lots of AI consulting for clients over 10 years. Lots of iterative work, lots of trying again and again. Useful to be able to prototype ideas, then optimise it

cProfile

Shows you function-level profiling: it tells you which functions take the longest to run. RunSnakeRun is a visualiser for the profiling output. It's a good high-level view, but it doesn't tell you which lines need to be optimised.
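
A minimal sketch of function-level profiling with cProfile and pstats from the standard library. The workload (slow_sum) and the helper name (profile_call) are made up for illustration.

```python
import cProfile
import io
import pstats


def slow_sum(n):
    """A deliberately naive workload to profile."""
    total = 0
    for i in range(n):
        total += i * i
    return total


def profile_call(func, *args):
    """Run func under cProfile and return (result, text report)."""
    profiler = cProfile.Profile()
    result = profiler.runcall(func, *args)
    out = io.StringIO()
    stats = pstats.Stats(profiler, stream=out)
    # Sort by cumulative time and show only the top few entries.
    stats.sort_stats("cumulative").print_stats(5)
    return result, out.getvalue()


if __name__ == "__main__":
    value, report = profile_call(slow_sum, 100000)
    print(value)
    print(report)
```

For a whole script, `python -m cProfile -o out.prof script.py` writes the same data to a file that a visualiser can load.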

line_profiler

Easy to use: add the @profile decorator above the function definition, which tells line_profiler to pay attention to each line. It can tell you how often each line is executed, how long each line took, and the percentage of the function's time spent on each line.

Can show interesting things, n += 1 turns out to be unexpectedly high

memory_profiler

Like line_profiler, records memory usage rather than cpu time. Python doesn't really know, so it estimates how much memory is allocated to the process, and then builds an approximate increment per line.

This can help you understand where the memory bottleneck is. (Also interactive: %memit)

memory_profiler mprof

Can visualise memory usage, nice pretty graphs Good for internal discussion about changes to code

That was profiling tools, now how to speed up

cython

Forked from Pyrex, it is an ahead-of-time compiler. You annotate your Python code with C types, making it no longer pure Python, and it is translated into C. For example:

cdef unsigned int i, n
cdef double complex z, c

This turns the Python into a statically compiled module; the function can then be called as usual. It converts Python into C, so it goes really fast. The plain Python version takes about 12s; the Cython version takes about 0.19s (approx 60x faster). For numeric code that you need to run faster, it works very well.

Cython + numpy + OMP nogil

Combining the lot together, with nogil to release the GIL. Now drops to 0.05s (240x faster than the original).

Have to change the code a bit, since you need to declare which bits are parallelisable

ShedSkin

Annotate the code with comments showing you what the types are. This can help you work out the types needed for cython Doesn't work with NumPy

Pythran

A new project, following on from ShedSkin. Takes the idea one step further. Decorate the function with types, such as complex list or complex NumPy array. Again, it compiles down to a C library. It has some issues: they have to implement the functions themselves, so when it works it goes really fast, but when it doesn't work, it just plain doesn't work.

Because of annotation, the python code doesn't change

PyPy nightly

Just works on Python; it's still Python code. It has some clever strategies and can give a 6-10x speedup, which is nice. It doesn't play well with libraries that use C libraries; again NumPy is an issue. There is a PyPy version of NumPy (numpypy) but it's buggy.

Numba

Provides a one-line annotation, @jit. Uses LLVM to compile the code at runtime. Gets fast after a few runs. Almost no effort. Missing prange, so no OpenMP support.

Julia

How about writing some bits in Julia, and integrating between them? Hard to debug; you must keep both languages in your head at the same time. Could apply OpenCL to improve even more.

Summary

  • PyPy has no learning curve (pure Python only)
  • Cython takes half a day to learn - team cost is low, hopefully
  • Cython + NumPy + OMP = days to learn, higher team costs
  • Numba/Pythran - installation is difficult, but looks good

Other options

Remember clustering is hard, buying a single big box on EC2/GCE is easier than ever. You can scale up much more easily now.

Final thoughts

Team cost is important, you can make the code run fast, but it needs maintaining.

Functional Programming in Python @petexgraham

What is FP, what can we do in Python, and when to look at alternative languages

Timeline

  • Learnt Haskell at university
  • Promptly forgot it
  • Never thought about it again
  • Thought of as academic, not pragmatic
  • Then something strange happened.

The FP renaissance

  • Sparked interest again
  • Is this worth doing or learning again?

What is it?

  • It's not imperative - not a helpful answer
  • Venn diagram shows it's not OO, procedural etc.
  • Also mixed paradigm programming

So what is it?

  • First-class functions
    • Means a function can be passed as a parameter to a function, and a function can return a function
  • Means we can use higher-order functions, like map and reduce
  • Some advanced concepts, pattern matching, tail recursion etc
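
The first-class and higher-order function points above can be sketched in a few lines of Python. The names are made up for illustration.

```python
from functools import reduce


def compose(f, g):
    """Take two functions and return a new one: x -> f(g(x))."""
    return lambda x: f(g(x))


def make_multiplier(factor):
    """A function that returns a function."""
    def multiply(x):
        return x * factor
    return multiply


def increment(x):
    return x + 1


double = make_multiplier(2)
double_then_increment = compose(increment, double)

numbers = [1, 2, 3, 4]
# map and reduce take functions as arguments and leave `numbers` unchanged.
doubled = list(map(double, numbers))
total = reduce(lambda acc, x: acc + x, doubled, 0)

if __name__ == "__main__":
    print(doubled, total, double_then_increment(5))
```

None of these functions changes any state outside itself, so calling them again with the same arguments gives the same results.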

What is it?

  • Lots of other views, somewhat newer definition
    • Writing code that doesn't change state
    • Not changing state: calling a function with the same arguments always returns the same result
  • FP is like describing your problem to a mathematician
  • Imperative programming is like giving instructions to an idiot

Already functional that we use

  • Excel
  • Logo etc

Why learn it?

  • Pragmatic programmer recommends learning a new language each year
  • means learning different ways to solve problems
  • You can bring the knowledge back to your main language
  • Python paradox, is FP the next paradox?

What is it good for?

  • Immutability leads to no side effects
    • Systems without side effects and weird behaviour are very desirable.
  • Concise and elegant code, which means less code
    • Could take longer to get your head around?
  • Due to conciseness, can be more testable
  • Also strongly suits parallelism and concurrency
    • I.e. Moore's Law: CPUs are no longer getting faster, but more cores are being added instead

Who's using it

  • Twitter: Scala
  • The Guardian: Scala
  • Amazon: Clojure
  • Facebook: Using Haskell internally
  • WhatsApp: Erlang

Where do we use it in the stack?

  • Passed from traditional backend, via message queue.
  • The clean architecture covers some of this
  • A number on the client side: ClojureScript, PureScript etc.

Where to learn next

  • Coursera: Functional Programming Principles in Scala
  • Learn You a Haskell for Great Good

What is an API?

Go see Requests: it is one of the most downloaded Python libraries because it has an awesome API. But we are not talking about programming APIs, we're talking about web APIs. REST is not hypertext, it's a set of guidelines. Turning things into resources. You shouldn't be calling functions through HTTP; you are finding or creating resources. Examples include the Stripe or PayPal APIs.

We interact via HTTP. HTTP provides us with standard ways of interacting with resources, via methods and status codes. A good REST API uses the methods GET, PUT/PATCH, POST, DELETE. The Twitter API is not RESTful.

Response codes, i.e. 404 is not found, 200 is OK, 201 is created.

REST API's should be stateless. Everything the API request needs should be included in the request. If someone can steal a session, they can do whatever they want.

For API's this is basically required.

HATEOAS - Hypermedia links should be included in the API. This means you don't need to trawl through the documentation itself. Things like next page, wrappers should never need to remember or construct the urls, but can follow them.
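
A minimal sketch of a client following links instead of constructing URLs. The in-memory FAKE_API, its response shape (an "items" list plus a "links" mapping with an optional "next" entry) and the function names are all assumptions for illustration; a real API documents its own link format, and a real client would replace fetch with an HTTP GET.

```python
# Hypothetical paginated responses, keyed by URL.
FAKE_API = {
    "/users?page=1": {"items": ["ann", "bob"], "links": {"next": "/users?page=2"}},
    "/users?page=2": {"items": ["cat"], "links": {}},
}


def fetch(url):
    """Stand-in for an HTTP GET that returns decoded JSON."""
    return FAKE_API[url]


def iter_items(start_url, fetch=fetch):
    """Yield every item, following the 'next' link until there is none."""
    url = start_url
    while url:
        page = fetch(url)
        yield from page["items"]
        # The client never builds a page URL itself; it only follows links.
        url = page["links"].get("next")


if __name__ == "__main__":
    print(list(iter_items("/users?page=1")))
```

If the server later changes its paging scheme, a client written this way keeps working as long as the link names stay the same.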

So why do I care?

APIs provide extra potential for platforms. APIs enable a capability to be heavily extended; Twitter is a good example of this. Uber has started an API, so people can use Uber through other systems, like Google Maps or Expensify etc.

Turn users into /users. Increasingly, platforms are purely APIs, wrapping a complex service with a very simple service. Stripe for payments, SendGrid for email and distribution lists, Twilio for SMS and voice.

What's the connection with Python?

Python naturally applies to REST, so think of it like PEP-8. Good interfaces in Python.

Flask-RESTful - does a nice RESTful system, with classes and get/put methods, and it does the right thing. Tastypie - integrate an API into a Django application.

Live coding demo - Tastypie. ModelResource is used as the superclass, then define a Meta class, which has querysets and allowed_methods. Add the URL routing, and it mostly just works. bit.ly/tastypie

Django REST framework

Good for flexible, complex apis. EventBrite and so forth.

London API Meetup - come along

What's the next thing after REST APIs?

WebHooks: APIs that make HTTP requests to your application when something happens. The traditional model is Application > Request > Server > Response. Webhooks invert that model: Server > Request > Application > Response

  • Note always return a response

How does a spreadsheet work

@hjwp

Take us back to 2005. Gmail was still invite-only.

A group of idealist programmers thought: what if you could have a better spreadsheet? Decided to rebuild it as a web-based spreadsheet. Dirigible - web-based spreadsheet, a tour of what it does.

A spreadsheet is basically a set of cells. We need a worksheet, like a dictionary, keyed on row and column. We also want to type =1+1 and see 2.

So now we have a cell that has a value and a formula. I can eval the code. I need to catch errors.

We also need to use special references, =A1+A2. We therefore need to transform A1 into worksheet['A',1] and that kind of stuff.

Cool parse tree stuff.

But now we also have dependencies: I can't calculate A1+A2 until I've resolved A1 and A2. We need to be able to process the dependencies, building a two-way graph, so each node knows its parents and each parent knows its children.

But what about cycles in the dependency. Cycle detection. We track path as well as the graph, so we can detect coming back on ourselves.
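
The steps so far (a worksheet dictionary, formulas, cell references, dependencies and cycle detection) fit in a small sketch. This is not Dirigible's implementation: it swaps the parse tree and two-way dependency graph for a regular expression and plain recursion, and tracks the current path to spot cycles. All names are hypothetical, and eval with empty builtins is not a security sandbox.

```python
import re

# Matches references such as A1 or BC12.
CELL_REF = re.compile(r"\b([A-Z]+)([0-9]+)\b")


class CycleError(Exception):
    """Raised when a formula depends on itself, directly or indirectly."""


def evaluate(sheet, cell, path=()):
    """Return the value of `cell`.

    `sheet` maps (column, row) to a constant or to a formula string
    starting with '='. Missing cells count as 0.
    """
    if cell in path:
        raise CycleError("cycle through %s%d" % cell)
    raw = sheet.get(cell, 0)
    if not (isinstance(raw, str) and raw.startswith("=")):
        return raw

    def resolve(match):
        ref = (match.group(1), int(match.group(2)))
        # Resolve the dependency first, remembering how we got here.
        value = evaluate(sheet, ref, path + (cell,))
        return "(%r)" % (value,)

    expression = CELL_REF.sub(resolve, raw[1:])
    return eval(expression, {"__builtins__": {}})


if __name__ == "__main__":
    sheet = {("A", 1): 1, ("A", 2): "=A1+1", ("A", 3): "=A1+A2*2"}
    print(evaluate(sheet, ("A", 3)))
```

A real spreadsheet would cache results and use the dependency graph to recalculate only what changed; this version recomputes every dependency on each call.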

Now we put python code in the system. Eval needs to be given a context, and before we eval the cell, we eval the user code.

But users want to address the results of formula evaluation as well, so we do it twice.

(At this point, my head exploded a bit) http://github.com/pythonanywhere/dirigible-spreadsheet/

A time travellers guide to Python

Jessica McKellar

  • Handy this is the same weekend that I also rented a time machine.
  • Let's go back in time
  • A director for the Python Software Foundation, and engineer at Dropbox
  • The past, present and future of python.

1989 - Python is born

  • Guido starts writing the python language
  • Inspired mostly by ABC, a teaching language
  • Also by Modula-3
  • A lot of the data types, white space, syntax etc is inspired by ABC, which was a teaching tool

1994 - Python 1.1

  • Release notes include double quotes, and dictionaries can use non-strings as keys
  • 1995 - Pickle
  • 1998 - Python 1.5 - We get standard exceptions.
    • Previous to now exceptions are strings, not classes
  • 2000 - Python 1.6 - Unicode support, plus PEP-1
  • 2000 - Python 2.0 - List comprehensions, distutils, and finally get a documented development module. Also Pygame started here
  • 2001 - Python 2.1 - from __future__ import
  • 2001 - Python 2.2 - Iterators and generators
  • 2003 - Python 2.3 - Boolean datatype, what a good idea. First official pycon, in Washington DC.
  • 2004 - Python 2.4 - Added decorators
  • 2005 - Django release
  • 2006 - NumPy
  • 2007 - PyPy
  • 2008 - Release 3.0 and 2.6.
  • 2010 - Python 2.7, last release of python 2.x
  • You can see an incredible language evolution.
  • Incorporated great ideas from other languages and communities
  • Pycons are showing the global community, over 300,000 registered to show that they are python fans.
  • Increase in diversity outreach, pyladies and so forth
  • There are changes and threats to python. Go, Javascript, JVM evolution.
  • If we read hacker news, then perception is that Python 3.0 is the biggest damage to python.
  • Is this how languages die? Is this true?
  • there's a lack of good data, so here's what Jessica thinks then.

What matters for the language and the community

  • Some background, Jessica comes from a series of startups.
  • Used to thinking about products that users actually use
  • Python is a product, it needs users.
    • It has competition, it has to adapt to meet those challenges
  • For this afternoon, you are the CEO of Python Inc. So what matters?
  • Clearly articulating the value proposition.
    • Why is python the best language for X, Y or Z.
    • It's got to be compelling enough that someone would pick it
  • Optimising the onboarding experience
    • New users need to get going really fast
  • Promoting network effects
    • Python is only useful if lots of people know it
    • If I can get hired for it.
  • Building and protecting beachhead domains
    • Where do we need to win, and keep on winning
  • Anticipate market changes
    • Does it matter that we don't do python on the mobile?
  • Python 3 is an internal detail that most external users don't care about

The value proposition

  • We think it's easy to learn
  • Fast enough
  • Good for rapid development
  • We need to challenge the fast enough one
    • It's only true if people believe it

Optimising the onboarding experience

  • You are a 14 year old, on windows and you want to get started making a game
  • The tutorial doesn't set us up for success
  • Installing python, why do I have to pick python 2.7 or python 3?
  • Installing pygame is hard
  • Are we prioritising the onboarding experience?

The network effect

  • Things are changing for how to teach programming, python is increasingly popular.
  • people teaching in Java or C++ tend to say "I wish I was using python"
  • These are the future directors, CTO's, startups. Are they learning python?
  • This is possibly the most important thing. If programming were a required class, and they are all learning in python.
  • This is a virtuous cycle, students -> employers who look for students and so forth
  • Are we doing this as best we can?
    • It shouldn't be an accident

Beachheads

  • Where are we winning?
    • Education, Scripting.
  • What do we think about Mobile?
  • Are we still fast enough?
  • Can we invest in the beachhead domains while anticipating the market changes
  • Python 3 is not a priority, because the number of people who benefit is orders of magnitude smaller.
  • Sick of reading about Python 2 to 3 transition, that time could be better spent

The PSF

  • How can we help?
  • Python Sprints - Working on python itself.
    • If the thing that matters to you is a good tutorial, then get involved
    • No language focuses on this, so we can win
  • Outreach and education
    • Again a committee dedicated to that
  • Grants
    • Maybe even for Python 3 ports!
  • The challenge is by Pycon UK 2015 - find and tackle a python project that motivates you and that really matters.
  • A lot of things don't matter, but some do.

Questions and Answers

  • Meetups - Are they geographically disparate enough? (I think, there's not a question there)
    • There's no magic skill required to start a user group or conference.
    • If you want to make it happen, apply for a grant and do it
  • What do you propose we do about python 3? Like how does one address the tutorial?
    • The current plan is ok, old projects may stay on Python 2.7, new projects will likely be python 3.
    • Let's just stop complaining.
    • So how do we fix the python download page?
    • It should just say python 3, people who know python 2 know what they are doing
  • How much does python usage by influential companies matter?
    • there are some. We don't put any effort to putting marquee companies on the list. You should form a working group if you want to do that
  • Give examples of projects that matter
    • Onboard experience - in particular on windows and stuff
    • Specifically, dependencies and things should be seamless and easy
    • Perception problems around performance are an issue, we should either fix the perception or fix the performance
  • Is the python software foundation doing anything with the UK Government stuff?
    • Talk to @ntoll who is advancing exactly that, with a Python in UK Education Working Group

Using Pyland to teach programming

  • A programming game, for children
  • Motivation was to provide a fun and creative environment
  • Children start with Scratch
    • They get visual feedback
    • Then they have to move to python
    • Feels like a step back
    • Not as appealing or fun
    • You have to be advanced to do the same sorts of stuff
  • Game play is motivating for the children
  • We can demonstrate why code is not a chore, you do it to improve the gameplay

Where are we

  • 10 week project
  • Done first 90%, now the second 90% to go!
  • Has a game engine, runs smoother
  • Has 3 sample challenges
  • Not quite ready for testing with children though

Demo time

  • This looks quite cool

Implementation

  • User scripts are run in a python3 system
  • When they make api calls, they get pushed onto an event queue
  • Game engine is C++, SDL2 and OpenGL
  • Challenges are implemented with C++ code
  • Tiled map editor

What's next to do

  • Good chat with teachers to work out what the challenges need to be
  • Need to simplify further and make it easier to get started
  • Multiplatform support would be nice
  • Better error messages, that are kid understandable
  • Khan academy does some of this already, so see what they are doing
  • What if kids could make their own levels?
  • How about collaboration?

How can I help?

  • Sponsorship - are you lucky enough to work for an employer who would want to sponsor?
  • Code, especially if you are a total newbie, as you can remember what trips you up
  • Test it out with some children
  • Any advice or ideas

@projectpyland @asbradbury http://github.com/pyland/pyland

From prototyping to production in government.

  • Michal - Not a code ninja

  • Jim - Developer consultant

  • GOV.UK - aiming to transform transactions in government

  • One of those transactions is to apply for redundancy payments

  • Projects go through a 4 phase system

    • Discovery - find out what the user need is
    • Alpha - build a prototype and test it
    • Beta - build the real thing
    • Live
  • Going to talk about Alpha and Beta

  • If your company goes insolvent you can apply. Less than 300,000 per year

  • Digital skills into the civil service

  • Integrate with backend systems

  • Digitize paper forms

  • It's a lifeline for people

Online forms

  • bread and butter

  • The standard thing we think of is the HMRC self assessment

  • Lots of fields, lots of HR information

  • If you've been made redundant, you are in an anxious state, so filling in forms can be a pain

  • Lots of user research, which shows that complex forms don't work well

  • Came up with a very simple format

  • Each page is collecting just a small amount of information, because anxious people struggle to fill them in

  • It reduces the chance that people pick up the phone, which saves government some money

  • Challenge was to use forms

    • Lots of other form frameworks
    • We couldn't find anything that matched what we wanted, i.e. django forms etc
    • So we built it on top of Flask
  • Alpha: Prototype

    • one controller per page
    • lots of bread and butter stuff, so manually created pages, verification and so forth
    • This rapidly would become 65 controllers, each hand cranked
    • So instead created forms through configuration
    • A dictionary to define the form
  • Form navigation is also part of the configuration, so we treat them a bit like a linked list.

    • Has some nice properties, enables inserting or reordering pages easily
  • Pretty pleased with end result

  • Only one controller rather than 65 controllers
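
A sketch of what forms-through-configuration with linked-list navigation might look like. The page names, field names and helper functions are all invented for illustration; the team's real configuration format wasn't captured in these notes. Each page names its successor, so one generic controller can serve every page.

```python
# Hypothetical pages and fields; only the shape matters.
FORM = {
    "name": {"fields": ["first_name", "last_name"], "next": "employer"},
    "employer": {"fields": ["employer_name"], "next": "dates"},
    "dates": {"fields": ["start_date", "end_date"], "next": None},
}


def validate(page_name, submitted):
    """Return the fields of this page that were left empty."""
    return [f for f in FORM[page_name]["fields"] if not submitted.get(f)]


def next_page(page_name):
    """Follow the 'next' link, like walking a linked list."""
    return FORM[page_name]["next"]


def insert_page_after(existing, new_name, fields):
    """Inserting a page only touches two links."""
    FORM[new_name] = {"fields": fields, "next": FORM[existing]["next"]}
    FORM[existing]["next"] = new_name


def page_order(first):
    """Walk the links from the first page to the end."""
    order = []
    while first is not None:
        order.append(first)
        first = next_page(first)
    return order


before = page_order("name")
insert_page_after("employer", "wages", ["weekly_pay"])
after = page_order("name")
```

Here `before` is name, employer, dates and `after` has wages slotted in after employer, with no other page edited.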

Developing Flows

  • Berry pickers, people who can enter a start date, end date and number of berries counted
  • We don't know how many periods the user might want to enter
  • In Alpha, we didn't do repetition of pages
  • In Beta, we have to tackle this challenge
  • An obvious pattern, every period loops the pages again
  • Put into a container we called subsection.
  • Each subsection knows what pages should be shown and in what order
  • The section knows how the flow through the subsections works
  • Section can also make it easy to add subsections, dynamically during the user flow.
  • Section can link with the rest of the webflows, so it knows how to move to the next section
  • (Code on screen to explain how this code works)

Lessons learned

  • Solve one problem at a time.
    • i.e. solve how to link forward
    • Then add features again as needed
    • Build on top of solution
  • Problem: Conditional pages
    • Some pages might only be shown if certain conditions are met
    • User research showed in beta that users don't want to be asked questions that aren't relevant to them
    • Implementation 1: Conditional pages
    • Can link with looping, so the conditions apply inside the loop
  • Some problems can wait
    • In prototype, we learnt that some problems can wait. That helped us figure out what was important
    • Explicit is better than implicit, so the navigation is explicit.
    • Complex solution by combining simple components

Hard coded text

  • Fast feedback loop during Alpha.

  • Developers had to spend lots of time fixing minor text issues

  • Easy solution was to use internationalisation to make text changes easy

  • gettext and Flask-Babel

  • Now use Poedit and non developers can change the text very easily and quickly

  • That reduced the amount of work that the team has to do

  • Making web forms really easy for users, is really hard for developers

    • That's the right thing
    • We could have done an HMRC style mass form really easily
    • Doing the hard work to make it simple
    • Tried to start and keep as simple as possible
  • Insolvency service team are hiring

  • Q: Does the internationalisation mean that we can internationalise it later

  • A: Yes, but it's not an upfront requirement

  • Q: How are you carrying state

  • A: Backend, we can't rely on javascript in the frontend

Haircuts for your code

  • Carl Crowder

    def append_one(start=[]):
        start.append(1)
        return start
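
A short sketch of why that function misbehaves, plus the usual fix. `append_one_fixed` is a name added here for comparison, not something from the talk.

```python
def append_one(start=[]):
    # Bug: the default list is created once, when the function is defined,
    # and then shared by every call that omits `start`.
    start.append(1)
    return start


def append_one_fixed(start=None):
    # Fix: use None as the sentinel and build a fresh list per call.
    if start is None:
        start = []
    start.append(1)
    return start


first = append_one()
second = append_one()          # same list object as `first`, now holding two items
fixed_first = append_one_fixed()
fixed_second = append_one_fixed()   # a new list each time
```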

  • Known bug

  • PyLint will find and report this

  • Another common error is to do logging.debug('%s %s' % (a,b)) which should be logging.debug('%s %s', a, b)

  • Who am I

  • Carl Crowder

  • Runs landscape.io - runs lint and nice metrics on your code as a cloud service

  • links to github.com

  • github.com/carlio

  • A gateway drug for code analysis

  • Provides a nice health badge you can put on your project

  • Not just about catching obvious problems

    • Like importing a module that you aren't using
    • But also can nag you about things you know you should do
    • Tools to find dead code
  • Can also run pep-8

  • Can also check dependencies for out of date or security vulnerability

  • There are lots of other static analysis tools

  • McCabe does complexity checking - cyclomatic complexity measures

  • But what if my IDE does this?

  • Your IDE doesn't necessarily warn you about all the issues

  • And it doesn't check all files all the time.

  • Also easy to ignore, not everyone is as fastidious as you are

How to use them

  • These tools are not gospel

  • They'll give you picky things you don't agree with

  • Think of it more as a code review, is this correct? You can disagree

  • At the beginning you probably just use the defaults, but take time to tune the tools

  • Track how well you are doing, do you have fewer errors than yesterday

  • For example, Jenkins plugins

  • Taking time up front, to tweak it, for example django uses meta programming that pylint doesn't understand, tweak it to only give you real warnings

  • If you don't want the hosted version, prospector is a command line version of landscape.io

  • It comes with pre-configured linting rules

  • You'd be surprised how useful it is

  • The number of errors you can get via static analysis is non-trivial

  • Slide links are at http://carlcrowder.com/pyconuk

Python on the Raspberry Pi

  • Interesting projects you can do on the Pi
  • Ben Nuttall, @ben_nuttall Development and outreach for education

Raspberry Pi

  • Cheap computer, want to get all kinds of people, including kids on to computers
  • Interested in computing not just coding.
  • Building physical things, not just writing code
  • Education team, 3 here at PyConUK
  • As well as being like a desktop computer, the Pi has some GPIO pins.
  • You can just wire up electronic components and talk to it from the PI.
  • That's kinda cool
  • 40 pins on the B+ PI as well, so it's getting better
  • Enough pins and power to make your own cool little robots

Python

  • Fuzzy Duck Brewery, they brew everything with PI's and Python

  • GPIO library allows python to talk to the output controllers

  • Also have a camera module for the Raspberry PI, cheap ribbon cable, cheap camera but does full HD.

  • You can again incorporate into the code

  • picamera, can take pictures

  • Good library, great docs up on readthedocs

  • Can then wire up a button on breadboard, use GPIO to take photo when button is pressed

  • Nice project, using picamera as a cat detector.

  • Watches for movement, can take pictures when it happens

  • Then used GPIO to fire waterhose at the cat, and upload video to youtube!

  • A lot of this stuff is already built into the libraries

  • Lots of people want to build robots, often just a raspberry pi on wheels

  • There's a section of resources on the RaspPi website, as interest has grown

  • Morse Code tapper, wired to GPIO, allows you to learn morse code and turn it into text

  • Of course the Minecraft Pi as well. Installed by default

  • Has a python interface as well, allows you to interact with the world around in python

  • Can do cool things, like a house that follows you around.

  • Also use energenie, allowing you to control power sockets, like for example to turn off christmas lights

  • The API was pretty bad, so they got a 14 year old girl, interning at RaspPi, to build a great module, because it turns out 14 year old girls are better python coders than the sellers of this

  • Sous-Vide cooking

    • Vacuum Pack food
    • Cook in a water bath
    • Need to control the temperature
    • Used a raspberry pi to watch the temperature, use energenie to turn cooker on and off to maintain temperature.
    • Very simple abstractions that can be applied
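
The control loop for this can be sketched without any hardware. Everything below is an assumption for illustration: the dead band around the target, the function names, and the `switch` callable standing in for whatever turns the power socket on or off. Reading the sensor and driving the socket are deliberately left out.

```python
def heater_should_be_on(temperature, target, currently_on, band=0.5):
    """On/off control with a small dead band to avoid rapid switching."""
    if temperature < target - band:
        return True
    if temperature > target + band:
        return False
    return currently_on  # inside the band: leave the socket as it is


def run_controller(readings, target, switch):
    """Feed temperature readings through the controller.

    `switch` is a hypothetical callable: switch(True) turns the cooker's
    socket on, switch(False) turns it off.
    """
    on = False
    for temperature in readings:
        wanted = heater_should_be_on(temperature, target, on)
        if wanted != on:
            switch(wanted)  # only touch the socket when the state changes
            on = wanted
    return on


switch_log = []
final_state = run_controller(
    [20.0, 55.0, 60.4, 60.6, 60.2, 59.4], 60.0, switch_log.append
)
```

With those readings the socket is switched on, off once the bath overshoots, then on again as it cools, so `switch_log` ends up as `[True, False, True]`.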

Trouble at t' Leeds Data Mill Nicholas Tollervey, Dr Simon Davy

Big Data meets Big Brass

Ompash.hm

  • Leeds Data Mill
  • Civic hacking organisation, take civic data and do interesting stuff
  • "We tell stories with data"
  • Hack day
  • Can we tell an engaging story with the data?
  • What data sets were available?
  • Yorkshire water released some data
  • City Center data center.
  • Count number of people who walk past CCTV cameras
  • Released the data
  • Turn the footfall data into something musical
  • Simon is a percussionist, Nick is a tuba player
  • Can we do something musical?
  • What kind of music epitomizes Leeds?
  • Brass bands
  • Aimed to turn the raw footfall data into music for a brass band
  • Data analysis - or munging.
  • Data was in spreadsheet format, aggregated per hour
  • There's not a lot of numbers to do generation
  • Did a bit of analysis, looking at all tuesdays for example, showing the histograms of the data
  • Then smooshed it all together into two big lists
  • Two musical movements, weekdays and weekends
  • Just mash all the data together, from the past 5 years.
  • That gives enough data to do something with

Music from the data

  • Choose a scale, and pick a random note from the scale, based on the number
  • Rhythm, for any given hour and location, see how busy it is, turn that into intensity
  • Merge Pitch and Rhythm, generate 8 bars of music in 4/4
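
A sketch of the pitch and rhythm steps above, under loudly stated assumptions: the scale (C major pentatonic as MIDI note numbers), the three note lengths, the intensity thresholds and seeding the random generator from the footfall count are all invented here. The talk only gave the outline.

```python
import random

# C major pentatonic as MIDI note numbers (an assumption; any scale would do).
SCALE = [60, 62, 64, 67, 69]
BEATS = 8 * 4  # 8 bars of 4/4


def fragment(footfall, busiest):
    """Turn one hourly footfall count into 8 bars of (pitch, beats) notes."""
    rng = random.Random(footfall)      # same count always gives the same fragment
    intensity = footfall / busiest     # 0.0 (empty street) to 1.0 (busiest hour)
    # Busier hours get shorter notes, so more of them fit into the 8 bars.
    if intensity > 0.66:
        note_length = 0.5
    elif intensity > 0.33:
        note_length = 1
    else:
        note_length = 2
    notes = []
    remaining = BEATS
    while remaining > 0:
        notes.append((rng.choice(SCALE), note_length))
        remaining -= note_length
    return notes


quiet = fragment(50, 1000)   # sparse: long notes
busy = fragment(900, 1000)   # dense: short notes
```

Both fragments fill exactly 32 beats; the quiet hour does it with 16 notes and the busy hour with 64.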

Musical Mischief

  • Now we have some raw data, so got to turn it into music
  • It's very easy to make data into music that sounds unmusical
    • Sounds like a chimp on a keyboard
  • Not very useful
  • Charles Ives had a quote - "What has sound got to do with music?"
  • Sound is just the medium, but music is the feeling, narrative and structure you hear
  • You intuitively hear the structure in music
  • So can we reveal the story of data, not just make bleeps and bloops
  • Need to map elements of data onto the music
    • Music is highly dimensional
    • Pitch: Means how high or low
    • Rhythm: The sequence of different durations of pitches
    • Melody: Combined pitch and Rhythm
    • Timbre: The quality of the sound, a violin and flute play C differently
    • Dynamics: The volume of the music
    • Key: The set of available pitches when you play, i.e. C Major
    • Harmony : How pitches sit on top of each other
    • If Melody is the horizontal, then Harmony is the Vertical!
    • Texture: How the harmony is voiced
    • etc etc etc
    • As a drummer, Simon points out that Nick missed Tempo!
  • How do we tell a narrative
  • Each instrument represents different areas of the city, i.e. timbre
    • For example railway station might be the horns, and the city center might be clarinets
    • The footfall changes the intensity of dynamic and rhythm, so more footfall = louder / busier melodies
  • Picked a pentatonic scale, so the notes sound nice together
  • Time is indicated by key, using the circle of fifths.
  • Music changes keys as the day goes on
  • Each hour is only represented by an 8 bar fragment
  • We used tubular bells to sound out the hours, so you get an auditory geography
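
A sketch of using the circle of fifths to mark time. The rule of one step round the circle per hour is an assumption (the talk didn't say how far the key moves), and note names are spelled with sharps only to keep it short.

```python
NOTE_NAMES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]
PENTATONIC_STEPS = [0, 2, 4, 7, 9]  # major pentatonic, in semitones above the tonic


def key_for_hour(hour):
    """Move one step round the circle of fifths per hour (7 semitones up)."""
    return (hour * 7) % 12


def scale_for_hour(hour):
    """Names of the pentatonic notes available in that hour's key."""
    tonic = key_for_hour(hour)
    return [NOTE_NAMES[(tonic + step) % 12] for step in PENTATONIC_STEPS]


midnight = scale_for_hour(0)   # C pentatonic
one_am = scale_for_hour(1)     # G pentatonic
```

Neighbouring keys on the circle share four of their five pentatonic notes, so each hourly key change is audible without being jarring.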

Does it work?

  • Nicholas is explaining why he is awesome for playing the tuba
  • Live demo of Nick's tuba, 1am, 7am (waking up), 12 noon (busy lunch)
  • (I have no idea how to textually represent live tuba playing.)
  • 1am is like mating call of the humpback whales
  • 7am is much more busy. 12 noon is busy and active and upbeat.